Robust Speech Recognition and its ROBOT implementation


1 Robust Speech Recognition and its ROBOT implementation Yoshikazu Miyanaga Hokkaido University

2 Conditions for Speech Recognition
Speech length: Short Isolated Speech: words, phrases (<2 sec); Continuous Speech: sentences (>2 sec)
Microphone distance: Attached Mic (several cm to 10 cm); Remote Mic (10 cm to 5 m); Long Distance Mic (>5 m)
Environment: Silent Room (>20 dB); Living Room (20 to 10 dB); Noisy Room: exhibition (<10 dB)

3 Conventional ASR
Continuous Speech (>2 sec), Attached Mic (<10 cm), Silent Room (>20 dB)
Short Isolated Speech (<2 sec), Attached Mic (<10 cm), Living Room (20 to 10 dB)
Array Microphone: Short Isolated Speech (<2 sec), Attached Remote Mic (<5 m), Living Room (20 to 10 dB)

4 Hokkaido University Speech Communication System (HU-SCS)
Speech length: Short Isolated Speech: words, phrases (<2 sec)
Microphone distance: Attached Mic (several cm to 10 cm); Remote Mic (10 cm to 5 m); Long Distance Mic (>5 m)
Environment: Silent Room (>20 dB); Living Room (20 to 10 dB); Noisy Room: exhibition (<10 dB)

5 HU-SCS Automatic Speech Detection

6 HU-SCS: Automatic Speech Detection
Current technology: 97% (SNR 10 dB), using wavelet and non-linear processing.
Reference: S.-H. Chen, H.-T. Wu, Y. Chang and T. K. Truong, "Robust voice activity detection using perceptual wavelet-packet transform and Teager energy operator," Pattern Recognition Letters (2007).

7 HU-SCS: Automatic Speech Detection
HU-SCS v4: 99% over SNR 10 dB.
BP + threshold operation, F0 detection.

8 HU-SCS Automatic Speech Recognition Candidates of Recognition Results (1) Good Morning (2) See you (3) How are you?

9 HU-SCS: Automatic Speech Recognition
Current technology: 71% (SNR 10 dB), 97.4% (SNR 20 dB).
Spectral subtraction, RASTA, CMS. A priori information.
Candidates of Recognition Results: (1) Good Morning (2) See you (3) How are you?

10 HU-SCS: Automatic Speech Recognition
HU-SCS v4: 95.3% (SNR 10 dB), 98.3% (SNR 20 dB).
RSF/DRA. No a priori information.
Candidates of Recognition Results: (1) Good Morning (2) See you (3) How are you?

11 HU-SCS Candidates of Recognition Results (1) Good Morning (2) See you (3) How are you? Automatic Speech Rejection Recognition Result Good Morning

12 HU-SCS: Automatic Speech Rejection
Current technology: 90%, using confidence scoring.
Reference: T. J. Hazen, S. Seneff and J. Polifroni, "Recognition confidence scoring and its use in speech understanding systems," Computer Speech and Language (2002).
Candidates of Recognition Results: (1) Good Morning (2) See you (3) How are you?
Recognition Result: Good Morning

13 HU-SCS: Automatic Speech Rejection
HU-SCS v4: Dependent GMM by Weighted HMM (90% accuracy). AI (Artificial Intelligence).
Candidates of Recognition Results: (1) Good Morning (2) See you (3) How are you?
Recognition Result: Good Morning

14 HU-SCS
First SCS HW (LSI IP) for mobile, intelligent consumer electronics, etc.
Components: Automatic Speech Detection, Automatic Speech Recognition, Automatic Speech Rejection, HW with Low Power.
Advantages: (1) Mobile applications: small, low power. (2) PC free.
Super low-power consumption design.
Real-time SCS: recognition time 180 nsec/word (10 MHz clock).
Small-scale design with specially designed LSI.
Noise reduction by array microphone.

15 HU-SCS Automatic Speech Detection HW with Low Power Automatic Speech Recognition Automatic Speech Rejection

16 Running Spectrum Domain
[Figure: waveform and mel-spectra; frame labels t and t-6]

17 BP and Threshold OP
[Figure: detected start point and end point of speech]

18 Switch-Less Recognition System by Automatic Detection
Hands-free operation/control: Detection, Speech Recognition, Operation/Control.
[Figure: timeline with silent interval, recognition start, recognition end, silent interval]

19 HU-SCS Automatic Speech Detection HW with Low Power Automatic Speech Recognition Automatic Speech Rejection

20 Speech Analysis and Robust Processing
Speech Analysis
  LPC Cepstrum
  Mel-Frequency Cepstrum
Robust Processing
  Various types of techniques have been proposed:
  Spectral Subtraction
  Wiener Filtering
  Microphone Arrays
  RSF/DRA (Running Spectrum Filtering/Dynamic Range Adjustment), which filters and normalizes cepstral vectors.

21 Procedure of Mel-Frequency Cepstrum
Speech signal: $x(t)$
Cut into short-time frames: $x_f(n, t_s)$
Discrete Fourier Transform (DFT): $X(n, f)$
Filterbanks with mel-frequency scale: $X_s(n, f_m)$
Logarithm: $\log(X_s(n, f_m))$
Discrete Cosine Transform (DCT): $C(n, k)$, the cepstral coefficients
n: frame index, k: cepstral order
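The procedure on this slide can be sketched in plain Python as below. This is an illustration only, not the HU-SCS implementation: the Hamming window, the mel formula 2595 log10(1 + f/700), the triangular filter shape, and all default sizes are assumptions chosen to keep the naive DFT short and fast.

```python
import cmath
import math

def hz_to_mel(f):
    # Common mel-scale approximation (an assumption; the slide gives no formula).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    top = hz_to_mel(fs / 2.0)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [b * fs / n_fft for b in range(n_fft // 2 + 1)]
    bank = []
    for m in range(n_filters):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        bank.append([max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
                     for f in bins])
    return bank

def power_spectrum(frame):
    """Naive DFT of one frame; returns |X(n, f)|^2 for non-negative frequencies."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * f * t / n)
                    for t in range(n))) ** 2
            for f in range(n // 2 + 1)]

def mfcc(x, fs, frame_len=64, hop=32, n_filters=12, n_ceps=8):
    """Return C(n, k) as a list of frames, each a list of n_ceps coefficients."""
    bank = mel_filterbank(n_filters, frame_len, fs)
    window = [0.54 - 0.46 * math.cos(2.0 * math.pi * t / (frame_len - 1))
              for t in range(frame_len)]
    ceps = []
    for start in range(0, len(x) - frame_len + 1, hop):
        # Short-time frame x_f(n, t_s)
        frame = [x[start + t] * window[t] for t in range(frame_len)]
        # DFT power spectrum
        power = power_spectrum(frame)
        # Mel filterbank output followed by the logarithm (small floor avoids log(0))
        log_mel = [math.log(sum(w * p for w, p in zip(filt, power)) + 1e-10)
                   for filt in bank]
        # DCT-II over the filterbank axis gives the cepstral coefficients
        ceps.append([sum(log_mel[m] * math.cos(math.pi * k * (m + 0.5) / n_filters)
                         for m in range(n_filters))
                     for k in range(n_ceps)])
    return ceps
```

For a 256-sample signal with the defaults above, `mfcc(x, 8000)` returns 7 frames of 8 coefficients each.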

22 Noise modeling
A spectrum including noise can be modeled as
$X(n,\omega) = S(n,\omega)\,H(\omega) + A(\omega)$
S: clean spectrum, H: multiplicative noise, A: additive noise
$E(n,\omega) = S(n,\omega)\,H(\omega)$
$\log X(n,\omega) = \log\big(E(n,\omega) + A(\omega)\big)$

23 Noise Corruption in Power Spectrum
Noise corruption changes the gain and the DC component.
[Figure: power spectrum of clean speech, E(n,ω), and of noisy speech, E(n,ω)+A (white noise at 10 dB SNR)]

24 Noise Corruption in Log Power Spectrum
Noise corruption changes the gain and the DC component.
[Figure: log-power spectrum of clean speech, E(n,ω), and of noisy speech, E(n,ω)+A (white noise at 10 dB SNR), with the differences in DC component and gain marked]
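A small numeric illustration of the gain and DC effect in the log domain. The values of E and A below are hypothetical, picked only to make the effect visible; they are not taken from the slides.

```python
import math

E = [1e-3, 1e-2, 1e-1, 1.0]   # hypothetical clean power values E(n, w) over four frames
A = 0.1                        # hypothetical additive noise power

clean_log = [math.log(e) for e in E]
noisy_log = [math.log(e + A) for e in E]

# Gain: additive noise compresses the dynamic range of the log trajectory.
clean_range = max(clean_log) - min(clean_log)
noisy_range = max(noisy_log) - min(noisy_log)

# DC component: additive noise raises the average log level.
clean_mean = sum(clean_log) / len(clean_log)
noisy_mean = sum(noisy_log) / len(noisy_log)

print(f"dynamic range: clean {clean_range:.2f}, noisy {noisy_range:.2f}")
print(f"mean level:    clean {clean_mean:.2f}, noisy {noisy_mean:.2f}")
```

The noisy trajectory has a smaller range and a higher mean than the clean one. These are the two distortions that a normalization step such as DRA and a filter that removes slowly varying components are meant to counter.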

25 Running Spectrum
The running spectrum is obtained by accumulating short-time spectra (DFT) over successive frames.
Running spectrum: the time trajectory of each frequency component.
[Figure: running spectrum; axes: frequency, frame number]

26 Spectral Subtraction
Estimate the noise spectrum from the short-time spectra of the first several frames.
Subtract the estimated spectrum from each short-time spectrum.
[Figure: running spectrum of noisy speech (white noise at 5 dB SNR), before and after subtraction]
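A minimal sketch of the two steps on this slide. The number of noise frames and the spectral floor (`floor`, which keeps the result positive) are assumptions; the slide specifies neither.

```python
def spectral_subtraction(spectra, n_noise_frames=5, floor=0.01):
    """spectra[n][b]: short-time power spectrum of frame n at frequency bin b."""
    n_bins = len(spectra[0])
    # Step 1: estimate the noise spectrum from the first several frames.
    noise = [sum(frame[b] for frame in spectra[:n_noise_frames]) / n_noise_frames
             for b in range(n_bins)]
    # Step 2: subtract it from every frame, never going below a small
    # fraction of the original value (hypothetical flooring rule).
    return [[max(frame[b] - noise[b], floor * frame[b]) for b in range(n_bins)]
            for frame in spectra]
```

With five noise-only frames `[2.0, 4.0]` followed by a frame `[7.0, 4.5]`, the last frame becomes `[5.0, 0.5]` and the noise-only frames are reduced to the floor.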

27 Noise Reduction Techniques
Conventional method: spectral subtraction
  Parameters are not optimized for speech from various environments.
  Excessive subtraction may cause musical noise.
Robust speech feature extraction
  Advanced speech analysis using RSF (running spectrum filtering) and DRA (dynamic range adjustment).

28 Modulation Spectrum
RSF focuses on the modulation spectrum.
Modulation spectrum: the spectrum of the time trajectory of each frequency component, obtained by applying a DFT along the frame axis at each frequency.
[Figure: running spectrum (axes: frequency, frame number) and modulation spectrum (axes: frequency, modulation frequency)]
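The "DFT on each frequency" step can be sketched as follows. The function name and the `frame_rate_hz` parameter are illustrative choices, and a naive DFT is used so that no libraries are needed.

```python
import cmath
import math

def modulation_spectrum(running_spectrum, frame_rate_hz):
    """running_spectrum[n][b]: value of frequency bin b at frame n.

    Returns (mod_freqs, mod), where mod[b][m] is the magnitude of the
    trajectory of bin b at modulation frequency mod_freqs[m].
    """
    n_frames = len(running_spectrum)
    n_bins = len(running_spectrum[0])
    mod_freqs = [m * frame_rate_hz / n_frames for m in range(n_frames // 2 + 1)]
    mod = []
    for b in range(n_bins):
        # Time trajectory of one frequency bin across frames
        traj = [frame[b] for frame in running_spectrum]
        # DFT along the frame axis
        mod.append([abs(sum(traj[n] * cmath.exp(-2j * math.pi * m * n / n_frames)
                            for n in range(n_frames)))
                    for m in range(n_frames // 2 + 1)])
    return mod_freqs, mod
```

For a bin whose value oscillates four times per second at 100 frames per second, the largest non-DC component appears at a modulation frequency of 4 Hz.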

29 Mod-F of Clean and Noisy Speech
Speech components are dominant around 4 Hz in the modulation spectrum.
Lower modulation frequency components can be treated as noise, because noise components change little over time.
[Figure: modulation spectra of clean speech and noisy speech (white noise at 5 dB SNR)]

30 RSF (Running Spectrum Filtering)
Speech components are dominant around 4 Hz in the modulation spectrum.
[Figure: modulation spectrum (axes: frequency (Hz), modulation frequency [Hz]) with noise components, speech components, and the unnecessary part marked]

31 RSF
RSF (Running Spectrum Filtering)
  enhances perceptual auditory components.
  relatively decreases noise components by band-pass filtering of cepstral sequences.
$\tilde{C}(n,k) = \sum_{i=0}^{Q} h(i)\, C(n-i,k)$
h(i): coefficients of the FIR filter
[Figure: modulation frequency response of RASTA (IIR) and RSF]
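A sketch of FIR band-pass filtering along the frame axis of each cepstral coefficient, following the sum over h(i) C(n-i, k). The slides do not give the filter coefficients or the pass band, so the windowed-sinc design and the cutoff and order values in the usage note are assumptions for illustration.

```python
import math

def bandpass_fir(low_hz, high_hz, frame_rate_hz, order):
    """Hamming-windowed sinc band-pass filter; returns h(0), ..., h(Q) with Q = order."""
    def lowpass(cut_hz, i):
        fc = cut_hz / frame_rate_hz          # cutoff in cycles per frame
        t = i - order / 2.0                  # tap position relative to the center
        if t == 0:
            return 2.0 * fc
        return math.sin(2.0 * math.pi * fc * t) / (math.pi * t)
    return [(lowpass(high_hz, i) - lowpass(low_hz, i))
            * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / order))
            for i in range(order + 1)]

def rsf(cepstra, h):
    """Filter each cepstral trajectory: out[n][k] = sum_i h[i] * cepstra[n - i][k].

    Frames before the start of the utterance (n - i < 0) are treated as zero.
    """
    n_frames, n_ceps = len(cepstra), len(cepstra[0])
    return [[sum(h[i] * cepstra[n - i][k] for i in range(len(h)) if n - i >= 0)
             for k in range(n_ceps)]
            for n in range(n_frames)]
```

For example, `h = bandpass_fir(1.0, 12.0, 100.0, 200)` builds a hypothetical 1 to 12 Hz pass band at 100 frames per second. A constant (0 Hz) trajectory is then almost removed once the filter has filled, while a 4 Hz trajectory passes with close to unit gain. A causal filter of order Q delays the output by Q/2 frames.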

32 DRA
DRA (Dynamic Range Adjustment)
  normalizes the amplitude of cepstral vectors in the time domain (using the maximum value during the utterance).
  suppresses dynamic range distortions caused by additive noise.
$\hat{C}(n,k) = \tilde{C}(n,k) \,/\, \max_{1 \le m \le T} |\tilde{C}(m,k)|$
T: number of frames in the utterance
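A sketch of the normalization described on this slide: each cepstral order is divided by its largest magnitude over the utterance. Using the absolute value and mapping an all-zero trajectory to zero are assumptions.

```python
def dra(cepstra):
    """cepstra[n][k]: coefficient of order k at frame n, for one utterance."""
    n_ceps = len(cepstra[0])
    # Maximum magnitude of each cepstral order over all frames of the utterance
    peak = [max(abs(frame[k]) for frame in cepstra) for k in range(n_ceps)]
    # Scale every trajectory into the range [-1, 1]
    return [[frame[k] / peak[k] if peak[k] > 0 else 0.0 for k in range(n_ceps)]
            for frame in cepstra]
```

For `[[1.0, -4.0], [2.0, 2.0], [-0.5, 1.0]]` the peaks are 2.0 and 4.0, so the result is `[[0.5, -1.0], [1.0, 0.5], [-0.25, 0.25]]`.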

33 RSF / DRA
Comparison of cepstral time trajectories at the 4th order.
[Figure: clean and noisy trajectories for the baseline and after RSF/DRA processing]

34 HU-SCS Automatic Speech Detection HW with Low Power Automatic Speech Recognition Automatic Speech Rejection

35 Likelihoods of HMM
[Figure: HMM whose states each hold a GMM (average, variance)]
Approximation by many multidimensional Gaussian distributions.

36 Evaluation on Likelihoods
[Figure: the MFCC input is scored against each HMM, giving likelihoods p1, p2, ..., p9, ...]
The maximum likelihood is selected, and its label is output as the recognition result.
The result is correct, isn't it?

37 Evaluation of Reliability
[Figure: two likelihood plots]
The result of the top score is trusted.
The result of the top score is NOT trusted.
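One simple way to decide whether the top score should be trusted is to compare it with the runner-up. The sketch below is a stand-in, not the HU-SCS rejection method: the margin criterion, the function name, and the threshold `min_margin` are all hypothetical.

```python
def recognize_with_rejection(log_likelihoods, min_margin=5.0):
    """log_likelihoods: dict mapping each candidate label to its log-likelihood.

    Returns (label, margin). The label is None when the result is rejected.
    """
    ranked = sorted(log_likelihoods.items(), key=lambda kv: kv[1], reverse=True)
    (best_label, best), (_, second) = ranked[0], ranked[1]
    margin = best - second
    # Trust the top candidate only if it clearly beats the second candidate.
    trusted = margin >= min_margin
    return (best_label if trusted else None), margin
```

With scores of -120.0, -150.0 and -160.0 for "Good Morning", "See you" and "How are you?", the call returns `("Good Morning", 30.0)`. With two candidates only 1.0 apart, the result is rejected.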

38 Rejection Method using Multiple Criteria
Criteria: Tendency, Ratio, Maximum Score, Square of Ratio.
[Figure: MFCC, cluster group, evaluation of cluster]
New type of speech rejection for noisy conditions.

39 HU-SCS Automatic Speech Detection HW with Low Power Automatic Speech Recognition Automatic Speech Rejection

40 Overview of ASR System
Current ASR systems adopt robust processing that removes the influence of noise distortions.
Speech Data
Speech Analysis: convert to spectrum or cepstrum
Robust Processing: decrease noise distortions
Speech Feature Vectors
Speech Recognition: calculate probability (likelihood scores)
Results
Reference Models: prepare reference patterns by speech training

41 Circuit Structure of Complete Recognition System
[Figure: block diagram. Blocks: Speech Signal, Speech Analysis, Robust Processing, Speech Recognition, SRAM (three blocks), Data Control, System Control, External Memory (SRAM), from/to Processor]

42 Circuit Implementations
Required Operating Performance: Speech Analysis 10 MIPS; Robust Processing 500 MIPS (mainly in FIR)
[Figure: datapath from Speech Data through Speech Analysis (MFCC) and Robust Processing (RSF/DRA) to Feature Vectors. Units: FFT, IDCT, FIR, Log, Divider, Buffer (two), Cos/Sin ROM. ROM and RAM sizes shown: 256*16 bits, 512*24 bits, 4096*16 bits, 256*24 bits]

43 Block Diagram
Interfaces: Microprocessor, External RAM, and Master/Slave
[Figure: blocks MPU Interface, Bus Control, System Control, MFCC, RSF/DRA, HMM, SRAM interface, SRAM, Data Control. Signals: Master Bus, Slave Bus, Interrupt Signal, CLK, SW, RESET, Address, Chip Select. Memory contents: filter coefficients for RSF; working memory for MFCC and RSF; feature parameters before speech detection]
All rights reserved. Copyright Yoshikazu Miyanaga

44 New Scalable Architectures
Two types of scalable techniques are applied to the system.
(1) Multiple Process Elements (PEs) in the HMM Circuit
    The PEs enable high-speed processing and improve recognition performance.
(2) Master/Slave Operation in the Complete System
    The operation enables high-speed processing and increases the vocabulary size.

45 HMM (Hidden Markov Models)
Hidden Markov Models (HMM): a statistical modeling approach using a Markov chain. Powerful for expressing time-varying data sequences and robust to speaker differences.
$q_n$ ($1 \le n \le N$): set of states
$a_{ij}$: state transition probability
$b_1(k), b_2(k), \ldots, b_N(k)$: output probability
[Figure: left-to-right HMM with states q1 to q4, self-transitions a11, a22, a33, a44 and forward transitions a12, a23, a34, a45]

46 Full-Parallel Computations in HMM
The output probabilities and temporal scores can be computed concurrently for all HMM states.
[Figure: the observation o_t feeds one Output Prob. Calc. unit and one Score Calc. unit per state; each score unit selects the maximum, Max(δ), including the path for the upper state]
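Why the per-state work can run concurrently is easiest to see in a Viterbi update for a left-to-right HMM: every state's new score depends only on the previous frame. The sketch below is a software illustration under stated assumptions (one diagonal Gaussian per state instead of a GMM, hypothetical argument names), not a model of the circuit.

```python
import math

NEG_INF = float("-inf")

def log_gauss(o, mean, var):
    """Log density of a diagonal-covariance Gaussian (stand-in for a state's GMM)."""
    return -0.5 * sum(math.log(2.0 * math.pi * v) + (x - m) ** 2 / v
                      for x, m, v in zip(o, mean, var))

def viterbi_score(obs, means, variances, log_self, log_next):
    """Best-path log score of a left-to-right HMM that ends in its last state.

    log_self[j] = log a_jj, log_next[j] = log a_j,j+1 (one entry fewer than states).
    """
    n_states = len(means)
    delta = [NEG_INF] * n_states
    delta[0] = log_gauss(obs[0], means[0], variances[0])   # paths start in state 0
    for o in obs[1:]:
        # The three lists below use only the previous frame's delta, so each
        # index j is independent and could be evaluated by its own unit.
        out = [log_gauss(o, means[j], variances[j]) for j in range(n_states)]
        stay = [delta[j] + log_self[j] for j in range(n_states)]
        move = [NEG_INF] + [delta[j - 1] + log_next[j - 1] for j in range(1, n_states)]
        # Select the better predecessor, then add the output log probability.
        delta = [max(stay[j], move[j]) + out[j] for j in range(n_states)]
    return delta[-1]
```

In Python the list comprehensions still run sequentially. The point is only that no entry for state j waits on another state's result within the same frame.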

47 Master/Slave Operation
(1) Set Reference Data
(2) Speech Analysis and Robust Processing
(3) Broadcast
(4) Speech Recognition
(5) Gather Results
[Figure: Microprocessor, RAM, Master, Slave1, Slave2, Slave3]

48 Master/Slave Operation
(1) Set Reference Data
(2) Speech Analysis and Robust Processing
(3) Broadcast
(4) Speech Recognition
(5) Gather Results
[Figure: Microprocessor, RAM, Master [4], Slave1 [3], Slave2 [2], Slave3 [1]]

49 Master/Slave Operation
(1) Set Reference Data
(2) Speech Analysis and Robust Processing
(3) Broadcast
(4) Speech Recognition
(5) Gather Results
[Figure: Microprocessor, RAM, Master, Slave1, Slave2, Slave3]

50 Master/Slave Operation
(1) Set Reference Data
(2) Speech Analysis and Robust Processing
(3) Broadcast
(4) Speech Recognition
(5) Gather Results
[Figure: Microprocessor, RAM, Master [1], Slave1, Slave2, Slave3 [2]]

51 Master/Slave Operation (2)
(1) Set Reference Data
(2) Speech Analysis and Robust Processing
(3) Broadcast
(4) Speech Recognition
(5) Gather Results
[Figure: Microprocessor, RAM, Master, Slave1, Slave2, Slave3]

52 Master/Slave Operation (2)
(1) Set Reference Data
(2) Speech Analysis and Robust Processing
(3) Broadcast
(4) Speech Recognition
(5) Gather Results
[Figure: Microprocessor, RAM, Master [4], Slave1 [3], Slave2 [2], Slave3 [1]]
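The broadcast and gather steps can be pictured in software as below. This sketch assumes that each chip holds the reference models for its own part of the vocabulary, which the slides do not state explicitly; the function and parameter names are hypothetical.

```python
def master_slave_recognize(features, vocab_shards, score):
    """vocab_shards: one word list per chip (master first, then the slaves).

    score(features, word) returns the likelihood score of one word model.
    """
    # (3) Broadcast: every chip receives the same feature vectors.
    # (4) Speech Recognition: each chip scores only the words it stores.
    local_best = [max((score(features, word), word) for word in shard)
                  for shard in vocab_shards]
    # (5) Gather Results: the master keeps the best candidate overall.
    best_score, best_word = max(local_best)
    return best_word, best_score
```

Adding a shard adds vocabulary without lengthening the per-chip search, which is one reading of how the operation "increases the vocabulary size".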

53 Circuit Design (Analysis & HMM TEG)
Technology: Rohm CMOS 0.35 μm
Cell library: Univ. of Tokyo EXD Standard Cell Library
Voltage Supply: 3.3 V
Design: RTL level, Verilog-HDL
[Table: Evaluation. Columns: Clock Freq. (MHz), Proc. Time (ms/word), Power Consumption (mW); values missing]
[Figure: V2 layout view]

54 Comparison on Power Consumption
Proposed HW (10 MHz) and DSP Design (80 MIPS)
Processor Structure: DSP-based system: TMS320C549, 80 MIPS; Proposed system: dedicated processor, 10 MHz
[Table rows: Memory Access Time (ns); Processor (mW) (Core: 3.3 V); Memory (mW) (SRAM, Core: 3.3 V); Total; values missing]

55 Processing Time of HU-SCS
Comparison with Software Design: 54 times faster.
No high-speed clock. Useful for low-power design.
[Table columns: Proposed System (Hardware), Pentium 4 (Software). Rows: No. arithmetic units; No. cycles (455,200 and -); Frequency (MHz); Recognition processing time (ms); other values missing]
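The surviving cycle count can be turned into a processing time once a clock is assumed. The table's own frequency entry is missing, so the 10 MHz used below is borrowed from the "Proposed HW (10MHz)" figure quoted on the power comparison slide and is an assumption, not a value from this table.

```python
cycles = 455_200          # "No. cycles" for the proposed hardware (from the slide)
clock_hz = 10e6           # assumed clock: 10 MHz

time_ms = cycles / clock_hz * 1e3
print(f"{time_ms:.2f} ms")
```

Under that assumption the recognition processing time works out to roughly 45.5 ms.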

56 Design by Standard Cells
Technology: TSMC 0.25 µm CMOS Standard Cell
Voltage: 2.5 V
Highest Clock Rate: 80.6 MHz (12.4 ns, temperature condition: typical)
No. Parallel Processing: 32 and 8
HMM: 491, (32 parallel) and ,980 (8 parallel)
RSF/DRA: 11,910
MFCC: 39,670
System Control: 18,310
Bus Control: 1,310
SRAM: 63,400
Total: 626, (32 parallel) and ,580 (8 parallel)

57 Current HU-SCS
PC interface with HU-SCS board
[Figure: HU-SCS board, 55 mm x 44 mm]

58 Overview of Current HU-SCS
Improvement of Noise Robustness
  Accurate ASR under SNR 0-10 dB
  Robustness against echo
Improvement of Speech Recognition
  Higher accuracy in MFCC calculation
  Low-power design and higher-speed processing
Improvement of Total HW System
  Higher-speed response time

59 Comparison on Performance
Correctness by environment (current and previous versions):
Environment                             Noise Level   Current   Previous
Meeting Room                            50 dB         96.4%     90.0%
Elevator                                50 dB         95.0%     84.4%
Stairs                                  45 dB         85.1%     50.5%
Car A (Idling, No-Moving)               50 dB         99.4%     95.6%
Car B (High Speed, Open Window)         75 dB         93.3%     85.0%
Car C (High Speed, Audio ON (FM))       75 dB         88.9%     65.6%
Total                                                 93.0%     78.5%
Cruiser Board (Outside, high speed)     80 dB         82.7%     -
[Figure: bar chart comparing HU-SCS v4 (current) with the previous version]
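The Total row appears to be the unweighted mean of the six environments listed above it, with the cruiser row left out. A quick check of that reading, using the figures from the slide:

```python
current = {"Meeting Room": 96.4, "Elevator": 95.0, "Stairs": 85.1,
           "Car A": 99.4, "Car B": 93.3, "Car C": 88.9}
previous = {"Meeting Room": 90.0, "Elevator": 84.4, "Stairs": 50.5,
            "Car A": 95.6, "Car B": 85.0, "Car C": 65.6}

# Unweighted mean over the six environments
mean_current = sum(current.values()) / len(current)
mean_previous = sum(previous.values()) / len(previous)

print(round(mean_current, 1), round(mean_previous, 1))   # prints 93.0 78.5
```

Both rounded means match the Total row (93.0% and 78.5%).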

60 Results on Some Distances
[Figures: recognition accuracy (%) at microphone distances of 30 cm, 60 cm and 90 cm; panels: Car A, Car B, Car C, Meeting Room, Elevator, Stairs]

61 Robot Implementation
Speech Recognition & Synthesis
Quick Response
Control to Consumer Electronics and Machines

62 Communications and Controls

63 Summary
Hokkaido University Speech Communication System
Integrated architecture of Speech Detection, Robust Speech Analysis, Speech Recognition, and Speech Rejection
Higher-speed processing than DSP and software
More energy-efficient than DSP solutions
Improved noise robustness through the RSF/DRA technique
Small, fast, and low power

64 Who? Yoshikazu Miyanaga He received the B.S., M.S., and Dr. Eng. degrees from Hokkaido University, Sapporo, Japan, in 1979, 1981, and 1986, respectively. He is currently a Professor at Graduate School of Information Science and Technology, Hokkaido University. His research interests are in the areas of signal processing for wireless communications, nonlinear signal processing and low-power LSI systems. He was a chair of Technical Group on Smart Info-Media System, IEICE. He is an advisory member of this technical group. Currently, he is IEICE fellow. He served as a member in the board of directors, IEEE Japan Council as a chair of student activity committee from 2002 to He is a chair of student activity committee in IEEE Sapporo Section from He is a chair of IEEE Circuits and Systems Society, Digital Signal Processing Technical Committee from He has been serving as international steering committee chairs/members of IEEE ISPACS, IEEE ISCIT, IEEE/EURASIP NSIP and honorary/general chairs/co-chairs of their international symposiums/workshops, i.e., ISPACS 2003, ISCIT 2004, ISCIT 2005, NSIP 2005, ISPACS 2008, ISMAC 2009 and APSIPA ASC He also served as international organizing committee chairs of IEICE ITC-CSCC , IEEE MSCAS 2004, IEEE ISCAS



More information

A DEVICE FOR AUTOMATIC SPEECH RECOGNITION*

A DEVICE FOR AUTOMATIC SPEECH RECOGNITION* EVICE FOR UTOTIC SPEECH RECOGNITION* ats Blomberg and Kjell Elenius INTROUCTION In the following a device for automatic recognition of isolated words will be described. It was developed at The department

More information

Low Power Microphone Acquisition and Processing for Always-on Applications Based on Microcontrollers

Low Power Microphone Acquisition and Processing for Always-on Applications Based on Microcontrollers Low Power Microphone Acquisition and Processing for Always-on Applications Based on Microcontrollers Architecture I: standalone µc Microphone Microcontroller User Output Microcontroller used to implement

More information

Frequency Domain Analysis for Noise Suppression Using Spectral Processing Methods for Degraded Speech Signal in Speech Enhancement

Frequency Domain Analysis for Noise Suppression Using Spectral Processing Methods for Degraded Speech Signal in Speech Enhancement Frequency Domain Analysis for Noise Suppression Using Spectral Processing Methods for Degraded Speech Signal in Speech Enhancement 1 Zeeshan Hashmi Khateeb, 2 Gopalaiah 1,2 Department of Instrumentation

More information

Multimedia Signal Processing: Theory and Applications in Speech, Music and Communications

Multimedia Signal Processing: Theory and Applications in Speech, Music and Communications Brochure More information from http://www.researchandmarkets.com/reports/569388/ Multimedia Signal Processing: Theory and Applications in Speech, Music and Communications Description: Multimedia Signal

More information

Robust Voice Activity Detection Based on Discrete Wavelet. Transform

Robust Voice Activity Detection Based on Discrete Wavelet. Transform Robust Voice Activity Detection Based on Discrete Wavelet Transform Kun-Ching Wang Department of Information Technology & Communication Shin Chien University kunching@mail.kh.usc.edu.tw Abstract This paper

More information

Automotive three-microphone voice activity detector and noise-canceller

Automotive three-microphone voice activity detector and noise-canceller Res. Lett. Inf. Math. Sci., 005, Vol. 7, pp 47-55 47 Available online at http://iims.massey.ac.nz/research/letters/ Automotive three-microphone voice activity detector and noise-canceller Z. QI and T.J.MOIR

More information

Cepstrum alanysis of speech signals

Cepstrum alanysis of speech signals Cepstrum alanysis of speech signals ELEC-E5520 Speech and language processing methods Spring 2016 Mikko Kurimo 1 /48 Contents Literature and other material Idea and history of cepstrum Cepstrum and LP

More information

Separating Voiced Segments from Music File using MFCC, ZCR and GMM

Separating Voiced Segments from Music File using MFCC, ZCR and GMM Separating Voiced Segments from Music File using MFCC, ZCR and GMM Mr. Prashant P. Zirmite 1, Mr. Mahesh K. Patil 2, Mr. Santosh P. Salgar 3,Mr. Veeresh M. Metigoudar 4 1,2,3,4Assistant Professor, Dept.

More information

Feature Extraction Using 2-D Autoregressive Models For Speaker Recognition

Feature Extraction Using 2-D Autoregressive Models For Speaker Recognition Feature Extraction Using 2-D Autoregressive Models For Speaker Recognition Sriram Ganapathy 1, Samuel Thomas 1 and Hynek Hermansky 1,2 1 Dept. of ECE, Johns Hopkins University, USA 2 Human Language Technology

More information

EECS 452 Midterm Exam Winter 2012

EECS 452 Midterm Exam Winter 2012 EECS 452 Midterm Exam Winter 2012 Name: unique name: Sign the honor code: I have neither given nor received aid on this exam nor observed anyone else doing so. Scores: # Points Section I /40 Section II

More information

Next Generation Wireless Communication System

Next Generation Wireless Communication System Next Generation Wireless Communication System - Cognitive System and High Speed Wireless - Yoshikazu Miyanaga Distinguished Lecturer, IEEE Circuits and Systems Society Hokkaido University Laboratory of

More information

Dimension Reduction of the Modulation Spectrogram for Speaker Verification

Dimension Reduction of the Modulation Spectrogram for Speaker Verification Dimension Reduction of the Modulation Spectrogram for Speaker Verification Tomi Kinnunen Speech and Image Processing Unit Department of Computer Science University of Joensuu, Finland tkinnu@cs.joensuu.fi

More information

Announcements. Today. Speech and Language. State Path Trellis. HMMs: MLE Queries. Introduction to Artificial Intelligence. V22.

Announcements. Today. Speech and Language. State Path Trellis. HMMs: MLE Queries. Introduction to Artificial Intelligence. V22. Introduction to Artificial Intelligence Announcements V22.0472-001 Fall 2009 Lecture 19: Speech Recognition & Viterbi Decoding Rob Fergus Dept of Computer Science, Courant Institute, NYU Slides from John

More information

Online Blind Channel Normalization Using BPF-Based Modulation Frequency Filtering

Online Blind Channel Normalization Using BPF-Based Modulation Frequency Filtering Online Blind Channel Normalization Using BPF-Based Modulation Frequency Filtering Yun-Kyung Lee, o-young Jung, and Jeon Gue Par We propose a new bandpass filter (BPF)-based online channel normalization

More information

Applications of Music Processing

Applications of Music Processing Lecture Music Processing Applications of Music Processing Christian Dittmar International Audio Laboratories Erlangen christian.dittmar@audiolabs-erlangen.de Singing Voice Detection Important pre-requisite

More information

Adaptive Filters Application of Linear Prediction

Adaptive Filters Application of Linear Prediction Adaptive Filters Application of Linear Prediction Gerhard Schmidt Christian-Albrechts-Universität zu Kiel Faculty of Engineering Electrical Engineering and Information Technology Digital Signal Processing

More information

Mikko Myllymäki and Tuomas Virtanen

Mikko Myllymäki and Tuomas Virtanen NON-STATIONARY NOISE MODEL COMPENSATION IN VOICE ACTIVITY DETECTION Mikko Myllymäki and Tuomas Virtanen Department of Signal Processing, Tampere University of Technology Korkeakoulunkatu 1, 3370, Tampere,

More information

Modulation Spectrum Power-law Expansion for Robust Speech Recognition

Modulation Spectrum Power-law Expansion for Robust Speech Recognition Modulation Spectrum Power-law Expansion for Robust Speech Recognition Hao-Teng Fan, Zi-Hao Ye and Jeih-weih Hung Department of Electrical Engineering, National Chi Nan University, Nantou, Taiwan E-mail:

More information

VLSI Implementation of Digital Down Converter (DDC)

VLSI Implementation of Digital Down Converter (DDC) Volume-7, Issue-1, January-February 2017 International Journal of Engineering and Management Research Page Number: 218-222 VLSI Implementation of Digital Down Converter (DDC) Shaik Afrojanasima 1, K Vijaya

More information

SONG RETRIEVAL SYSTEM USING HIDDEN MARKOV MODELS

SONG RETRIEVAL SYSTEM USING HIDDEN MARKOV MODELS SONG RETRIEVAL SYSTEM USING HIDDEN MARKOV MODELS AKSHAY CHANDRASHEKARAN ANOOP RAMAKRISHNA akshayc@cmu.edu anoopr@andrew.cmu.edu ABHISHEK JAIN GE YANG ajain2@andrew.cmu.edu younger@cmu.edu NIDHI KOHLI R

More information

Performance study of Text-independent Speaker identification system using MFCC & IMFCC for Telephone and Microphone Speeches

Performance study of Text-independent Speaker identification system using MFCC & IMFCC for Telephone and Microphone Speeches Performance study of Text-independent Speaker identification system using & I for Telephone and Microphone Speeches Ruchi Chaudhary, National Technical Research Organization Abstract: A state-of-the-art

More information

SOUND SOURCE RECOGNITION AND MODELING

SOUND SOURCE RECOGNITION AND MODELING SOUND SOURCE RECOGNITION AND MODELING CASA seminar, summer 2000 Antti Eronen antti.eronen@tut.fi Contents: Basics of human sound source recognition Timbre Voice recognition Recognition of environmental

More information

Power Normalized Cepstral Coefficient for Speaker Diarization and Acoustic Echo Cancellation

Power Normalized Cepstral Coefficient for Speaker Diarization and Acoustic Echo Cancellation Power Normalized Cepstral Coefficient for Speaker Diarization and Acoustic Echo Cancellation Sherbin Kanattil Kassim P.G Scholar, Department of ECE, Engineering College, Edathala, Ernakulam, India sherbin_kassim@yahoo.co.in

More information

Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter

Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter Ching-Ta Lu, Kun-Fu Tseng 2, Chih-Tsung Chen 2 Department of Information Communication, Asia University, Taichung, Taiwan, ROC

More information

Signal Processing for Speech Applications - Part 2-1. Signal Processing For Speech Applications - Part 2

Signal Processing for Speech Applications - Part 2-1. Signal Processing For Speech Applications - Part 2 Signal Processing for Speech Applications - Part 2-1 Signal Processing For Speech Applications - Part 2 May 14, 2013 Signal Processing for Speech Applications - Part 2-2 References Huang et al., Chapter

More information

A CONSTRUCTION OF COMPACT MFCC-TYPE FEATURES USING SHORT-TIME STATISTICS FOR APPLICATIONS IN AUDIO SEGMENTATION

A CONSTRUCTION OF COMPACT MFCC-TYPE FEATURES USING SHORT-TIME STATISTICS FOR APPLICATIONS IN AUDIO SEGMENTATION 17th European Signal Processing Conference (EUSIPCO 2009) Glasgow, Scotland, August 24-28, 2009 A CONSTRUCTION OF COMPACT MFCC-TYPE FEATURES USING SHORT-TIME STATISTICS FOR APPLICATIONS IN AUDIO SEGMENTATION

More information

IMPROVEMENTS TO THE IBM SPEECH ACTIVITY DETECTION SYSTEM FOR THE DARPA RATS PROGRAM

IMPROVEMENTS TO THE IBM SPEECH ACTIVITY DETECTION SYSTEM FOR THE DARPA RATS PROGRAM IMPROVEMENTS TO THE IBM SPEECH ACTIVITY DETECTION SYSTEM FOR THE DARPA RATS PROGRAM Samuel Thomas 1, George Saon 1, Maarten Van Segbroeck 2 and Shrikanth S. Narayanan 2 1 IBM T.J. Watson Research Center,

More information

Optimal Adaptive Filtering Technique for Tamil Speech Enhancement

Optimal Adaptive Filtering Technique for Tamil Speech Enhancement Optimal Adaptive Filtering Technique for Tamil Speech Enhancement Vimala.C Project Fellow, Department of Computer Science Avinashilingam Institute for Home Science and Higher Education and Women Coimbatore,

More information

Implementation of SYMLET Wavelets to Removal of Gaussian Additive Noise from Speech Signal

Implementation of SYMLET Wavelets to Removal of Gaussian Additive Noise from Speech Signal Implementation of SYMLET Wavelets to Removal of Gaussian Additive Noise from Speech Signal Abstract: MAHESH S. CHAVAN, * NIKOS MASTORAKIS, MANJUSHA N. CHAVAN, *** M.S. GAIKWAD Department of Electronics

More information

IMPLEMENTATION OF SPEECH RECOGNITION SYSTEM USING DSP PROCESSOR ADSP2181

IMPLEMENTATION OF SPEECH RECOGNITION SYSTEM USING DSP PROCESSOR ADSP2181 IMPLEMENTATION OF SPEECH RECOGNITION SYSTEM USING DSP PROCESSOR ADSP2181 1 KALPANA JOSHI, 2 NILIMA KOLHARE & 3 V.M.PANDHARIPANDE 1&2 Dept.of Electronics and Telecommunication Engg, Government College of

More information

SPEECH ENHANCEMENT USING A ROBUST KALMAN FILTER POST-PROCESSOR IN THE MODULATION DOMAIN. Yu Wang and Mike Brookes

SPEECH ENHANCEMENT USING A ROBUST KALMAN FILTER POST-PROCESSOR IN THE MODULATION DOMAIN. Yu Wang and Mike Brookes SPEECH ENHANCEMENT USING A ROBUST KALMAN FILTER POST-PROCESSOR IN THE MODULATION DOMAIN Yu Wang and Mike Brookes Department of Electrical and Electronic Engineering, Exhibition Road, Imperial College London,

More information

MODIFIED DCT BASED SPEECH ENHANCEMENT IN VEHICULAR ENVIRONMENTS

MODIFIED DCT BASED SPEECH ENHANCEMENT IN VEHICULAR ENVIRONMENTS MODIFIED DCT BASED SPEECH ENHANCEMENT IN VEHICULAR ENVIRONMENTS 1 S.PRASANNA VENKATESH, 2 NITIN NARAYAN, 3 K.SAILESH BHARATHWAAJ, 4 M.P.ACTLIN JEEVA, 5 P.VIJAYALAKSHMI 1,2,3,4,5 SSN College of Engineering,

More information

Performance analysis of voice activity detection algorithm for robust speech recognition system under different noisy environment

Performance analysis of voice activity detection algorithm for robust speech recognition system under different noisy environment BABU et al: VOICE ACTIVITY DETECTION ALGORITHM FOR ROBUST SPEECH RECOGNITION SYSTEM Journal of Scientific & Industrial Research Vol. 69, July 2010, pp. 515-522 515 Performance analysis of voice activity

More information

MOST MODERN automatic speech recognition (ASR)

MOST MODERN automatic speech recognition (ASR) IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 5, NO. 5, SEPTEMBER 1997 451 A Model of Dynamic Auditory Perception and Its Application to Robust Word Recognition Brian Strope and Abeer Alwan, Member,

More information

24 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 17, NO. 1, JANUARY /$ IEEE

24 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 17, NO. 1, JANUARY /$ IEEE 24 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 17, NO. 1, JANUARY 2009 Speech Enhancement, Gain, and Noise Spectrum Adaptation Using Approximate Bayesian Estimation Jiucang Hao, Hagai

More information

Emanuël A. P. Habets, Jacob Benesty, and Patrick A. Naylor. Presented by Amir Kiperwas

Emanuël A. P. Habets, Jacob Benesty, and Patrick A. Naylor. Presented by Amir Kiperwas Emanuël A. P. Habets, Jacob Benesty, and Patrick A. Naylor Presented by Amir Kiperwas 1 M-element microphone array One desired source One undesired source Ambient noise field Signals: Broadband Mutually

More information

Robustness (cont.); End-to-end systems

Robustness (cont.); End-to-end systems Robustness (cont.); End-to-end systems Steve Renals Automatic Speech Recognition ASR Lecture 18 27 March 2017 ASR Lecture 18 Robustness (cont.); End-to-end systems 1 Robust Speech Recognition ASR Lecture

More information

Speech Enhancement using Wiener filtering

Speech Enhancement using Wiener filtering Speech Enhancement using Wiener filtering S. Chirtmay and M. Tahernezhadi Department of Electrical Engineering Northern Illinois University DeKalb, IL 60115 ABSTRACT The problem of reducing the disturbing

More information

Speech Enhancement: Reduction of Additive Noise in the Digital Processing of Speech

Speech Enhancement: Reduction of Additive Noise in the Digital Processing of Speech Speech Enhancement: Reduction of Additive Noise in the Digital Processing of Speech Project Proposal Avner Halevy Department of Mathematics University of Maryland, College Park ahalevy at math.umd.edu

More information

Speech Enhancement Techniques using Wiener Filter and Subspace Filter

Speech Enhancement Techniques using Wiener Filter and Subspace Filter IJSTE - International Journal of Science Technology & Engineering Volume 3 Issue 05 November 2016 ISSN (online): 2349-784X Speech Enhancement Techniques using Wiener Filter and Subspace Filter Ankeeta

More information

Adaptive Speech Enhancement Using Partial Differential Equations and Back Propagation Neural Networks

Adaptive Speech Enhancement Using Partial Differential Equations and Back Propagation Neural Networks Australian Journal of Basic and Applied Sciences, 4(7): 2093-2098, 2010 ISSN 1991-8178 Adaptive Speech Enhancement Using Partial Differential Equations and Back Propagation Neural Networks 1 Mojtaba Bandarabadi,

More information

DERIVATION OF TRAPS IN AUDITORY DOMAIN

DERIVATION OF TRAPS IN AUDITORY DOMAIN DERIVATION OF TRAPS IN AUDITORY DOMAIN Petr Motlíček, Doctoral Degree Programme (4) Dept. of Computer Graphics and Multimedia, FIT, BUT E-mail: motlicek@fit.vutbr.cz Supervised by: Dr. Jan Černocký, Prof.

More information

Cascaded Noise-Shaping Modulators for Oversampled Data Conversion

Cascaded Noise-Shaping Modulators for Oversampled Data Conversion Cascaded Noise-Shaping Modulators for Oversampled Data Conversion Bruce A. Wooley Stanford University B. Wooley, Stanford, 2004 1 Outline Oversampling modulators for A/D conversion Cascaded noise-shaping

More information

Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm

Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm International OPEN ACCESS Journal Of Modern Engineering Research (IJMER) Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm A.T. Rajamanickam, N.P.Subiramaniyam, A.Balamurugan*,

More information

EE482: Digital Signal Processing Applications

EE482: Digital Signal Processing Applications Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu EE482: Digital Signal Processing Applications Spring 2014 TTh 14:30-15:45 CBC C222 Lecture 14 Quiz 04 Review 14/04/07 http://www.ee.unlv.edu/~b1morris/ee482/

More information

Design of Over GIGA bit Wireless LSI systems

Design of Over GIGA bit Wireless LSI systems Design of Over GIGA bit Wireless LSI systems Yoshikazu Miyanaga Hokkaido University Laboratory of Information Communication Networks Graduate School of Information Science and Technology Sapporo 060-0814,

More information

Autonomous Vehicle Speaker Verification System

Autonomous Vehicle Speaker Verification System Autonomous Vehicle Speaker Verification System Functional Requirements List and Performance Specifications Aaron Pfalzgraf Christopher Sullivan Project Advisor: Dr. Jose Sanchez 4 November 2013 AVSVS 2

More information

Performance Analysis of Acoustic Echo Cancellation in Sound Processing

Performance Analysis of Acoustic Echo Cancellation in Sound Processing 2016 IJSRSET Volume 2 Issue 3 Print ISSN : 2395-1990 Online ISSN : 2394-4099 Themed Section: Engineering and Technology Performance Analysis of Acoustic Echo Cancellation in Sound Processing N. Sakthi

More information

Topic. Spectrogram Chromagram Cesptrogram. Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio

Topic. Spectrogram Chromagram Cesptrogram. Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio Topic Spectrogram Chromagram Cesptrogram Short time Fourier Transform Break signal into windows Calculate DFT of each window The Spectrogram spectrogram(y,1024,512,1024,fs,'yaxis'); A series of short term

More information

SIGNAL PROCESSING FOR ROBUST SPEECH RECOGNITION MOTIVATED BY AUDITORY PROCESSING CHANWOO KIM

SIGNAL PROCESSING FOR ROBUST SPEECH RECOGNITION MOTIVATED BY AUDITORY PROCESSING CHANWOO KIM SIGNAL PROCESSING FOR ROBUST SPEECH RECOGNITION MOTIVATED BY AUDITORY PROCESSING CHANWOO KIM MAY 21 ABSTRACT Although automatic speech recognition systems have dramatically improved in recent decades,

More information

Research Article DOA Estimation with Local-Peak-Weighted CSP

Research Article DOA Estimation with Local-Peak-Weighted CSP Hindawi Publishing Corporation EURASIP Journal on Advances in Signal Processing Volume 21, Article ID 38729, 9 pages doi:1.11/21/38729 Research Article DOA Estimation with Local-Peak-Weighted CSP Osamu

More information

The Application of System Generator in Digital Quadrature Direct Up-Conversion

The Application of System Generator in Digital Quadrature Direct Up-Conversion Communications in Information Science and Management Engineering Apr. 2013, Vol. 3 Iss. 4, PP. 192-19 The Application of System Generator in Digital Quadrature Direct Up-Conversion Zhi Chai 1, Jun Shen

More information

SIMULATION VOICE RECOGNITION SYSTEM FOR CONTROLING ROBOTIC APPLICATIONS

SIMULATION VOICE RECOGNITION SYSTEM FOR CONTROLING ROBOTIC APPLICATIONS SIMULATION VOICE RECOGNITION SYSTEM FOR CONTROLING ROBOTIC APPLICATIONS 1 WAHYU KUSUMA R., 2 PRINCE BRAVE GUHYAPATI V 1 Computer Laboratory Staff., Department of Information Systems, Gunadarma University,

More information

Audio Similarity. Mark Zadel MUMT 611 March 8, Audio Similarity p.1/23

Audio Similarity. Mark Zadel MUMT 611 March 8, Audio Similarity p.1/23 Audio Similarity Mark Zadel MUMT 611 March 8, 2004 Audio Similarity p.1/23 Overview MFCCs Foote Content-Based Retrieval of Music and Audio (1997) Logan, Salomon A Music Similarity Function Based On Signal

More information

MFCC AND GMM BASED TAMIL LANGUAGE SPEAKER IDENTIFICATION SYSTEM

MFCC AND GMM BASED TAMIL LANGUAGE SPEAKER IDENTIFICATION SYSTEM www.advancejournals.org Open Access Scientific Publisher MFCC AND GMM BASED TAMIL LANGUAGE SPEAKER IDENTIFICATION SYSTEM ABSTRACT- P. Santhiya 1, T. Jayasankar 1 1 AUT (BIT campus), Tiruchirappalli, India

More information

A VLSI Design of a Tomlinson-Harashima Precoder for MU-MIMO Systems Using Arrayed Pipelined Processing

A VLSI Design of a Tomlinson-Harashima Precoder for MU-MIMO Systems Using Arrayed Pipelined Processing 2114 IEICE TRANS. FUNDAMENTALS, VOL.E96 A, NO.11 NOVEMBER 2013 PAPER Special Section on Smart Multimedia & Communication Systems A VLSI Design of a Tomlinson-Harashima Precoder for MU-MIMO Systems Using

More information

A Simple Two-Microphone Array Devoted to Speech Enhancement and Source Tracking

A Simple Two-Microphone Array Devoted to Speech Enhancement and Source Tracking A Simple Two-Microphone Array Devoted to Speech Enhancement and Source Tracking A. Álvarez, P. Gómez, R. Martínez and, V. Nieto Departamento de Arquitectura y Tecnología de Sistemas Informáticos Universidad

More information