Robust Speech Recognition and its ROBOT implementation
1 Robust Speech Recognition and its ROBOT implementation Yoshikazu Miyanaga Hokkaido University
2 Conditions for Speech Recognition. Speech: short isolated speech (words, phrases; <2 s) or continuous speech (sentences; >2 s). Microphone: attached (several cm to 10 cm), remote (10 cm to 5 m), or long-distance (>5 m). Environment: silent room (>20 dB), living room (20 to 10 dB), or noisy room such as an exhibition (<10 dB).
3 Conventional ASR. (a) Continuous speech (>2 s), attached mic (<10 cm), silent room (>20 dB). (b) Short isolated speech (<2 s), attached mic (<10 cm), living room (20 to 10 dB). (c) With an array microphone: short isolated speech (<2 s), attached or remote mic (<5 m), living room (20 to 10 dB).
4 Hokkaido University Speech Communication System (HU-SCS). Speech: short isolated speech (words, phrases; <2 s). Microphone: attached (several cm to 10 cm), remote (10 cm to 5 m), or long-distance (>5 m). Environment: silent room (>20 dB), living room (20 to 10 dB), or noisy room such as an exhibition (<10 dB).
5 HU-SCS Automatic Speech Detection
6 HU-SCS Automatic Speech Detection. Current technology: 97% (SNR 10 dB), using wavelet-based non-linear processing. Reference: S.-H. Chen, H.-T. Wu, Y. Chang and T. K. Truong, "Robust voice activity detection using perceptual wavelet-packet transform and Teager energy operator," Pattern Recognition Letters (2007).
7 HU-SCS Automatic Speech Detection. HU-SCS v4: 99% over SNR 10 dB, using BP + threshold operation and F0 detection.
8 HU-SCS Automatic Speech Recognition Candidates of Recognition Results (1) Good Morning (2) See you (3) How are you?
9 HU-SCS Automatic Speech Recognition. Current technology: 71% (SNR 10 dB), 97.4% (SNR 20 dB), using spectral subtraction, RASTA and CMS, with a priori information. Candidates of recognition results: (1) Good Morning (2) See you (3) How are you?
10 HU-SCS Automatic Speech Recognition. HU-SCS v4: 95.3% (SNR 10 dB), 98.3% (SNR 20 dB), using RSF/DRA with no a priori information. Candidates of recognition results: (1) Good Morning (2) See you (3) How are you?
11 HU-SCS Automatic Speech Rejection. Candidates of recognition results: (1) Good Morning (2) See you (3) How are you? Recognition result: Good Morning.
12 HU-SCS Automatic Speech Rejection. Current technology: 90%, using confidence scoring. Reference: T. J. Hazen, S. Seneff and J. Polifroni, "Recognition confidence scoring and its use in speech understanding systems," Computer Speech and Language (2002). Candidates of recognition results: (1) Good Morning (2) See you (3) How are you? Recognition result: Good Morning.
13 HU-SCS Automatic Speech Rejection. HU-SCS v4: dependent GMM by weighted HMM (90% accuracy); AI (artificial intelligence). Candidates of recognition results: (1) Good Morning (2) See you (3) How are you? Recognition result: Good Morning.
14 HU-SCS: First SCS HW LSI IP, for mobile devices, intelligent consumer electronics, etc. Blocks: Automatic Speech Detection, Automatic Speech Recognition, Automatic Speech Rejection, HW with Low Power. Advantages: (1) mobile applications: small, low power; (2) PC-free. Super low-power consumption design. Real-time SCS: 180 nsec/word recognition time (10 MHz clock). Small-scale design with specially designed LSI. Noise reduction by array microphone.
15 HU-SCS Automatic Speech Detection HW with Low Power Automatic Speech Recognition Automatic Speech Rejection
16 Running Spectrum Domain Waveform Mel-Spectra t t-6
17 BP and Threshold OP End Point Start Point
18 Switch-Less Recognition System by Automatic Detection. Hands-free operation/control: speech detection, then speech recognition, then operation/control. Timeline: silent interval, recognition start, recognition end, silent interval.
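The start-point and end-point detection on slides 17 and 18 can be illustrated with a minimal sketch. This is not the HU-SCS detector (the slides only name "BP and Threshold OP" and F0 detection); it is a plain frame-energy threshold, and the function name, frame length, number of leading noise frames and energy ratio are all assumptions chosen for illustration. It assumes the recording begins with a silent interval, as in the timeline above.

```python
import numpy as np

def detect_endpoints(signal, frame_len=160, noise_frames=5, ratio=4.0):
    """Return (start_sample, end_sample) of detected speech, or None.

    Hypothetical sketch: a frame counts as speech when its mean energy
    exceeds `ratio` times the noise floor estimated from the first
    `noise_frames` frames (assumed to be a silent interval).
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    if n_frames <= noise_frames:
        return None
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    energy = np.mean(frames ** 2, axis=1)            # per-frame energy
    noise_floor = np.mean(energy[:noise_frames]) + 1e-12
    active = np.flatnonzero(energy > ratio * noise_floor)
    if active.size == 0:
        return None                                   # nothing but silence
    start = int(active[0]) * frame_len                # start point
    end = (int(active[-1]) + 1) * frame_len           # end point
    return start, end
```

A recognizer built this way needs no push-to-talk switch: recognition runs only on the samples between the returned start and end points.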
19 HU-SCS Automatic Speech Detection HW with Low Power Automatic Speech Recognition Automatic Speech Rejection
20 Speech Analysis and Robust Processing. Speech analysis: LPC cepstrum, mel-frequency cepstrum. Robust processing: various techniques have been proposed, including spectral subtraction, Wiener filtering and microphone arrays. RSF/DRA (Running Spectrum Filtering / Dynamic Range Adjustment) filters and normalizes the cepstral vectors.
21 Procedure of Mel-Frequency Cepstrum. Speech signal $x(t)$; cut into short-time frames $x_f(n, t_s)$; discrete Fourier transform (DFT), giving $X(n, f)$; filterbank on the mel-frequency scale, giving $X_s(n, f_m)$; logarithm, giving $\log X_s(n, f_m)$; discrete cosine transform (DCT), giving the cepstral coefficients $C(n, k)$. Here $n$ is the frame index and $k$ is the cepstral order.
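The procedure on this slide can be sketched in NumPy as follows. This is a minimal illustration, not the HU-SCS circuit: the sample rate, frame length, hop size, number of mel filters, number of cepstral coefficients, the Hamming window and the common 2595·log10(1 + f/700) mel formula are all assumptions, and the helper names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters equally spaced on the mel scale."""
    hz_edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0),
                                     n_filters + 2))
    bin_freqs = np.arange(n_fft // 2 + 1) * sample_rate / n_fft
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(n_filters):
        lo, mid, hi = hz_edges[m], hz_edges[m + 1], hz_edges[m + 2]
        rising = (bin_freqs - lo) / (mid - lo)
        falling = (hi - bin_freqs) / (hi - mid)
        fbank[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fbank

def mfcc(x, sample_rate=8000, frame_len=256, hop=128, n_filters=24, n_ceps=12):
    """Return C(n, k) with shape (n_frames, n_ceps); needs len(x) >= frame_len."""
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[idx] * np.hamming(frame_len)              # short-time frames
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2     # DFT -> X(n, f)
    mel_energy = power @ mel_filterbank(n_filters, frame_len, sample_rate).T
    log_mel = np.log(mel_energy + 1e-10)                 # log X_s(n, f_m)
    m = np.arange(n_filters)
    k = np.arange(n_ceps)[:, None]
    dct = np.cos(np.pi * k * (m + 0.5) / n_filters)      # DCT-II basis
    return log_mel @ dct.T                               # C(n, k)
```

Each row of the result is one frame's cepstral vector; the columns are the time trajectories that the later RSF and DRA steps operate on.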
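22 Noise Modeling. A spectrum including noise can be modeled as $X(n,\omega) = S(n,\omega)\,H(\omega) + A(\omega)$, where $S(n,\omega)$ is the clean spectrum, $H(\omega)$ the multiplicative noise and $A(\omega)$ the additive noise. Writing $E(n,\omega) = S(n,\omega)\,H(\omega)$, the log spectrum is $\log X(n,\omega) = \log\bigl(E(n,\omega) + A(\omega)\bigr)$.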
23 Noise Corruption in Power Spectrum. Noise corruption changes both the gain and the DC component. Figure: power spectrum of clean speech, E(n,ω), and of noisy speech, E(n,ω)+A (white noise at 10 dB SNR).
24 Noise Corruption in Log Power Spectrum. Noise corruption changes both the gain and the DC component. Figure: log-power spectrum of clean speech, E(n,ω), and of noisy speech, E(n,ω)+A (white noise at 10 dB SNR), with the DC component and gain differences marked.
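A small numeric example may help show why additive noise alters both the gain and the DC component of the log-power trajectory. The values of E and A below are invented for illustration only; they are not taken from the slides.

```python
import numpy as np

# Hypothetical clean power trajectory E(n, w) at one frequency, over four frames.
E = np.array([1000.0, 100.0, 10.0, 1.0])
A = 10.0   # assumed constant additive noise power

clean_log = np.log10(E)        # log E(n, w)
noisy_log = np.log10(E + A)    # log(E(n, w) + A)

# "Gain": the dynamic range of the trajectory shrinks under additive noise.
clean_range = clean_log.max() - clean_log.min()
noisy_range = noisy_log.max() - noisy_log.min()

# "DC component": the mean level of the trajectory rises.
clean_mean = clean_log.mean()
noisy_mean = noisy_log.mean()
```

With these numbers the clean range is 3 decades while the noisy range is just under 2, and the mean level rises; the low-energy frames are the ones most affected.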
25 Running Spectrum. The running spectrum is obtained by accumulating short-time spectra (DFT) over frames; it gives the time trajectory of each frequency component. Figure axes: frequency versus frame number.
26 Spectral Subtraction. Estimate the noise spectrum from the short-time spectra of the first several frames, then subtract the estimate from each short-time spectrum. Figure: running spectrum of noisy speech (white noise at 5 dB SNR) before and after subtraction.
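The two steps on this slide can be sketched as below. The function name, the number of leading noise-only frames, the over-subtraction factor and the spectral floor are assumptions for illustration; the floor is a common safeguard against negative power values and is not mentioned on the slide.

```python
import numpy as np

def spectral_subtraction(power_spec, noise_frames=5, over_sub=1.0, floor=0.01):
    """power_spec: array (n_frames, n_bins) of short-time power spectra.

    Assumes the first `noise_frames` frames contain noise only.
    """
    power_spec = np.asarray(power_spec, dtype=float)
    # Step 1: estimate the noise spectrum from the first several frames.
    noise_est = power_spec[:noise_frames].mean(axis=0)
    # Step 2: subtract the estimate from every short-time spectrum.
    cleaned = power_spec - over_sub * noise_est
    # Keep a small fraction of the original power instead of going negative.
    return np.maximum(cleaned, floor * power_spec)
```

Raising `over_sub` removes more noise but also more speech; this is the kind of parameter that is hard to tune across environments.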
27 Noise Reduction Techniques. Conventional method: spectral subtraction. Its parameters are not optimized for speech from various environments, and excessive subtraction may cause musical noise. Robust speech feature extraction: advanced speech analysis using RSF (running spectrum filtering) and DRA (dynamic range adjustment).
28 Modulation Spectrum. RSF focuses on the modulation spectrum: the spectrum of the time trajectory of each frequency component, obtained by applying a DFT along the frame axis at each frequency of the running spectrum. Figure axes: running spectrum (frequency versus frame number) and modulation spectrum (frequency versus modulation frequency).
29 Modulation Spectrum of Clean and Noisy Speech. Speech components are dominant around 4 Hz in the modulation spectrum. Lower modulation-frequency components can be regarded as noise, because noise components change little over time. Figure: clean versus noisy speech (white noise at 5 dB SNR).
30 RSF (Running Spectrum Filtering). Speech components are dominant around 4 Hz in the modulation spectrum. Figure: modulation spectrum (frequency [Hz] versus modulation frequency [Hz]) with the noise components, the speech components and the unnecessary part marked.
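The modulation spectrum described on slides 28 and 29 can be computed with one DFT along the frame axis. The sketch below assumes the running spectrum is stored as an array of shape (frames, frequency bins) and that the frame rate is known; the function name and argument names are hypothetical.

```python
import numpy as np

def modulation_spectrum(running_spec, frame_rate):
    """running_spec: (n_frames, n_bins); frame_rate in frames per second.

    Returns (mod_freqs_hz, magnitude), where magnitude has shape
    (n_mod_freqs, n_bins): one modulation spectrum per frequency bin.
    """
    running_spec = np.asarray(running_spec, dtype=float)
    magnitude = np.abs(np.fft.rfft(running_spec, axis=0))   # DFT over frames
    mod_freqs = np.fft.rfftfreq(running_spec.shape[0], d=1.0 / frame_rate)
    return mod_freqs, magnitude
```

A component that stays constant over time, such as stationary noise, lands entirely at 0 Hz modulation frequency, while a trajectory fluctuating four times per second peaks at 4 Hz.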
31 RSF (Running Spectrum Filtering) enhances perceptual auditory components and relatively decreases noise components by band-pass filtering the cepstral sequences: $\tilde{C}(n,k) = \sum_{i=0}^{Q} h(i)\, C(n-i,k)$, where $h(i)$ are the coefficients of an FIR filter. Figure: modulation-frequency response of RSF (FIR) compared with RASTA (IIR).
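The FIR filtering of slide 31 can be sketched as follows. The slides do not give the filter coefficients h(i), so the band-pass design here (a Hamming-windowed sinc with a 1 to 15 Hz passband, then forced to zero DC gain) is purely an assumption for illustration, as are the function names.

```python
import numpy as np

def bandpass_fir(n_taps, low_hz, high_hz, frame_rate):
    """Hypothetical band-pass design over modulation frequency."""
    n = np.arange(n_taps) - (n_taps - 1) / 2.0

    def lowpass(cut_hz):
        fc = cut_hz / frame_rate                 # cutoff in cycles per frame
        return 2.0 * fc * np.sinc(2.0 * fc * n)

    h = (lowpass(high_hz) - lowpass(low_hz)) * np.hamming(n_taps)
    return h - h.mean()                          # force exactly zero gain at 0 Hz

def rsf(cepstra, h):
    """C~(n, k) = sum_{i=0}^{Q} h(i) C(n - i, k); cepstra is (n_frames, n_ceps)."""
    cepstra = np.asarray(cepstra, dtype=float)
    n_frames = len(cepstra)
    out = np.zeros_like(cepstra)
    for i in range(min(len(h), n_frames)):
        out[i:] += h[i] * cepstra[:n_frames - i]  # delayed copy weighted by h(i)
    return out
```

With a 100 frames-per-second feature stream, `rsf(cepstra, bandpass_fir(101, 1.0, 15.0, 100.0))` removes constant offsets in each cepstral trajectory while passing fluctuations near 4 Hz almost unchanged.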
32 DRA (Dynamic Range Adjustment) normalizes the amplitude of the cepstral vectors in the time domain, using the maximum value during the utterance, and suppresses the dynamic range distortion caused by additive noise: $\hat{C}(n,k) = \tilde{C}(n,k) \,/\, \max_{1 \le j \le T} |\tilde{C}(j,k)|$, where $T$ is the number of frames in the utterance.
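The normalization on this slide amounts to dividing each cepstral trajectory by its largest absolute value over the utterance. A minimal sketch, assuming the cepstra are stored as (frames, cepstral orders) and using a hypothetical function name:

```python
import numpy as np

def dra(cepstra):
    """Scale each cepstral order so its peak absolute value over time is 1.

    cepstra: (n_frames, n_ceps) array, e.g. the output of an RSF stage.
    """
    cepstra = np.asarray(cepstra, dtype=float)
    peak = np.max(np.abs(cepstra), axis=0, keepdims=True)   # max over frames
    return cepstra / np.maximum(peak, 1e-12)                # guard all-zero columns
```

Because the maximum is taken over the whole utterance, this step can only be applied once the end point of the utterance is known.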
33 RSF / DRA Comparison in cepstral time-trajectories at 4th order Clean Noisy Baseline RSF/DRA processing
34 HU-SCS Automatic Speech Detection HW with Low Power Automatic Speech Recognition Automatic Speech Rejection
35 Likelihoods of HMM. Each HMM state holds GMM parameters (average and variance); the model is an approximation by many multidimensional Gaussian distributions.
36 Evaluation on Likelihoods. The MFCC sequence is fed into each HMM, giving likelihoods p1, p2, p3, ... (one per HMM). The maximum likelihood is selected and its label is taken as the recognition result. The result is correct, isn't it?
37 Evaluation of Reliability. Figure: two likelihood distributions; in one the top-scoring result is trusted, in the other the top-scoring result is NOT trusted.
38 Rejection Method using Multiple Criteria. Criteria: tendency, ratio, maximum score, square of ratio. MFCC cluster group; evaluation of cluster. New type of speech rejection for noisy conditions.
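Slides 36 to 38 can be illustrated with a simple sketch that picks the top-scoring word and rejects it when the evidence is weak. This is not the HU-SCS rejection method, which the slides only outline; here just two of the named criteria are approximated (the ratio of the top two likelihoods, expressed as a log-likelihood margin, and the maximum score), and the function name and thresholds are assumptions.

```python
import numpy as np

def recognize_with_rejection(log_likelihoods, labels,
                             min_margin=5.0, min_score=-np.inf):
    """Return the best label, or None if the result should be rejected.

    log_likelihoods: one log score per word HMM (at least two entries).
    min_margin: hypothetical threshold on log(p_best / p_second).
    min_score: hypothetical threshold on the maximum score itself.
    """
    scores = np.asarray(log_likelihoods, dtype=float)
    order = np.argsort(scores)[::-1]            # best first
    best, second = order[0], order[1]
    margin = scores[best] - scores[second]      # log of the likelihood ratio
    if margin < min_margin or scores[best] < min_score:
        return None                             # top score is NOT trusted
    return labels[best]                         # top score is trusted
```

A clear winner is accepted, while two nearly equal top scores lead to rejection, matching the two cases pictured on slide 37.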
39 HU-SCS Automatic Speech Detection HW with Low Power Automatic Speech Recognition Automatic Speech Rejection
40 Overview of ASR System. Current ASR systems adopt robust processing that removes the influence of noise distortions. Pipeline: speech data; speech analysis (convert to spectrum or cepstrum); robust processing (decrease noise distortions); speech feature vectors; speech recognition (calculate probability, i.e. likelihood scores, against reference models); results. Reference models: reference patterns prepared by speech training.
41 Circuit Structure of Complete Recognition System Speech Signal Robust Processing SRAM Speech Recognition Data Control System Control External Memory (SRAM) from/to Processor Speech Analysis SRAM SRAM
42 Circuit Implementations Required Operating Performance Speech Analysis 10 MIPS Robust Processing 500 MIPS (mainly in FIR) FFT IDCT FIR Log Divider 8 Buffer Buffer Cos/Sin ROM Speech Data 256*16 bits ROM RAM 512*24 bits Speech Analysis (MFCC) 4096*16 bits RAM 256*24 bits Robust Processing (RSF/DRA) Feature Vectors
43 Block Diagram. Interfaces: microprocessor, external RAM, and master/slave. Blocks and signals: MPU interface, master bus, slave bus, bus control, interrupt signal, CLK, RESET, SW, chip select, address, data control, system control, SRAM interface, MFCC with SRAM, RSF/DRA with SRAM (filter coefficients for RSF), HMM with SRAM. Notes on the diagram: working memory for MFCC and RSF; feature parameters before speech detection. All rights reserved. Copyright Yoshikazu Miyanaga.
44 New Scalable Architectures. Two types of scalable techniques are applied to the system. (1) Multiple processing elements (PEs) in the HMM circuit: the PEs enable high-speed processing and improve recognition performance. (2) Master/slave operation in the complete system: this operation enables high-speed processing and increases the vocabulary size.
45 HMM (Hidden Markov Models). A statistical modeling approach using a Markov chain; powerful for expressing time-varying data sequences and robust to speaker differences. Set of states: $q_n$ ($1 \le n \le N$). State transition probability: $a_{ij}$. Output probability: $b_n(k)$, i.e. $b_1(k), b_2(k), \dots, b_N(k)$. Figure: left-to-right model with states $q_1, \dots, q_4$, self-transitions $a_{11}, a_{22}, a_{33}, a_{44}$ and forward transitions $a_{12}, a_{23}, a_{34}, a_{45}$.
46 Full-Parallel Computations in HMM. The output probabilities and temporal scores can be computed concurrently for all HMM states. Figure: for input o_t, each state has an output-probability calculation followed by a score calculation; a select-max stage gives max(δ) and the path for the upper state.
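The per-state parallelism on slide 46 corresponds to the Viterbi recursion, in which every state's score at time t depends only on the scores at time t-1. The sketch below updates all states in one vectorized step, as a software analogue of the parallel hardware; it is not the circuit itself. It works in the log domain and assumes the path starts in the first state and ends in the last; the function name and array layout are assumptions.

```python
import numpy as np

def viterbi_score(log_a, log_b):
    """Best-path log score of one word HMM.

    log_a: (N, N) log transition probabilities, log_a[i, j] = log a_ij.
    log_b: (T, N) log output probabilities, log_b[t, j] = log b_j(o_t).
    """
    n_states = log_a.shape[0]
    delta = np.full(n_states, -np.inf)
    delta[0] = log_b[0, 0]                       # path must start in state 1
    for t in range(1, log_b.shape[0]):
        # All N states at once: delta_j = max_i(delta_i + log a_ij) + log b_j(o_t)
        delta = np.max(delta[:, None] + log_a, axis=0) + log_b[t]
    return delta[-1]                             # path must end in state N
```

In a word recognizer this score would be computed for every word model, and the word with the highest score becomes the candidate result.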
47 Master/Slave Operation. Steps: (1) set reference data; (2) speech analysis and robust processing; (3) broadcast; (4) speech recognition; (5) gather results. Nodes: microprocessor, RAM, Master, Slave1, Slave2, Slave3.
48 Master/Slave Operation. Steps: (1) set reference data; (2) speech analysis and robust processing; (3) broadcast; (4) speech recognition; (5) gather results. Nodes: microprocessor, RAM, Master [4], Slave1 [3], Slave2 [2], Slave3 [1].
49 Master/Slave Operation. Steps: (1) set reference data; (2) speech analysis and robust processing; (3) broadcast; (4) speech recognition; (5) gather results. Nodes: microprocessor, RAM, Master, Slave1, Slave2, Slave3.
50 Master/Slave Operation. Steps: (1) set reference data; (2) speech analysis and robust processing; (3) broadcast; (4) speech recognition; (5) gather results. Nodes: microprocessor, RAM, Master [1], Slave1, Slave2, Slave3 [2].
51 Master/Slave Operation (2). Steps: (1) set reference data; (2) speech analysis and robust processing; (3) broadcast; (4) speech recognition; (5) gather results. Nodes: microprocessor, RAM, Master, Slave1, Slave2, Slave3.
52 Master/Slave Operation (2). Steps: (1) set reference data; (2) speech analysis and robust processing; (3) broadcast; (4) speech recognition; (5) gather results. Nodes: microprocessor, RAM, Master [4], Slave1 [3], Slave2 [2], Slave3 [1].
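The five steps of the master/slave operation can be sketched as a sequential simulation. This is only an illustration of how splitting the vocabulary across nodes lets the word count grow; the real system runs the nodes concurrently in hardware. The function names, the round-robin split and the `score_fn` callback are hypothetical.

```python
def split_vocabulary(words, n_nodes):
    """(1) Set reference data: give each node its own share of the word models."""
    return [words[i::n_nodes] for i in range(n_nodes)]

def master_slave_recognize(features, words, score_fn, n_nodes=4):
    """Simulate one master plus (n_nodes - 1) slaves, one node after another.

    features: result of (2) speech analysis and robust processing, computed once.
    score_fn(features, word): hypothetical scoring callback, higher is better.
    """
    shards = split_vocabulary(words, n_nodes)
    partial = []
    for shard in shards:
        # (3) Broadcast: every node receives the same feature vectors.
        # (4) Speech recognition: each node scores only its own words.
        if shard:
            partial.append(max((score_fn(features, w), w) for w in shard))
    # (5) Gather results: keep the best candidate over all nodes.
    best_score, best_word = max(partial)
    return best_word, best_score
```

Feature extraction happens once on the master, so adding slaves increases the vocabulary that can be searched in the same time without repeating the analysis stage.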
53 Circuit Design (Analysis & HMM TEG). Technology: Rohm CMOS 0.35 μm, Univ. of Tokyo EXD standard cell library. Supply voltage: 3.3 V. RTL-level design: Verilog-HDL. Evaluation (V2, layout view): clock frequency (MHz), processing time (ms/word), power consumption (mW).
54 Comparison on Power Consumption: proposed HW (10 MHz) versus DSP design (80 MIPS). Systems: DSP-based system (TMS320C549, 80 MIPS) and proposed system (dedicated processor, 10 MHz). Compared items: memory access time (ns); processor power (mW; core: 3.3 V); memory power (mW; SRAM, core: 3.3 V); total.
55 Processing Time of HU-SCS: comparison with software design. 54 times faster; no high-speed clock is needed, which is useful for low-power design. Systems: proposed system (hardware) and Pentium 4 (software). Compared items: number of arithmetic units; number of cycles (455,200 for the proposed system; none given for software); frequency (MHz); recognition processing time (ms).
56 Design by Standard Cells. TSMC 0.25 µm CMOS standard cell; voltage 2.5 V; highest clock rate 80.6 MHz (12.4 ns; temperature condition: typical). Values for 32 and 8 parallel processing units: HMM 491, ,980; RSF/DRA 11,910; MFCC 39,670; system control 18,310; bus control 1,310; SRAM 63,400; total 626, ,580.
57 Current HU-SCS. PC interface with HU-SCS board. HU-SCS board: 55 mm × 44 mm.
58 Overview of Current HU-SCS. Improved noise robustness: accurate ASR under SNR 0-10 dB; robustness against echo. Improved speech recognition: higher accuracy in MFCC calculation; low-power design and higher-speed processing. Improved total HW system: higher-speed response time.
59 Comparison on Performance (correctness, current versus previous)
Environment | Noise level | Current | Previous
Meeting Room | 50 dB | 96.4% | 90.0%
Elevator | 50 dB | 95.0% | 84.4%
Stairs | 45 dB | 85.1% | 50.5%
Car A (idling, not moving) | 50 dB | 99.4% | 95.6%
Car B (high speed, open window) | 75 dB | 93.3% | 85.0%
Car C (high speed, audio on (FM)) | 75 dB | 88.9% | 65.6%
Total | | 93.0% | 78.5%
Cruiser Board (outside, high speed) | 80 dB | 82.7% | -
Figure: bar chart of correctness, previous versus current (comparison between HU-SCS v4 and an earlier version).
60 Results on Some Distances. Figure: six panels (Car A, Car B, Car C, Meeting Room, Elevator, Stair), each plotting a percentage against microphone distances of 30 cm, 60 cm and 90 cm.
61 Robot Implementation Speech Recognition & Synthesis Quick Response Control to Consumer Electronics and Machines
62 Communications and Controls
63 Summary. Hokkaido University Speech Communication System: an integrated architecture of speech detection, robust speech analysis, speech recognition and speech rejection. Higher-speed processing than DSP and software. More energy-efficient than DSP solutions. Noise robustness improved by the RSF/DRA technique. Small, fast and low power.
64 Who? Yoshikazu Miyanaga He received the B.S., M.S., and Dr. Eng. degrees from Hokkaido University, Sapporo, Japan, in 1979, 1981, and 1986, respectively. He is currently a Professor at Graduate School of Information Science and Technology, Hokkaido University. His research interests are in the areas of signal processing for wireless communications, nonlinear signal processing and low-power LSI systems. He was a chair of Technical Group on Smart Info-Media System, IEICE. He is an advisory member of this technical group. Currently, he is IEICE fellow. He served as a member in the board of directors, IEEE Japan Council as a chair of student activity committee from 2002 to He is a chair of student activity committee in IEEE Sapporo Section from He is a chair of IEEE Circuits and Systems Society, Digital Signal Processing Technical Committee from He has been serving as international steering committee chairs/members of IEEE ISPACS, IEEE ISCIT, IEEE/EURASIP NSIP and honorary/general chairs/co-chairs of their international symposiums/workshops, i.e., ISPACS 2003, ISCIT 2004, ISCIT 2005, NSIP 2005, ISPACS 2008, ISMAC 2009 and APSIPA ASC He also served as international organizing committee chairs of IEICE ITC-CSCC , IEEE MSCAS 2004, IEEE ISCAS
Introduction to Artificial Intelligence Announcements V22.0472-001 Fall 2009 Lecture 19: Speech Recognition & Viterbi Decoding Rob Fergus Dept of Computer Science, Courant Institute, NYU Slides from John
More informationOnline Blind Channel Normalization Using BPF-Based Modulation Frequency Filtering
Online Blind Channel Normalization Using BPF-Based Modulation Frequency Filtering Yun-Kyung Lee, o-young Jung, and Jeon Gue Par We propose a new bandpass filter (BPF)-based online channel normalization
More informationApplications of Music Processing
Lecture Music Processing Applications of Music Processing Christian Dittmar International Audio Laboratories Erlangen christian.dittmar@audiolabs-erlangen.de Singing Voice Detection Important pre-requisite
More informationAdaptive Filters Application of Linear Prediction
Adaptive Filters Application of Linear Prediction Gerhard Schmidt Christian-Albrechts-Universität zu Kiel Faculty of Engineering Electrical Engineering and Information Technology Digital Signal Processing
More informationMikko Myllymäki and Tuomas Virtanen
NON-STATIONARY NOISE MODEL COMPENSATION IN VOICE ACTIVITY DETECTION Mikko Myllymäki and Tuomas Virtanen Department of Signal Processing, Tampere University of Technology Korkeakoulunkatu 1, 3370, Tampere,
More informationModulation Spectrum Power-law Expansion for Robust Speech Recognition
Modulation Spectrum Power-law Expansion for Robust Speech Recognition Hao-Teng Fan, Zi-Hao Ye and Jeih-weih Hung Department of Electrical Engineering, National Chi Nan University, Nantou, Taiwan E-mail:
More informationVLSI Implementation of Digital Down Converter (DDC)
Volume-7, Issue-1, January-February 2017 International Journal of Engineering and Management Research Page Number: 218-222 VLSI Implementation of Digital Down Converter (DDC) Shaik Afrojanasima 1, K Vijaya
More informationSONG RETRIEVAL SYSTEM USING HIDDEN MARKOV MODELS
SONG RETRIEVAL SYSTEM USING HIDDEN MARKOV MODELS AKSHAY CHANDRASHEKARAN ANOOP RAMAKRISHNA akshayc@cmu.edu anoopr@andrew.cmu.edu ABHISHEK JAIN GE YANG ajain2@andrew.cmu.edu younger@cmu.edu NIDHI KOHLI R
More informationPerformance study of Text-independent Speaker identification system using MFCC & IMFCC for Telephone and Microphone Speeches
Performance study of Text-independent Speaker identification system using & I for Telephone and Microphone Speeches Ruchi Chaudhary, National Technical Research Organization Abstract: A state-of-the-art
More informationSOUND SOURCE RECOGNITION AND MODELING
SOUND SOURCE RECOGNITION AND MODELING CASA seminar, summer 2000 Antti Eronen antti.eronen@tut.fi Contents: Basics of human sound source recognition Timbre Voice recognition Recognition of environmental
More informationPower Normalized Cepstral Coefficient for Speaker Diarization and Acoustic Echo Cancellation
Power Normalized Cepstral Coefficient for Speaker Diarization and Acoustic Echo Cancellation Sherbin Kanattil Kassim P.G Scholar, Department of ECE, Engineering College, Edathala, Ernakulam, India sherbin_kassim@yahoo.co.in
More informationReduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter
Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter Ching-Ta Lu, Kun-Fu Tseng 2, Chih-Tsung Chen 2 Department of Information Communication, Asia University, Taichung, Taiwan, ROC
More informationSignal Processing for Speech Applications - Part 2-1. Signal Processing For Speech Applications - Part 2
Signal Processing for Speech Applications - Part 2-1 Signal Processing For Speech Applications - Part 2 May 14, 2013 Signal Processing for Speech Applications - Part 2-2 References Huang et al., Chapter
More informationA CONSTRUCTION OF COMPACT MFCC-TYPE FEATURES USING SHORT-TIME STATISTICS FOR APPLICATIONS IN AUDIO SEGMENTATION
17th European Signal Processing Conference (EUSIPCO 2009) Glasgow, Scotland, August 24-28, 2009 A CONSTRUCTION OF COMPACT MFCC-TYPE FEATURES USING SHORT-TIME STATISTICS FOR APPLICATIONS IN AUDIO SEGMENTATION
More informationIMPROVEMENTS TO THE IBM SPEECH ACTIVITY DETECTION SYSTEM FOR THE DARPA RATS PROGRAM
IMPROVEMENTS TO THE IBM SPEECH ACTIVITY DETECTION SYSTEM FOR THE DARPA RATS PROGRAM Samuel Thomas 1, George Saon 1, Maarten Van Segbroeck 2 and Shrikanth S. Narayanan 2 1 IBM T.J. Watson Research Center,
More informationOptimal Adaptive Filtering Technique for Tamil Speech Enhancement
Optimal Adaptive Filtering Technique for Tamil Speech Enhancement Vimala.C Project Fellow, Department of Computer Science Avinashilingam Institute for Home Science and Higher Education and Women Coimbatore,
More informationImplementation of SYMLET Wavelets to Removal of Gaussian Additive Noise from Speech Signal
Implementation of SYMLET Wavelets to Removal of Gaussian Additive Noise from Speech Signal Abstract: MAHESH S. CHAVAN, * NIKOS MASTORAKIS, MANJUSHA N. CHAVAN, *** M.S. GAIKWAD Department of Electronics
More informationIMPLEMENTATION OF SPEECH RECOGNITION SYSTEM USING DSP PROCESSOR ADSP2181
IMPLEMENTATION OF SPEECH RECOGNITION SYSTEM USING DSP PROCESSOR ADSP2181 1 KALPANA JOSHI, 2 NILIMA KOLHARE & 3 V.M.PANDHARIPANDE 1&2 Dept.of Electronics and Telecommunication Engg, Government College of
More informationSPEECH ENHANCEMENT USING A ROBUST KALMAN FILTER POST-PROCESSOR IN THE MODULATION DOMAIN. Yu Wang and Mike Brookes
SPEECH ENHANCEMENT USING A ROBUST KALMAN FILTER POST-PROCESSOR IN THE MODULATION DOMAIN Yu Wang and Mike Brookes Department of Electrical and Electronic Engineering, Exhibition Road, Imperial College London,
More informationMODIFIED DCT BASED SPEECH ENHANCEMENT IN VEHICULAR ENVIRONMENTS
MODIFIED DCT BASED SPEECH ENHANCEMENT IN VEHICULAR ENVIRONMENTS 1 S.PRASANNA VENKATESH, 2 NITIN NARAYAN, 3 K.SAILESH BHARATHWAAJ, 4 M.P.ACTLIN JEEVA, 5 P.VIJAYALAKSHMI 1,2,3,4,5 SSN College of Engineering,
More informationPerformance analysis of voice activity detection algorithm for robust speech recognition system under different noisy environment
BABU et al: VOICE ACTIVITY DETECTION ALGORITHM FOR ROBUST SPEECH RECOGNITION SYSTEM Journal of Scientific & Industrial Research Vol. 69, July 2010, pp. 515-522 515 Performance analysis of voice activity
More informationMOST MODERN automatic speech recognition (ASR)
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 5, NO. 5, SEPTEMBER 1997 451 A Model of Dynamic Auditory Perception and Its Application to Robust Word Recognition Brian Strope and Abeer Alwan, Member,
More information24 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 17, NO. 1, JANUARY /$ IEEE
24 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 17, NO. 1, JANUARY 2009 Speech Enhancement, Gain, and Noise Spectrum Adaptation Using Approximate Bayesian Estimation Jiucang Hao, Hagai
More informationEmanuël A. P. Habets, Jacob Benesty, and Patrick A. Naylor. Presented by Amir Kiperwas
Emanuël A. P. Habets, Jacob Benesty, and Patrick A. Naylor Presented by Amir Kiperwas 1 M-element microphone array One desired source One undesired source Ambient noise field Signals: Broadband Mutually
More informationRobustness (cont.); End-to-end systems
Robustness (cont.); End-to-end systems Steve Renals Automatic Speech Recognition ASR Lecture 18 27 March 2017 ASR Lecture 18 Robustness (cont.); End-to-end systems 1 Robust Speech Recognition ASR Lecture
More informationSpeech Enhancement using Wiener filtering
Speech Enhancement using Wiener filtering S. Chirtmay and M. Tahernezhadi Department of Electrical Engineering Northern Illinois University DeKalb, IL 60115 ABSTRACT The problem of reducing the disturbing
More informationSpeech Enhancement: Reduction of Additive Noise in the Digital Processing of Speech
Speech Enhancement: Reduction of Additive Noise in the Digital Processing of Speech Project Proposal Avner Halevy Department of Mathematics University of Maryland, College Park ahalevy at math.umd.edu
More informationSpeech Enhancement Techniques using Wiener Filter and Subspace Filter
IJSTE - International Journal of Science Technology & Engineering Volume 3 Issue 05 November 2016 ISSN (online): 2349-784X Speech Enhancement Techniques using Wiener Filter and Subspace Filter Ankeeta
More informationAdaptive Speech Enhancement Using Partial Differential Equations and Back Propagation Neural Networks
Australian Journal of Basic and Applied Sciences, 4(7): 2093-2098, 2010 ISSN 1991-8178 Adaptive Speech Enhancement Using Partial Differential Equations and Back Propagation Neural Networks 1 Mojtaba Bandarabadi,
More informationDERIVATION OF TRAPS IN AUDITORY DOMAIN
DERIVATION OF TRAPS IN AUDITORY DOMAIN Petr Motlíček, Doctoral Degree Programme (4) Dept. of Computer Graphics and Multimedia, FIT, BUT E-mail: motlicek@fit.vutbr.cz Supervised by: Dr. Jan Černocký, Prof.
More informationCascaded Noise-Shaping Modulators for Oversampled Data Conversion
Cascaded Noise-Shaping Modulators for Oversampled Data Conversion Bruce A. Wooley Stanford University B. Wooley, Stanford, 2004 1 Outline Oversampling modulators for A/D conversion Cascaded noise-shaping
More informationSpeech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm
International OPEN ACCESS Journal Of Modern Engineering Research (IJMER) Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm A.T. Rajamanickam, N.P.Subiramaniyam, A.Balamurugan*,
More informationEE482: Digital Signal Processing Applications
Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu EE482: Digital Signal Processing Applications Spring 2014 TTh 14:30-15:45 CBC C222 Lecture 14 Quiz 04 Review 14/04/07 http://www.ee.unlv.edu/~b1morris/ee482/
More informationDesign of Over GIGA bit Wireless LSI systems
Design of Over GIGA bit Wireless LSI systems Yoshikazu Miyanaga Hokkaido University Laboratory of Information Communication Networks Graduate School of Information Science and Technology Sapporo 060-0814,
More informationAutonomous Vehicle Speaker Verification System
Autonomous Vehicle Speaker Verification System Functional Requirements List and Performance Specifications Aaron Pfalzgraf Christopher Sullivan Project Advisor: Dr. Jose Sanchez 4 November 2013 AVSVS 2
More informationPerformance Analysis of Acoustic Echo Cancellation in Sound Processing
2016 IJSRSET Volume 2 Issue 3 Print ISSN : 2395-1990 Online ISSN : 2394-4099 Themed Section: Engineering and Technology Performance Analysis of Acoustic Echo Cancellation in Sound Processing N. Sakthi
More informationTopic. Spectrogram Chromagram Cesptrogram. Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Topic Spectrogram Chromagram Cesptrogram Short time Fourier Transform Break signal into windows Calculate DFT of each window The Spectrogram spectrogram(y,1024,512,1024,fs,'yaxis'); A series of short term
More informationSIGNAL PROCESSING FOR ROBUST SPEECH RECOGNITION MOTIVATED BY AUDITORY PROCESSING CHANWOO KIM
SIGNAL PROCESSING FOR ROBUST SPEECH RECOGNITION MOTIVATED BY AUDITORY PROCESSING CHANWOO KIM MAY 21 ABSTRACT Although automatic speech recognition systems have dramatically improved in recent decades,
More informationResearch Article DOA Estimation with Local-Peak-Weighted CSP
Hindawi Publishing Corporation EURASIP Journal on Advances in Signal Processing Volume 21, Article ID 38729, 9 pages doi:1.11/21/38729 Research Article DOA Estimation with Local-Peak-Weighted CSP Osamu
More informationThe Application of System Generator in Digital Quadrature Direct Up-Conversion
Communications in Information Science and Management Engineering Apr. 2013, Vol. 3 Iss. 4, PP. 192-19 The Application of System Generator in Digital Quadrature Direct Up-Conversion Zhi Chai 1, Jun Shen
More informationSIMULATION VOICE RECOGNITION SYSTEM FOR CONTROLING ROBOTIC APPLICATIONS
SIMULATION VOICE RECOGNITION SYSTEM FOR CONTROLING ROBOTIC APPLICATIONS 1 WAHYU KUSUMA R., 2 PRINCE BRAVE GUHYAPATI V 1 Computer Laboratory Staff., Department of Information Systems, Gunadarma University,
More informationAudio Similarity. Mark Zadel MUMT 611 March 8, Audio Similarity p.1/23
Audio Similarity Mark Zadel MUMT 611 March 8, 2004 Audio Similarity p.1/23 Overview MFCCs Foote Content-Based Retrieval of Music and Audio (1997) Logan, Salomon A Music Similarity Function Based On Signal
More informationMFCC AND GMM BASED TAMIL LANGUAGE SPEAKER IDENTIFICATION SYSTEM
www.advancejournals.org Open Access Scientific Publisher MFCC AND GMM BASED TAMIL LANGUAGE SPEAKER IDENTIFICATION SYSTEM ABSTRACT- P. Santhiya 1, T. Jayasankar 1 1 AUT (BIT campus), Tiruchirappalli, India
More informationA VLSI Design of a Tomlinson-Harashima Precoder for MU-MIMO Systems Using Arrayed Pipelined Processing
2114 IEICE TRANS. FUNDAMENTALS, VOL.E96 A, NO.11 NOVEMBER 2013 PAPER Special Section on Smart Multimedia & Communication Systems A VLSI Design of a Tomlinson-Harashima Precoder for MU-MIMO Systems Using
More informationA Simple Two-Microphone Array Devoted to Speech Enhancement and Source Tracking
A Simple Two-Microphone Array Devoted to Speech Enhancement and Source Tracking A. Álvarez, P. Gómez, R. Martínez and, V. Nieto Departamento de Arquitectura y Tecnología de Sistemas Informáticos Universidad
More information