SPEECH PARAMETERIZATION FOR AUTOMATIC SPEECH RECOGNITION IN NOISY CONDITIONS


Bojana Gajić
Department of Telecommunications, Norwegian University of Science and Technology
7491 Trondheim, Norway
gajic@tele.ntnu.no

Kuldip K. Paliwal
School of Microelectronic Engineering, Griffith University
Brisbane, QLD 4111, Australia
K.Paliwal@me.gu.edu.au

ABSTRACT

This paper is concerned with increasing the robustness of automatic speech recognition (ASR) systems against additive background noise by finding speech parameters that are less influenced by changes in the acoustic environment than the conventional ones. Inspired by the good robustness of auditory based speech parameterization methods, we compare the steps involved with those in the conventional methods from the signal processing point of view. The use of dominant spectral frequencies is believed to be an important reason for the superior robustness of the auditory based methods. A new speech parameterization method is described that is conceptually similar to auditory based methods, while retaining the low computational cost of the conventional methods. Evaluation on an ASR task has shown that the new method outperformed the conventional methods in the presence of various background noises.

1. INTRODUCTION

State-of-the-art automatic speech recognition (ASR) systems are capable of achieving very high recognition accuracy when tested in laboratory conditions. However, they usually experience a dramatic decrease in performance when used in real-world applications. One of the main reasons for this behavior is the presence of background noise in the testing environment that has not been observed during system training. This problem becomes especially important for ASR on mobile devices, as the acoustic environment is constantly changing and cannot be accounted for during system training. One way to overcome this problem is to find a speech parameterization that is invariant to changing acoustic environments.
The most commonly used speech parameters are based on the energy information derived from the short-term speech spectrum. However, the dominant spectral frequencies are less influenced by additive noise than the energy information. Thus, it is expected that the robustness of ASR systems could be improved if the dominant spectral frequencies are efficiently incorporated into speech parameter vectors.

The paper is organized as follows. It starts with an overview of ASR systems in Section 2, and describes the robustness problem with possible solutions in Section 3. Section 4 summarizes the main processing steps involved in conventional and auditory based speech parameterization methods and describes a new method that combines the advantages of both classes of methods. An experimental study performed to compare the performance of the different parameterization methods on an ASR task in various acoustic environments is described in Section 5. Finally, the major conclusions are summarized in Section 6.

2. THE ASR SYSTEM

The aim of automatic speech recognition (ASR) is to transform a given spoken utterance into the corresponding transcription. A block diagram of an ASR system is shown in Figure 1. Before the system can be used, it has to learn the characteristic speech patterns from a large speech database with accompanying transcriptions. A set of stochastic models (hidden Markov models) is trained, each corresponding to one speech unit (for example, a phoneme). In addition, a lexicon is prepared to describe how the words are built up from the basic speech units, as well as a language model describing the relationship between words. The models, lexicon and language model are then used to determine the most likely transcription of an incoming spoken utterance.

The speech parameterization block is used to extract from the speech waveform the information relevant for discriminating between different speech sounds. The information is presented as a sequence of parameter vectors.
This paper describes several different approaches to speech parameterization, and compares their performance on an ASR task in various noisy conditions.

Figure 1: Block diagram of an ASR system

3. THE ROBUSTNESS PROBLEM

Robustness of an ASR system is the system's ability to successfully deal with different aspects of variability in the speech signal. Some of the common variabilities that occur in speech signals are listed below:

- Pronunciation variations between speakers, depending on the speakers' voice characteristics, dialect, social class, etc.
- Pronunciation variations for a given speaker, depending on mood, emotions, context, etc.
- Variations in the acoustic environment.
- Variations in the transmission channel.

A number of techniques have been proposed to increase the robustness of ASR systems. Nevertheless, the lack of robustness still remains a major obstacle to the reliable use of ASR technology in many real-world applications.

As mobile hand-held terminals become more common, robustness against variations in the acoustic environment becomes increasingly important. State-of-the-art ASR systems experience a dramatic performance degradation when the acoustic environment differs from the one observed in training. In the following, we list the major classes of approaches for overcoming this problem.

- Multiconditional training: The idea is to train a separate set of models for each background environment likely to occur during system use. For a given acoustic environment, the most likely set of models is then found and used during the recognition process.
- Noise reduction: This approach is concerned with reducing the presence of noise in the speech signal before it is sent to the recognizer. When the models are trained in noise-free environments, this will reduce the mismatch between the input speech signal and the models. The most common approach is to apply noise spectral subtraction.
- Model compensation and adaptation: Instead of modifying the speech signal to better comply with the models, in this approach the models are changed according to the statistical characteristics of the noise to better comply with the noisy speech.
- Robust speech parameterization: The aim is to find a speech representation that is invariant to changes in the acoustic environment. Note that this approach differs from the other approaches in that it does not require knowledge of a particular acoustic environment during the use of the system. In the rest of this paper, we will focus on this approach.

4. SPEECH PARAMETERIZATION

This section starts with a summary of the major processing steps involved in conventional methods for speech parameterization. It proceeds by explaining the idea behind auditory based methods, which have been shown to outperform the conventional methods in noisy conditions. The major differences between the two classes of methods are then explained from the signal processing point of view. At the end, a new parameterization method is described that combines the advantages of both conventional and auditory based methods.

4.1. Conventional Methods

Conventional methods for speech parameterization are based on extracting information from the short-term power spectrum of speech. The speech signal is divided into overlapping speech frames of 20-30 ms length, as the speech signal can be regarded as stationary over such short intervals. The short-term power spectrum is estimated for each frame using either the discrete Fourier transform (DFT), the fast Fourier transform (FFT), filter bank analysis or linear prediction analysis. The resulting spectral representation is usually modified by applying some auditory motivated processing. At the end, it is usual to perform a decorrelation transformation, as this simplifies the recognition process.

Mel-frequency cepstrum coefficients (MFCC) are the most widely used speech parameters for ASR. Figure 2 illustrates the major processing steps involved in their computation. The short-term speech spectrum is estimated using FFT. It is passed through a filter bank consisting of overlapping triangular bandpass filters uniformly distributed along the perceptually based mel-frequency scale. The choice of the filter bank is motivated by knowledge of human hearing. A vector of subband log-energies is then computed and sent to a discrete cosine transform (DCT) for decorrelation purposes. The resulting DCT coefficients, referred to as MFCC, serve as the final representation of the given speech frame.

Figure 2: Illustration of MFCC computation

In the case of noisy speech, the subband energies are affected by noise, and the resulting speech representation differs from the one for clean speech. Thus, if an ASR system is trained on clean speech and used in noisy conditions, the mismatch can cause a large performance degradation.

4.2. Auditory Based Methods

Humans have a fascinating ability to recognize speech in noisy acoustic environments. Thus, there is a belief that the robustness of ASR systems could be considerably improved by simulating the processes in the human auditory system. However, not all the processes in human speech recognition are well understood, and auditory based methods for speech parameterization have to rely on some heuristics.

Probably the best known auditory based parameters for ASR are the so-called Ensemble Interval Histograms (EIH) [1]. In this paper, we will present a slight modification of these parameters referred to as Zero Crossings with Peak Amplitudes (ZCPA) [2].
These parameters have been shown to outperform both the EIH and all of the conventional parameterization methods in the presence of additive noise.

An illustration of the ZCPA method is shown in Figure 3. A frame of the given speech signal is passed through a filter bank of bandpass filters. The filtering is done in the time domain. The resulting subband signals are sent to zero-crossing detectors. The interval between each pair of successive zero-crossings is measured together with the signal peak amplitude between the zero-crossings. Then, the inverse intervals between successive zero-crossings over all the subband signals are recorded in a histogram. Each histogram entry is weighted by the logarithm of the corresponding peak amplitude. Finally, the DCT is performed for decorrelation purposes.

Figure 3: Illustration of ZCPA computation

Note that the ZCPA computation represents an alternative way of performing spectral analysis. The inverse intervals between successive zero-crossings represent the instantaneous dominant frequencies of the subband signal. The peak amplitudes, on the other hand, represent a measure of the instantaneous energy of the subband signal. The histogram bins containing the dominant frequencies are increased by the corresponding energy measures. Thus, the resulting histogram is an alternative representation of the signal spectrum.

While the MFCC is based only on the subband energy computation, ZCPA efficiently combines the energy and dominant frequency information. We believe that this difference can be a part of the explanation for the ZCPA's superior performance in noisy conditions. The dominant speech frequencies are much less affected by the presence of additive noise than the subband energy measures. Thus, incorporation of the dominant frequencies in the speech parameter vector can lead to increased robustness against additive noise.

However, the ZCPA computation is prohibitively expensive for use in practical ASR systems. This is due to time-domain processing and the need for heavy interpolation of the higher frequency subband signals in order to obtain precise zero-crossing locations.

4.3. Subband Spectral Centroid Histograms

Motivated by the good noise robustness of the ZCPA parameters and the computational efficiency of the MFCC parameters, we searched for a way to design a new parameterization method that would be more robust than MFCC but have an acceptable computational cost. We believed that this task could be achieved by finding a more computationally efficient method for incorporating the dominant frequency information.

In [3] it has been shown that Subband Spectral Centroids (SSC) are closely related to the dominant speech frequencies. Using SSC as additional features to MFCC has been shown to increase the robustness of ASR systems against additive noise [3, 4, 5, 6, 7]. We proposed a new framework for combining the SSC and subband energies through the construction of Subband Spectral Centroid Histograms (SSCH) [8, 9].

An illustration of the processing steps involved in the SSCH computation is shown in Figure 4. The speech power spectrum is estimated using FFT, and filtering is performed in the frequency domain to produce a number of subband signals.
This part of the processing is analogous to the MFCC method. The dominant frequency of each subband signal is estimated by the subband centroid. In addition, a subband energy measure is computed in a similar way as for the MFCC method. The dominant frequency and energy information over all the subbands are combined in a single histogram in the same way as for the ZCPA method. Finally, the DCT is performed for decorrelation purposes.

Figure 4: Illustration of SSCH computation

This method uses the same conceptual information as the ZCPA method. However, note that the dominant frequencies are now estimated from the short-term power spectrum. This is a disadvantage in noisy conditions, as the spectrum itself is corrupted by noise. On the other hand, the fact that the processing is done in the spectral domain dramatically reduces the computational cost compared to ZCPA. It is now of the same order as that of the MFCC computation.

5. EXPERIMENTAL STUDY

This section describes an experimental study performed to compare the performance of the described methods on an ASR task in various background conditions.

5.1. Task and Database

The methods were evaluated on the ISOLET Spoken Letter Database [10] down-sampled to 8 kHz. The database consists of English letters spoken in isolation, recorded in a quiet room. Two repetitions of each word were recorded for each speaker. Utterances from 90 speakers were used for training, while utterances from 30 speakers were used for evaluation. Although the vocabulary consisting of 26 English letters is rather small, this is not a simple recognition task, since the vocabulary words are very short and highly confusable.

Noisy speech was artificially created by adding to the original test set four different noise types at four different signal-to-noise ratios (SNR). Those are: white Gaussian noise, factory noise, car noise and background speech. The last three noise types were taken from the NOISEX database, where they are referred to as factory1, volvo and babble noise, respectively. A segment of the noise file equal in length to the speech file was randomly extracted and added to the speech file at the required SNR. SNR was computed as the ratio between the maximal frame energy of the speech file and the average energy of the noise segment. This way of computing makes the SNR independent of the duration of the surrounding silence in the speech files.

Model training and recognition were performed using the HTK speech recognition toolkit [11]. One hidden Markov model (HMM) with five states and five Gaussian mixtures per state was trained for each vocabulary word.

5.2. Choice of Free Parameters

In the following, we summarize the most important parameters involved in the MFCC, ZCPA and SSCH computation.

- MFCC: Frame length was set to 25 ms. The filter bank consisted of 24 overlapping triangular filters uniformly spaced along the mel-frequency scale. 12 DCT coefficients were used. This is the standard parameter setting for the MFCC computation. It has not been optimized for the particular task.
- ZCPA: The filter bank consisted of 20 bandpass FIR filters linearly spaced on the Bark-frequency scale (a perceptually based frequency scale similar to the mel-frequency scale), with bandwidths equal to 2 Bark. The filters had order 61 and were designed using the windowing method. Frequency dependent frame lengths equal to 20/f_c were used, where f_c is the center frequency of the corresponding bandpass filter. The number of histogram bins was 26. The number of DCT coefficients was 12.
- SSCH: Frame length was set to 25 ms. The filter bank consisted of 65 rectangular filters. In the low frequency range, the filter bandwidth was 300 Hz and the filters were linearly spaced along the frequency scale. In the high frequency region, the filter bandwidth was 2 Bark and the filters were linearly spaced along the Bark-frequency scale.
12 DCT coefficients were computed from 26 histogram bins.

Delta and delta-delta parameters were computed in addition to the static parameters for all of the methods, resulting in 36-dimensional parameter vectors.

5.3. Experimental Results

Table 1 shows the results of the evaluation of the MFCC, SSCH and ZCPA parameterization methods on both clean and noisy versions of the ISOLET database. Model training was performed using clean speech. The recognition performance was measured in terms of word accuracy.

Table 1: Word accuracy for different parameterization methods (MFCC, SSCH, ZCPA) in various acoustic environments: a) white Gaussian noise, b) car noise, c) factory noise, d) background speech

Looking at the results in Table 1, we see that MFCC performs best on clean speech. However, even in the presence of only a small amount of noise, the situation changes completely, and MFCC becomes the worst of the three methods. This confirms the lack of robustness of the MFCC parameters.

SSCH is significantly more robust than MFCC for all the noise types. The improvement is largest for car noise, and smallest in the presence of background speech. The relatively poor performance in the presence of background speech is probably due to the existence of speech-like spectral peaks in the background signal.

SSCH even outperforms ZCPA in the case of car noise, while ZCPA is more robust in the presence of the other noise types. However, it is important to note that ZCPA cannot be used in place of SSCH in practical applications, due to its prohibitive computational cost.

6. CONCLUSIONS

In this paper, we addressed the robustness problem of ASR systems against additive background noise. One way of overcoming this problem is to find a speech parameterization that is less influenced by additive noise than the conventional parameters. We compared the steps involved in conventional and auditory based methods, and concluded that the superior performance of the auditory methods can be explained by the incorporation of the dominant spectral frequencies into parameter vectors. A new speech parameterization method was described that computes the dominant spectral frequencies in a more efficient way, from the short-term spectrum of speech. This method also outperformed the conventional methods in noisy conditions, confirming the importance of utilizing the dominant spectral frequencies for increasing the robustness of ASR systems.

REFERENCES

[1] O. Ghitza, "Auditory models and human performance in tasks related to speech coding and speech recognition," IEEE Trans. on Speech and Audio Processing, vol. 2, January.
[2] D.-S. Kim, S.-Y. Lee, and R. M. Kil, "Auditory processing of speech signals for robust speech recognition in real-world noisy environments," IEEE Trans. on Speech and Audio Processing, vol. 7, January.
[3] K. K. Paliwal, "Spectral subband centroid features for speech recognition," in Proc. ICASSP, vol. 2, May.
[4] S. Tsuge, T. Fukada, and H. Singer, "Speaker normalized spectral subband parameters for noise robust speech recognition," in Proc. ICASSP, May.
[5] D. Albesano, R. D. Mori, R. Gemello, and F. Mana, "A study of the effect of adding new dimensions to trajectories in the acoustic space," in Proc. EUROSPEECH, vol. 4, September.
[6] R. D. Mori, D. Albesano, R. Gemello, and F. Mana, "Ear-model derived features for automatic speech recognition," in Proc. ICASSP.
[7] E. Gjelsvik, "Modification of front-end processing for robust speech recognition," Diploma thesis, Norwegian University of Science and Technology, June.
[8] B. Gajić and K. K. Paliwal, "Robust feature extraction using subband spectral centroid histograms," in Proc. ICASSP, May 2001.
[9] B. Gajić and K. K. Paliwal, "Robust parameters for speech recognition based on subband spectral centroid histograms," in Proc. EUROSPEECH, September.
[10] R. A. Cole, Y. K. Muthusamy, and M. Fanty, "The ISOLET spoken letter database," Technical report CSE, Oregon Graduate Institute of Science and Technology, Beaverton, OR, USA, March.
[11] S. Young, D. Kershaw, J. Odell, D. Ollason, V. Valtchev, and P. Woodland, The HTK Book. Entropic.
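
As an illustration only, the SSCH processing steps described above under Subband Spectral Centroid Histograms (power spectrum, rectangular subband filters, subband centroid and energy, histogram construction, DCT) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the Hamming window, the naive DFT (used in place of an FFT to avoid dependencies), the log(1 + energy) histogram weight, and the caller-supplied band and bin edges are all hypothetical choices made for readability.

```python
import cmath
import math


def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of a Hamming-windowed frame, naive DFT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def subband_centroid_histogram(frame, sample_rate, band_edges, bin_edges):
    """Accumulate log-energy weights into the bins holding subband centroids."""
    power = power_spectrum(frame)
    freqs = [k * sample_rate / len(frame) for k in range(len(power))]
    hist = [0.0] * (len(bin_edges) - 1)
    for low, high in band_edges:  # rectangular subband filters (assumed edges)
        idx = [k for k, f in enumerate(freqs) if low <= f < high]
        energy = sum(power[k] for k in idx)
        if energy <= 0.0:
            continue  # empty or silent subband contributes nothing
        # Subband spectral centroid: power-weighted mean frequency.
        centroid = sum(freqs[k] * power[k] for k in idx) / energy
        for b in range(len(hist)):
            if bin_edges[b] <= centroid < bin_edges[b + 1]:
                # Assumption: log(1 + energy) keeps the weight non-negative.
                hist[b] += math.log(1.0 + energy)
                break
    return hist


def dct2(values, num_coeffs):
    """Unnormalized DCT-II, returning the first num_coeffs coefficients."""
    m = len(values)
    return [sum(v * math.cos(math.pi * c * (j + 0.5) / m)
                for j, v in enumerate(values))
            for c in range(num_coeffs)]


def ssch(frame, sample_rate, band_edges, bin_edges, num_coeffs):
    """SSCH-style static parameter vector for one speech frame."""
    hist = subband_centroid_histogram(frame, sample_rate, band_edges, bin_edges)
    return dct2(hist, num_coeffs)
```

With a 25 ms frame at 8 kHz (200 samples), 27 bin edges (26 histogram bins) and `num_coeffs=12`, a call such as `ssch(frame, 8000, band_edges, bin_edges, 12)` returns one 12-dimensional static vector, matching the dimensions quoted in the parameter settings. The filter bank layout used in the paper (65 filters, 300 Hz wide at low frequencies and 2 Bark wide at high frequencies) is not reproduced here; any list of `(low, high)` pairs in Hz can be passed instead.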

I D I A P. On Factorizing Spectral Dynamics for Robust Speech Recognition R E S E A R C H R E P O R T. Iain McCowan a Hemant Misra a,b

I D I A P. On Factorizing Spectral Dynamics for Robust Speech Recognition R E S E A R C H R E P O R T. Iain McCowan a Hemant Misra a,b R E S E A R C H R E P O R T I D I A P On Factorizing Spectral Dynamics for Robust Speech Recognition a Vivek Tyagi Hervé Bourlard a,b IDIAP RR 3-33 June 23 Iain McCowan a Hemant Misra a,b to appear in

More information

Mel Spectrum Analysis of Speech Recognition using Single Microphone

Mel Spectrum Analysis of Speech Recognition using Single Microphone International Journal of Engineering Research in Electronics and Communication Mel Spectrum Analysis of Speech Recognition using Single Microphone [1] Lakshmi S.A, [2] Cholavendan M [1] PG Scholar, Sree

More information

Introduction to OFDM. Characteristics of OFDM (Orthogonal Frequency Division Multiplexing)

Introduction to OFDM. Characteristics of OFDM (Orthogonal Frequency Division Multiplexing) Introduction to OFDM Characteristics o OFDM (Orthogonal Frequency Division Multiplexing Parallel data transmission with very long symbol duration - Robust under multi-path channels Transormation o a requency-selective

More information

Modulation Spectrum Power-law Expansion for Robust Speech Recognition

Modulation Spectrum Power-law Expansion for Robust Speech Recognition Modulation Spectrum Power-law Expansion for Robust Speech Recognition Hao-Teng Fan, Zi-Hao Ye and Jeih-weih Hung Department of Electrical Engineering, National Chi Nan University, Nantou, Taiwan E-mail:

More information

I D I A P. Mel-Cepstrum Modulation Spectrum (MCMS) Features for Robust ASR R E S E A R C H R E P O R T. Iain McCowan a Hemant Misra a,b

I D I A P. Mel-Cepstrum Modulation Spectrum (MCMS) Features for Robust ASR R E S E A R C H R E P O R T. Iain McCowan a Hemant Misra a,b R E S E A R C H R E P O R T I D I A P Mel-Cepstrum Modulation Spectrum (MCMS) Features for Robust ASR a Vivek Tyagi Hervé Bourlard a,b IDIAP RR 3-47 September 23 Iain McCowan a Hemant Misra a,b to appear

More information

Sinusoidal signal. Arbitrary signal. Periodic rectangular pulse. Sampling function. Sampled sinusoidal signal. Sampled arbitrary signal

Sinusoidal signal. Arbitrary signal. Periodic rectangular pulse. Sampling function. Sampled sinusoidal signal. Sampled arbitrary signal Techniques o Physics Worksheet 4 Digital Signal Processing 1 Introduction to Digital Signal Processing The ield o digital signal processing (DSP) is concerned with the processing o signals that have been

More information

ECE5984 Orthogonal Frequency Division Multiplexing and Related Technologies Fall Mohamed Essam Khedr. Channel Estimation

ECE5984 Orthogonal Frequency Division Multiplexing and Related Technologies Fall Mohamed Essam Khedr. Channel Estimation ECE5984 Orthogonal Frequency Division Multiplexing and Related Technologies Fall 2007 Mohamed Essam Khedr Channel Estimation Matlab Assignment # Thursday 4 October 2007 Develop an OFDM system with the

More information

Comparison of Spectral Analysis Methods for Automatic Speech Recognition

Comparison of Spectral Analysis Methods for Automatic Speech Recognition INTERSPEECH 2013 Comparison of Spectral Analysis Methods for Automatic Speech Recognition Venkata Neelima Parinam, Chandra Vootkuri, Stephen A. Zahorian Department of Electrical and Computer Engineering

More information

RASTA-PLP SPEECH ANALYSIS. Aruna Bayya. Phil Kohn y TR December 1991

RASTA-PLP SPEECH ANALYSIS. Aruna Bayya. Phil Kohn y TR December 1991 RASTA-PLP SPEECH ANALYSIS Hynek Hermansky Nelson Morgan y Aruna Bayya Phil Kohn y TR-91-069 December 1991 Abstract Most speech parameter estimation techniques are easily inuenced by the frequency response

More information

Speech Enhancement in Presence of Noise using Spectral Subtraction and Wiener Filter

Speech Enhancement in Presence of Noise using Spectral Subtraction and Wiener Filter Speech Enhancement in Presence of Noise using Spectral Subtraction and Wiener Filter 1 Gupteswar Sahu, 2 D. Arun Kumar, 3 M. Bala Krishna and 4 Jami Venkata Suman Assistant Professor, Department of ECE,

More information

MFCC-based perceptual hashing for compressed domain of speech content identification

MFCC-based perceptual hashing for compressed domain of speech content identification Available online www.jocpr.com Journal o Chemical and Pharmaceutical Research, 014, 6(7):379-386 Research Article ISSN : 0975-7384 CODEN(USA) : JCPRC5 MFCC-based perceptual hashing or compressed domain

More information

Spectral estimation using higher-lag autocorrelation coefficients with applications to speech recognition

Spectral estimation using higher-lag autocorrelation coefficients with applications to speech recognition Spectral estimation using higher-lag autocorrelation coefficients with applications to speech recognition Author Shannon, Ben, Paliwal, Kuldip Published 25 Conference Title The 8th International Symposium

More information

Determination of Pitch Range Based on Onset and Offset Analysis in Modulation Frequency Domain

Determination of Pitch Range Based on Onset and Offset Analysis in Modulation Frequency Domain Determination o Pitch Range Based on Onset and Oset Analysis in Modulation Frequency Domain A. Mahmoodzadeh Speech Proc. Research Lab ECE Dept. Yazd University Yazd, Iran H. R. Abutalebi Speech Proc. Research

More information

Speech Signal Analysis

Speech Signal Analysis Speech Signal Analysis Hiroshi Shimodaira and Steve Renals Automatic Speech Recognition ASR Lectures 2&3 14,18 January 216 ASR Lectures 2&3 Speech Signal Analysis 1 Overview Speech Signal Analysis for

More information

Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain {jordi.bonada,

Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain   {jordi.bonada, GENERATION OF GROWL-TYPE VOICE QUALITIES BY SPECTRAL MORPHING Jordi Bonada Merlijn Blaauw Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain Email: {jordi.bonada, merlijn.blaauw}@up.edu

More information

Isolated Word Recognition Based on Combination of Multiple Noise-Robust Techniques

Isolated Word Recognition Based on Combination of Multiple Noise-Robust Techniques Isolated Word Recognition Based on Combination of Multiple Noise-Robust Techniques 81 Isolated Word Recognition Based on Combination of Multiple Noise-Robust Techniques Noboru Hayasaka 1, Non-member ABSTRACT

More information

Robust telephone speech recognition based on channel compensation

Robust telephone speech recognition based on channel compensation Pattern Recognition 32 (1999) 1061}1067 Robust telephone speech recognition based on channel compensation Jiqing Han*, Wen Gao Department of Computer Science and Engineering, Harbin Institute of Technology,

More information

A STUDY ON CEPSTRAL SUB-BAND NORMALIZATION FOR ROBUST ASR

A STUDY ON CEPSTRAL SUB-BAND NORMALIZATION FOR ROBUST ASR A STUDY ON CEPSTRAL SUB-BAND NORMALIZATION FOR ROBUST ASR Syu-Siang Wang 1, Jeih-weih Hung, Yu Tsao 1 1 Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan Dept. of Electrical

More information

Dimension Reduction of the Modulation Spectrogram for Speaker Verification

Dimension Reduction of the Modulation Spectrogram for Speaker Verification Dimension Reduction of the Modulation Spectrogram for Speaker Verification Tomi Kinnunen Speech and Image Processing Unit Department of Computer Science University of Joensuu, Finland Kong Aik Lee and

More information

Robust Speech Feature Extraction using RSF/DRA and Burst Noise Skipping

Robust Speech Feature Extraction using RSF/DRA and Burst Noise Skipping 100 ECTI TRANSACTIONS ON ELECTRICAL ENG., ELECTRONICS, AND COMMUNICATIONS VOL.3, NO.2 AUGUST 2005 Robust Speech Feature Extraction using RSF/DRA and Burst Noise Skipping Naoya Wada, Shingo Yoshizawa, Noboru

More information

Time-Frequency Distributions for Automatic Speech Recognition

Time-Frequency Distributions for Automatic Speech Recognition 196 IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 9, NO. 3, MARCH 2001 Time-Frequency Distributions for Automatic Speech Recognition Alexandros Potamianos, Member, IEEE, and Petros Maragos, Fellow,

More information

Different Approaches of Spectral Subtraction Method for Speech Enhancement

Different Approaches of Spectral Subtraction Method for Speech Enhancement ISSN 2249 5460 Available online at www.internationalejournals.com International ejournals International Journal of Mathematical Sciences, Technology and Humanities 95 (2013 1056 1062 Different Approaches

More information

Robust Voice Activity Detection Based on Discrete Wavelet. Transform
