A LPC-PEV Based VAD for Word Boundary Detection


Syed Abbas Ali (A), Najmi Ghani Haider (B) and Mahmood Khan Pathan (C)

(A) Faculty of Computer & Information Systems Engineering, N.E.D University of Engg. & Tech., Karachi, Pakistan. saaj@neduet.edu.pk
(B) Faculty of Computer Science & Information Technology, N.E.D University of Engg. & Tech., Karachi, Pakistan. najmi@neduet.edu.pk
(C) Faculty of Computer Science & Information Technology, N.E.D University of Engg. & Tech., Karachi, Pakistan. mkpathan@hotmail.com

Abstract
Speech is the prominent mode of communication and interaction between human beings and machines. In most practical applications of automatic speech recognition (ASR), the input speech is contaminated by ambient noise. In this paper, the LPC-PEV algorithm is proposed for word boundary detection in the presence of noise. In the proposed algorithm, the variance of the prediction error is estimated from an AR model using the Yule-Walker method. We used speech and noise samples taken from the TIDIGITS corpus and the NOISEX-92 database respectively, and evaluated the performance of the system using LPC-PEV and the LPC residual error. The experimental results, based on the numerical values of the detected boundary samples, show that the boundary detection rate of the proposed algorithm is far better than that of the LPC residual error for both clean and noisy speech.

Index Terms: Predictive capability, linear prediction coefficient, precision, residual error, prediction error variance (PEV).

I. INTRODUCTION
Voice activity detection (VAD) is widely used to identify the presence of speech in an input signal by marking the boundaries of speech and non-speech segments. A speech recognition system can be considerably improved by integrating a VAD module into it. VAD is a classification problem in which features of the audio signal are used to separate speech from non-speech. End point accuracy is one of the major factors in recognition performance. Good end point detection leads to efficient computation.
Also, it leads to accurate recognition, since proper end points result in good alignment for model building and template comparison [1]. A precise VAD reduces the computation cost and response time of a speech recognition system, because only the detected speech frames are passed to the recognition algorithm. Likewise, accurate detection of the end points of speech in the presence of ambient noise avoids wasting recognition effort on the ensuing silence [2]. The noise reduction task can also be improved by removing the unwanted portion of the speech signal.

Feature extraction is an essential VAD module, and one of the dominant approaches to feature extraction works in the temporal domain. Linear prediction (LP) analysis is a prime example of a temporal-domain approach. LPC based algorithms perform well in high-level Gaussian-like ambient noise, short pulses, transient pulses and low-level noise. LPC parameters describe the spectral peaks by forming a perceptually attractive description of the spectral envelope. The LP model represents the human vocal tract as an infinite impulse response (IIR) filter that produces the speech signal. The LPC are the coefficients of an autoregressive (AR) model of a speech frame and give the all-pole representation of the vocal tract transfer function.

Although LP is a well-known method for finding all-pole model parameters, LP spectral envelopes overemphasize and overestimate the spectral power of high and medium pitch voiced speech, and thus contain unwanted sharp contours; moreover, the spectral envelope modeling performance does not improve as the order of the filter increases. The LP model assumes that speech is the output of an all-pole filter whose excitation is either a single impulse or a random noise sequence. In practice these assumptions are not exactly valid for observed speech, especially for voiced speech, where the excitation source is a pulse train with a certain pitch period.
As a result, the speech samples close to the pitch pulses are not predicted well and the residual error is relatively high in the vicinity of each pitch pulse. The LPC coefficients estimated by the autocorrelation or covariance method therefore contain a certain amount of error when compared to the actual values. This error is called the prediction error. Exploiting this error, we propose a statistical approach that monitors the variance of the prediction error. The proposed LPC-PEV measures the predictive capability of the LP model and the precision of its predictions, which prevents the overestimation and overemphasis of speech spectral power for high and medium pitch voiced speech. This precision value helps the classifier improve the boundary detection rate of the spoken word in the presence of noise.

The rest of the paper is organized as follows. Section II explains the computation of the LPC coefficients using the autocorrelation method. The proposed LPC-PEV algorithm framework is presented in Section III. In Section IV, we discuss the VAD module based on the proposed LPC-PEV algorithm. We present our experimental results and discussion in Section V. Finally, the conclusion is drawn in Section VI.

II. LPC AND RESIDUAL ERROR ANALYSIS
Among speech recognition approaches, the LPC family is well known for its performance and as an effective way of estimating the main parameters of a speech signal. The parameter estimation method of an autoregressive model for speech is well established in [3]. For our discussion, consider a voiced speech signal $S(n) = \{s_1, s_2, \ldots, s_N\}$ having N samples, which can be modeled as a sum of harmonically related sinusoids:

$$S(n) = \sum_{k=1}^{L} c_k \cos(\omega_0 k n + \phi_k) \tag{1}$$

where $\omega_0$ is the pitch frequency of the signal, $c_k$ and $\phi_k$ are the amplitude and phase of the k-th harmonic, and L is the number of harmonics, $L = \lfloor f_s / (2 f_0) \rfloor$, where $f_0$ is the pitch frequency in Hz and $f_s$ is the sampling frequency, typically 8 kHz for speech. LPC analysis works under the assumption that the current sample can be approximately predicted by a linear combination of the p past samples,

$$\hat{S}(n) = -\sum_{k=1}^{p} a_k S(n-k) \tag{2}$$

where $a_1, a_2, \ldots, a_p$ are the predictor coefficients and p is the order of the LPC analysis, which is taken to be 10 for speech sampled at 8 kHz. The predictor coefficients $a_k$ are calculated by minimizing the sum of squared differences, over a finite interval, between the actual speech samples and the linearly predicted ones. The error between the actual and the predicted sample value is called the residual error and is given by

$$e(n) = S(n) - \hat{S}(n) = S(n) - \Big( -\sum_{k=1}^{p} a_k S(n-k) \Big) = S(n) + \sum_{k=1}^{p} a_k S(n-k) \tag{3}$$

Because the short-term correlation between samples of the residual signal is low, the envelope of its power spectrum is approximately flat. Taking the z-transform of (3),

$$E(z) = A(z)\, S(z) \tag{4}$$

$$A(z) = 1 + \sum_{k=1}^{p} a_k z^{-k} \tag{5}$$

where A(z) is the transfer function of the LP analysis filter, an FIR filter that removes the short-term correlation present in the speech signal and thereby flattens the spectrum. E(z) and S(z) are the residual signal and the speech signal in the z-domain, respectively. The residual is the error between the current input sample and its predicted value based on the p past sample values.
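As a small illustration of the residual error in (3), the sketch below filters a signal with the analysis filter A(z) from (5). It assumes the sign convention A(z) = 1 + sum of a_k z^-k, so that e(n) = S(n) + sum of a_k S(n-k), and it treats samples before the start of the signal as zero. The function name `lp_residual` and the toy first-order signal are illustrative choices, not part of the paper.

```python
def lp_residual(signal, coeffs):
    """Return e(n) = s(n) + sum_k a_k * s(n - k), taking s(n) = 0 for n < 0."""
    residual = []
    for n, sample in enumerate(signal):
        acc = sample
        for k, a_k in enumerate(coeffs, start=1):
            if n - k >= 0:
                acc += a_k * signal[n - k]
        residual.append(acc)
    return residual


# Toy first-order example (assumed data): s(n) = 0.9 * s(n - 1),
# so the single coefficient a_1 = -0.9 predicts every sample after the first.
toy_signal = [0.9 ** n for n in range(8)]
toy_residual = lp_residual(toy_signal, [-0.9])
```

In this toy case only the first sample cannot be predicted, so the residual is concentrated at n = 0. This mirrors the observation that the residual of voiced speech is largest near the excitation pulses.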
Since E(z) has an approximately flat spectrum, LPC analysis models the short-time power spectral envelope of the speech signal with the autoregressive model

$$H(z) = \frac{1}{A(z)} \tag{6}$$

The LPC coefficients are computed from the speech samples by minimizing the total squared residual error

$$E = \sum_{n} e^2(n) = \sum_{n} \Big( S(n) + \sum_{k=1}^{p} a_k S(n-k) \Big)^2 \tag{7}$$

The summation range in (7) depends on which of the two methods, autocorrelation or covariance, is used for the LPC analysis. We used the autocorrelation method in our experiments. In the autocorrelation method, short-term LP analysis is achieved by windowing the speech signal. The Hamming window is chosen over the rectangular window, and the samples outside the window are assumed to be zero. The error minimization criterion in (7) leads to the following equations:

$$\sum_{k=1}^{p} a_k R(|i-k|) = -R(i), \quad 1 \le i \le p \tag{8}$$

$$R(i) = \sum_{n=i}^{N-1} w(n) S(n)\, w(n-i) S(n-i) \tag{9}$$

where w(n) is the window function and N is the frame duration in samples. Equation (8) is called the Yule-Walker equation [4]. In matrix form:

$$\mathbf{R}\,\mathbf{a} = -\mathbf{r} \tag{10}$$

$$\mathbf{R} = \begin{bmatrix} R(0) & R(1) & \cdots & R(p-1) \\ R(1) & R(0) & \cdots & R(p-2) \\ \vdots & \vdots & \ddots & \vdots \\ R(p-1) & R(p-2) & \cdots & R(0) \end{bmatrix} \tag{11}$$

$$\mathbf{a} = [a_1, a_2, \ldots, a_p]^T \tag{12}$$

$$\mathbf{r} = [R(1), R(2), \ldots, R(p)]^T \tag{13}$$

The Toeplitz structure of the autocorrelation matrix in (11) facilitates the solution of the Yule-Walker equations (8) and (10) for the LP coefficients by the Levinson-Durbin [5] and Schur [6] algorithms. This structure also ensures that the poles of the LP synthesis filter H(z) obtained with the autocorrelation method lie inside the unit circle, which keeps H(z) stable.

III. VARIANCE OF THE PREDICTION ERROR
LPC analysis works under the assumption that the speech signal can be modeled as the output of the all-pole filter H(z) in (6). The excitation to this filter is assumed to be either a single impulse (voiced speech) or white random noise (unvoiced speech). In practice, these assumptions are not exactly valid for observed speech signals. As a result, the LP coefficients estimated by the autocorrelation method contain a certain amount of error, which is called the prediction error.
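The autocorrelation method behind the Yule-Walker equations (8) to (13) can be sketched in a few lines. The code below is a minimal illustration, not the authors' MATLAB implementation: it applies a Hamming window, computes the short-time autocorrelation, and solves the Toeplitz system with the Levinson-Durbin recursion under the convention A(z) = 1 + sum of a_k z^-k. The function names and the synthetic sine frame are assumptions made for the example. The recursion also returns its final prediction error, which becomes a per-sample variance estimate when the autocorrelation is divided by the frame length.

```python
import math


def hamming(length):
    """Hamming window w(n) = 0.54 - 0.46 * cos(2 * pi * n / (N - 1))."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]


def autocorrelation(frame, max_lag):
    """R(i) = sum_n x(n) * x(n - i), i = 0..max_lag; samples outside the frame are zero."""
    return [sum(frame[n] * frame[n - i] for n in range(i, len(frame)))
            for i in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve sum_k a_k R(|i - k|) = -R(i); return ([a_1..a_p], final prediction error)."""
    a = [1.0] + [0.0] * order
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:
            break  # degenerate (all-zero or perfectly predictable) frame
        acc = r[i] + sum(a[k] * r[i - k] for k in range(1, i))
        reflection = -acc / error
        new_a = a[:]
        for k in range(1, i):
            new_a[k] = a[k] + reflection * a[i - k]
        new_a[i] = reflection
        a = new_a
        error *= 1.0 - reflection * reflection
    return a[1:], error


# Example on an assumed synthetic 80-sample frame, analysed with order 2.
frame = [math.sin(0.3 * n) for n in range(80)]
windowed = [w * s for w, s in zip(hamming(len(frame)), frame)]
r = autocorrelation(windowed, 2)
coeffs, pev = levinson_durbin([v / len(frame) for v in r], 2)
```

Each step of the recursion multiplies the error by one minus the squared reflection coefficient, so the error cannot grow as the order increases.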
Because of this prediction error, the speech samples near the pitch pulses are not predicted well and the prediction error is relatively large in the region of these pitch pulses [7], which affects the estimation accuracy of LPC analysis [8]. The Levinson-Durbin algorithm can be used to compute the filter coefficients $a_k$ of the LP method that minimize the mean squared error (MSE). For a voiced signal, the LP spectrum $P_{LP}(\omega)$ can be viewed as a spectral envelope whose samples provide an estimate of the voiced speech power [3]:

$$P_{LP}(\omega) = \frac{P_{EV}}{|A(e^{j\omega})|^2} \tag{14}$$

where $P_{EV}$ is the variance of the prediction error. From the speech model in (1), the speech signal has the correlation sequence

$$r(m) = \sum_{k=1}^{L} \frac{c_k^2}{2} \cos(\omega_0 k m) \tag{15}$$

The power spectrum is a discrete line spectrum with power concentrated at the harmonic frequencies $\pm\omega_0 k$, i.e.

$$P(\omega) = \sum_{k=1}^{L} \frac{\pi c_k^2}{2} \big[ \delta(\omega + \omega_0 k) + \delta(\omega - \omega_0 k) \big] \tag{16}$$

The variance of the prediction error $P_{EV}$ is the MSE of the output of the filter A(z) and is given by

$$P_{EV} = \frac{1}{2\pi} \int_{-\pi}^{\pi} |A(e^{j\omega})|^2 P(\omega)\, d\omega = \sum_{k=1}^{L} \frac{c_k^2}{2} |A(e^{j\omega_0 k})|^2 \tag{17}$$

In general, the variance of the prediction error $P_{EV}$ is a very useful way to examine the predictive capability of a model. It provides a measure of the precision of the model's predictions. LP is a well-known method for obtaining all-pole model parameters. However, LP spectral envelopes overemphasize and overestimate the spectral power of high and medium pitch voiced speech, thus featuring unwanted sharp contours, and the spectral envelope modeling performance decreases as the order of the filter increases [9].

International Journal of Electrical & Computer Sciences IJECS-IJENS, Vol: 12 No: 02

IV. LPC-PEV BASED VAD MODULE
End point detection algorithms are widely used as a preprocessing step on the sound wave in order to cut off the unwanted portions present at the beginning and end of the speech sample. The VAD decision is normally based on feature vectors. For the VAD decision, we assume that the speech and noise are additive. A VAD normally consists of the following modules: feature extraction, start and end point decision, and decision smoothing. In our experiment, the algorithm used in the VAD to detect the end points of the spoken word is LPC-PEV. The VAD module includes end point detection of speech.

Fig. 1 Flow diagram of the LPC-PEV based VAD module

Fig. 1 provides an overview of the LPC-PEV based VAD module. Pre-emphasis is done to flatten the signal spectrally. A first-order high-pass FIR filter is used to emphasize the higher frequency components. For this purpose, a single-coefficient digital filter is used. Spectral normalization is performed to compensate for the distortion introduced into the speech signal by linear convolution, which may result from filtering by the human vocal tract, room acoustics and the transfer function of an input device such as a microphone. Spectral subtraction is done to cancel out any stationary noise in the input speech signal. It is a commonly used technique, and hence it is taken from the VOICEBOX toolbox for MATLAB.
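The pre-emphasis step mentioned above, a first-order high-pass FIR filter with a single coefficient, can be sketched as follows. The paper does not state the coefficient value, so `alpha = 0.95` below is only a placeholder of the kind often used in speech front ends, and the function name `pre_emphasize` is hypothetical.

```python
def pre_emphasize(signal, alpha=0.95):
    """First-order high-pass FIR: y(n) = x(n) - alpha * x(n - 1), with x(-1) = 0.

    alpha is an assumed placeholder; the source does not give the coefficient.
    """
    out = []
    previous = 0.0
    for sample in signal:
        out.append(sample - alpha * previous)
        previous = sample
    return out


# A constant (zero-frequency) input is strongly attenuated after the first sample.
flat = pre_emphasize([1.0, 1.0, 1.0, 1.0], alpha=0.5)
```

Because the filter subtracts a scaled copy of the previous sample, slowly varying components are suppressed while rapid changes pass through, which is what lifts the high-frequency end of the spectrum.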
Framing is done to decompose the speech signal into a series of overlapping frames. Both algorithms are designed for a frame length of 10 milliseconds. The sampling frequency is 8 kHz; therefore, there are 10 ms × 8 kHz = 80 samples per frame. It is also better to use overlapping windows to ensure better temporal continuity in the transform domain; therefore, a 75% overlap rate is used. The autoregressive model order was initially selected as 10 and the smoothing order as 20. We then estimated the AR all-pole model using the Yule-Walker method [10].

The end point decision module (start and end) defines the rules for assigning the speech or silence class to the LPC-PEV features extracted in the previous block. The start and end point decisions are based on thresholds, which are different for each speech sample. In order to smooth the decision curve, a moving average of the LPC-PEV values is computed. This is taken as the final decision curve. The decision smoothing makes the algorithm robust against noise. Also, some hang-over algorithms use this smoothed decision curve in order to recover speech that is masked by the ambient noise [11].

V. EXPERIMENTAL RESULTS
Experiments were done for word boundary detection in the presence of noise using two feature extraction techniques: the LPC residual error (en) and the proposed LPC-PEV (PEV). The comparison is done using MATLAB version 10.0. MATLAB was used because it allows computationally expensive signal processing tasks to be implemented much more quickly than in conventional programming languages (e.g., C and C++). We used a speech processing toolbox (VOICEBOX) consisting of various MATLAB routines. The speech database used in the experiments contains isolated digits in English taken from the TIDIGITS corpus [12]. The TIDIGITS corpus consists of more than 25 thousand digit sequences spoken by over 300 men, women, and children. The data was collected in a quiet studio environment and digitized at 20 kHz. In our experiment, we used 10 utterances of each digit from 0 to 9 in an approximately clean environment with a sampling frequency of 8 kHz. Various types of noise, such as babble, white noise, pink noise and car (Volvo) noise, were taken from the NOISEX-92 noise-in-speech database [13, 14]. Two algorithms were developed in MATLAB for comparison: one extracts the LPC-PEV from the input speech signal, whereas the other bases its boundary decisions on the residual error. Both algorithms use a frame length of 10 milliseconds and a 75% overlap rate, with an AR model order of 10 and a smoothing order of 20, and the AR all-pole model is estimated using the Yule-Walker method.

In this work, we investigated both VAD algorithms with clean isolated digits (0-9) and with isolated digits (0-9) in the presence of babble noise, white noise, pink noise and car (Volvo) noise. For graphical representation, we selected the clean isolated digit zero and the same digit with white noise. Fig. 2 and Fig. 3 show the word detected by the LPC residual error algorithm for the clean isolated digit zero and for the digit zero in the presence of white noise, respectively. Similarly, Fig. 4 and Fig. 5 show the word detected by the LPC-PEV algorithm for the clean isolated digit zero and for the digit zero in the presence of white noise, respectively. The red vertical lines show the start and end point positions, in samples, of the detected isolated digit. The waveforms are of the isolated digit 0. Duration of the speech is seconds. The rapidly varying curve in each graph shows the LPC-PEV or the LPC residual error, whereas the smoothed curve shows the decision curve on which the boundary decisions are made.
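The framing arithmetic and the smoothed, threshold-based end point decision described above can be sketched as follows. With 80-sample frames and 75% overlap, consecutive frames start 20 samples apart. The paper does not spell out how its lower and upper thresholds are combined, so the rule in `detect_endpoints` (find where the curve exceeds the upper threshold, then extend outwards while it stays above the lower one) is an assumed, commonly used double-threshold scheme. All function names and the toy curve are hypothetical.

```python
def frame_starts(num_samples, frame_len=80, overlap=0.75):
    """Start indices of all full frames; 80 samples at 75% overlap gives a hop of 20."""
    hop = max(1, int(round(frame_len * (1.0 - overlap))))
    return list(range(0, num_samples - frame_len + 1, hop))


def moving_average(values, width):
    """Centered moving average; the averaging window is truncated at the edges."""
    half = width // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def detect_endpoints(curve, lower, upper):
    """Assumed double-threshold rule: return (start, end) frame indices, or None."""
    above = [i for i, v in enumerate(curve) if v > upper]
    if not above:
        return None  # the curve never exceeds the upper threshold
    start, end = above[0], above[-1]
    while start > 0 and curve[start - 1] > lower:
        start -= 1
    while end < len(curve) - 1 and curve[end + 1] > lower:
        end += 1
    return start, end


# Toy decision curve (assumed values standing in for per-frame feature values).
toy_curve = [0.0, 0.2, 0.6, 1.0, 0.9, 0.3, 0.05, 0.0]
toy_bounds = detect_endpoints(toy_curve, lower=0.1, upper=0.5)
```

Frame indices returned by `detect_endpoints` can be mapped back to sample positions through `frame_starts`, which is how a per-frame decision curve yields boundaries expressed in samples.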
The bottom green horizontal line indicates the lower threshold, whereas the upper blue horizontal line indicates the upper threshold.

Fig. 2 LPC residual error algorithm (clean isolated digit zero)
Fig. 3 LPC residual error algorithm (digit zero with white noise)
Fig. 4 LPC-PEV algorithm (clean isolated digit zero)
Fig. 5 LPC-PEV algorithm (digit zero with white noise)

From the above figures we can compare end point detection based on the LPC residual error and on LPC-PEV, for the clean isolated digit and in the presence of white noise. It can be seen that the boundaries of the word segment detected using LPC-PEV indicate very successful end point detection for both clean and noisy speech, as compared to the other technique. Table 1 to Table 10 provide the comparative analysis of the word boundary decisions for digit 0 to digit 9 respectively. To develop this comparison, we used as reference the start and end boundary points where the speech signal actually begins and ends. The tables list the numerical values of the start and end points of the isolated digits (0-9) in samples. The boundary points in the presence of background noises such as babble, car, white and pink are also reported.

Table 1 Boundary decision for digit zero
Table 2 Boundary decision for digit one
Table 3 Boundary decision for digit two

Table 4 Boundary decision for digit three
Table 5 Boundary decision for digit four
Table 6 Boundary decision for digit five
Table 7 Boundary decision for digit six
Table 8 Boundary decision for digit seven

Table 9 Boundary decision for digit eight
Table 10 Boundary decision for digit nine

The comparison has been made on the basis of the numeric values of the starting and ending points in samples. The starting point should be as close as possible to the first sample of the spoken word, and the end point should lie beyond its last sample, in order to avoid speech loss and to detect the word accurately. Our experimental results show that the detected starting and ending points are, respectively, closest to and away from the spoken word. From the comparative analysis of the boundary decisions for the isolated digits, it is evident that the boundary detection rate of LPC-PEV is far better than that of the LPC residual error for both clean and noisy speech. The study presented in this paper is based on preconditioned speech and noise databases. This is a limitation of the LPC-PEV based VAD when speech and noise are encountered in a real-world environment.

VI. CONCLUSION
In this paper, we presented an LPC-PEV based VAD algorithm for word boundary detection in the presence of noise. LPC coefficients estimated by the autocorrelation method contain a certain amount of error when compared to the actual values. This error is called the prediction error. Exploiting this error, we proposed a statistical approach that monitors the variance of the prediction error. The proposed LPC-PEV measures the predictive capability of the LP model and the precision of its predictions, which prevents the overestimation and overemphasis of speech spectral power for high and medium pitch voiced speech. Experiments have been performed with the TIDIGITS speech corpus and the NOISEX-92 noise-in-speech database. The performance of the system is evaluated using the LPC-PEV and LPC residual error based VAD algorithms.
From the numeric values of the starting and ending points in samples, the experimental results indicate that the boundary detection rate of the proposed LPC-PEV is far better than that of the LPC residual error for both clean and noisy speech. As future research, it will also be interesting to study the LPC-PEV based VAD as a means of improving the robustness of the speech recognition system, and to combine this feature with a margin based learning algorithm to improve the generalization capability of the acoustic model.

REFERENCES
[1] Lingyun Gu, Stephen A. Zahorian, "A New Robust Algorithm for Isolated Word detection," Proc. ICASSP,
[2] M. Karnjanadecha, Stephen A. Zahorian, "Signal Modeling for Isolated Word Recognition," Proc. ICASSP, pp. , 1999.
[3] J. Makhoul, "Linear prediction: A tutorial review," Proc. of the IEEE, vol. 63, no. 4, pp. ,
[4] S.M. Kay, Modern Spectral Estimation: Theory and Application. Englewood Cliffs, NJ: Prentice Hall,
[5] L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals. Englewood Cliffs, NJ: Prentice-Hall,
[6] J. Schur, "Über Potenzreihen, die im Innern des Einheitskreises beschränkt sind," J. für die reine und angewandte Mathematik, vol. 147, pp. ,
[7] K.K. Paliwal, W.B. Kleijn, "Quantization of LPC parameters."
[8] B.S. Atal, "Linear prediction of speech - Recent advances with applications to speech analysis," in Speech Recognition, D.R. Reddy, Ed. New York: Academic Press, pp. ,
[9] Manohar N. Murthi and B. D. Rao, "All-Pole Modeling of Speech Based on the Minimum Variance Distortionless Response Spectrum," IEEE Transactions on Speech and Audio Processing, vol. 8, no. 3, May
[10] S.L. Marple, Digital Spectral Analysis with Applications. Englewood Cliffs, NJ: Prentice Hall,
[11] L. Karray, A. Martin, "Toward improving speech detection robustness for speech recognition in adverse environments," Speech Communication, no. 3, pp.
[12]
[13] Spib.rice.edu/spib/select_noise.
[14] A.P. Varga, H.J.M. Steeneken, M. Tomlinson, D. Jones, "The NOISEX-92 Study on the Effect of Additive Noise on Automatic Speech Recognition," Technical Report, DRA Speech Research Unit, 1992.


More information

Performance Analysiss of Speech Enhancement Algorithm for Robust Speech Recognition System

Performance Analysiss of Speech Enhancement Algorithm for Robust Speech Recognition System Performance Analysiss of Speech Enhancement Algorithm for Robust Speech Recognition System C.GANESH BABU 1, Dr.P..T.VANATHI 2 R.RAMACHANDRAN 3, M.SENTHIL RAJAA 3, R.VENGATESH 3 1 Research Scholar (PSGCT)

More information

REAL-TIME BROADBAND NOISE REDUCTION

REAL-TIME BROADBAND NOISE REDUCTION REAL-TIME BROADBAND NOISE REDUCTION Robert Hoeldrich and Markus Lorber Institute of Electronic Music Graz Jakoministrasse 3-5, A-8010 Graz, Austria email: robert.hoeldrich@mhsg.ac.at Abstract A real-time

More information

Epoch Extraction From Emotional Speech

Epoch Extraction From Emotional Speech Epoch Extraction From al Speech D Govind and S R M Prasanna Department of Electronics and Electrical Engineering Indian Institute of Technology Guwahati Email:{dgovind,prasanna}@iitg.ernet.in Abstract

More information

Audio Signal Compression using DCT and LPC Techniques

Audio Signal Compression using DCT and LPC Techniques Audio Signal Compression using DCT and LPC Techniques P. Sandhya Rani#1, D.Nanaji#2, V.Ramesh#3,K.V.S. Kiran#4 #Student, Department of ECE, Lendi Institute Of Engineering And Technology, Vizianagaram,

More information

Fundamentals of Time- and Frequency-Domain Analysis of Signal-Averaged Electrocardiograms R. Martin Arthur, PhD

Fundamentals of Time- and Frequency-Domain Analysis of Signal-Averaged Electrocardiograms R. Martin Arthur, PhD CORONARY ARTERY DISEASE, 2(1):13-17, 1991 1 Fundamentals of Time- and Frequency-Domain Analysis of Signal-Averaged Electrocardiograms R. Martin Arthur, PhD Keywords digital filters, Fourier transform,

More information

Vocal Command Recognition Using Parallel Processing of Multiple Confidence-Weighted Algorithms in an FPGA

Vocal Command Recognition Using Parallel Processing of Multiple Confidence-Weighted Algorithms in an FPGA Vocal Command Recognition Using Parallel Processing of Multiple Confidence-Weighted Algorithms in an FPGA ECE-492/3 Senior Design Project Spring 2015 Electrical and Computer Engineering Department Volgenau

More information

Voiced/nonvoiced detection based on robustness of voiced epochs

Voiced/nonvoiced detection based on robustness of voiced epochs Voiced/nonvoiced detection based on robustness of voiced epochs by N. Dhananjaya, B.Yegnanarayana in IEEE Signal Processing Letters, 17, 3 : 273-276 Report No: IIIT/TR/2010/50 Centre for Language Technologies

More information

Quantification of glottal and voiced speech harmonicsto-noise ratios using cepstral-based estimation

Quantification of glottal and voiced speech harmonicsto-noise ratios using cepstral-based estimation Quantification of glottal and voiced speech harmonicsto-noise ratios using cepstral-based estimation Peter J. Murphy and Olatunji O. Akande, Department of Electronic and Computer Engineering University

More information

NOISE ESTIMATION IN A SINGLE CHANNEL

NOISE ESTIMATION IN A SINGLE CHANNEL SPEECH ENHANCEMENT FOR CROSS-TALK INTERFERENCE by Levent M. Arslan and John H.L. Hansen Robust Speech Processing Laboratory Department of Electrical Engineering Box 99 Duke University Durham, North Carolina

More information

Voice Activity Detection

Voice Activity Detection Voice Activity Detection Speech Processing Tom Bäckström Aalto University October 2015 Introduction Voice activity detection (VAD) (or speech activity detection, or speech detection) refers to a class

More information

ROBUST PITCH TRACKING USING LINEAR REGRESSION OF THE PHASE

ROBUST PITCH TRACKING USING LINEAR REGRESSION OF THE PHASE - @ Ramon E Prieto et al Robust Pitch Tracking ROUST PITCH TRACKIN USIN LINEAR RERESSION OF THE PHASE Ramon E Prieto, Sora Kim 2 Electrical Engineering Department, Stanford University, rprieto@stanfordedu

More information

RASTA-PLP SPEECH ANALYSIS. Aruna Bayya. Phil Kohn y TR December 1991

RASTA-PLP SPEECH ANALYSIS. Aruna Bayya. Phil Kohn y TR December 1991 RASTA-PLP SPEECH ANALYSIS Hynek Hermansky Nelson Morgan y Aruna Bayya Phil Kohn y TR-91-069 December 1991 Abstract Most speech parameter estimation techniques are easily inuenced by the frequency response

More information

NOTICE WARNING CONCERNING COPYRIGHT RESTRICTIONS: The copyright law of the United States (title 17, U.S. Code) governs the making of photocopies or

NOTICE WARNING CONCERNING COPYRIGHT RESTRICTIONS: The copyright law of the United States (title 17, U.S. Code) governs the making of photocopies or NOTICE WARNING CONCERNING COPYRIGHT RESTRICTIONS: The copyright law of the United States (title 17, U.S. Code) governs the making of photocopies or other reproductions of copyrighted material. Any copying

More information

Chapter 4 SPEECH ENHANCEMENT

Chapter 4 SPEECH ENHANCEMENT 44 Chapter 4 SPEECH ENHANCEMENT 4.1 INTRODUCTION: Enhancement is defined as improvement in the value or Quality of something. Speech enhancement is defined as the improvement in intelligibility and/or

More information

Enhancement of Speech Signal Based on Improved Minima Controlled Recursive Averaging and Independent Component Analysis

Enhancement of Speech Signal Based on Improved Minima Controlled Recursive Averaging and Independent Component Analysis Enhancement of Speech Signal Based on Improved Minima Controlled Recursive Averaging and Independent Component Analysis Mohini Avatade & S.L. Sahare Electronics & Telecommunication Department, Cummins

More information

Using RASTA in task independent TANDEM feature extraction

Using RASTA in task independent TANDEM feature extraction R E S E A R C H R E P O R T I D I A P Using RASTA in task independent TANDEM feature extraction Guillermo Aradilla a John Dines a Sunil Sivadas a b IDIAP RR 04-22 April 2004 D a l l e M o l l e I n s t

More information

AN ANALYSIS OF SPEECH RECOGNITION PERFORMANCE BASED UPON NETWORK LAYERS AND TRANSFER FUNCTIONS

AN ANALYSIS OF SPEECH RECOGNITION PERFORMANCE BASED UPON NETWORK LAYERS AND TRANSFER FUNCTIONS AN ANALYSIS OF SPEECH RECOGNITION PERFORMANCE BASED UPON NETWORK LAYERS AND TRANSFER FUNCTIONS Kuldeep Kumar 1, R. K. Aggarwal 1 and Ankita Jain 2 1 Department of Computer Engineering, National Institute

More information

Spectral analysis of seismic signals using Burg algorithm V. Ravi Teja 1, U. Rakesh 2, S. Koteswara Rao 3, V. Lakshmi Bharathi 4

Spectral analysis of seismic signals using Burg algorithm V. Ravi Teja 1, U. Rakesh 2, S. Koteswara Rao 3, V. Lakshmi Bharathi 4 Volume 114 No. 1 217, 163-171 ISSN: 1311-88 (printed version); ISSN: 1314-3395 (on-line version) url: http://www.ijpam.eu ijpam.eu Spectral analysis of seismic signals using Burg algorithm V. avi Teja

More information

CHAPTER 4 VOICE ACTIVITY DETECTION ALGORITHMS

CHAPTER 4 VOICE ACTIVITY DETECTION ALGORITHMS 66 CHAPTER 4 VOICE ACTIVITY DETECTION ALGORITHMS 4.1 INTRODUCTION New frontiers of speech technology are demanding increased levels of performance in many areas. In the advent of Wireless Communications

More information

Sound Synthesis Methods

Sound Synthesis Methods Sound Synthesis Methods Matti Vihola, mvihola@cs.tut.fi 23rd August 2001 1 Objectives The objective of sound synthesis is to create sounds that are Musically interesting Preferably realistic (sounds like

More information

Perceptually Motivated Linear Prediction Cepstral Features for Network Speech Recognition

Perceptually Motivated Linear Prediction Cepstral Features for Network Speech Recognition Perceptually Motivated Linear Prediction Cepstral Features for Network Speech Recognition Aadel Alatwi, Stephen So, Kuldip K. Paliwal Signal Processing Laboratory Griffith University, Brisbane, QLD, 4111,

More information

Classification of ships using autocorrelation technique for feature extraction of the underwater acoustic noise

Classification of ships using autocorrelation technique for feature extraction of the underwater acoustic noise Classification of ships using autocorrelation technique for feature extraction of the underwater acoustic noise Noha KORANY 1 Alexandria University, Egypt ABSTRACT The paper applies spectral analysis to

More information

Speech/Music Change Point Detection using Sonogram and AANN

Speech/Music Change Point Detection using Sonogram and AANN International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 6, Number 1 (2016), pp. 45-49 International Research Publications House http://www. irphouse.com Speech/Music Change

More information

Pitch Period of Speech Signals Preface, Determination and Transformation

Pitch Period of Speech Signals Preface, Determination and Transformation Pitch Period of Speech Signals Preface, Determination and Transformation Mohammad Hossein Saeidinezhad 1, Bahareh Karamsichani 2, Ehsan Movahedi 3 1 Islamic Azad university, Najafabad Branch, Saidinezhad@yahoo.com

More information

A Survey and Evaluation of Voice Activity Detection Algorithms

A Survey and Evaluation of Voice Activity Detection Algorithms A Survey and Evaluation of Voice Activity Detection Algorithms Seshashyama Sameeraj Meduri (ssme09@student.bth.se, 861003-7577) Rufus Ananth (anru09@student.bth.se, 861129-5018) Examiner: Dr. Sven Johansson

More information

COMPRESSIVE SAMPLING OF SPEECH SIGNALS. Mona Hussein Ramadan. BS, Sebha University, Submitted to the Graduate Faculty of

COMPRESSIVE SAMPLING OF SPEECH SIGNALS. Mona Hussein Ramadan. BS, Sebha University, Submitted to the Graduate Faculty of COMPRESSIVE SAMPLING OF SPEECH SIGNALS by Mona Hussein Ramadan BS, Sebha University, 25 Submitted to the Graduate Faculty of Swanson School of Engineering in partial fulfillment of the requirements for

More information

Speech Enhancement Based On Noise Reduction

Speech Enhancement Based On Noise Reduction Speech Enhancement Based On Noise Reduction Kundan Kumar Singh Electrical Engineering Department University Of Rochester ksingh11@z.rochester.edu ABSTRACT This paper addresses the problem of signal distortion

More information

Epoch Extraction From Speech Signals K. Sri Rama Murty and B. Yegnanarayana, Senior Member, IEEE

Epoch Extraction From Speech Signals K. Sri Rama Murty and B. Yegnanarayana, Senior Member, IEEE 1602 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 16, NO. 8, NOVEMBER 2008 Epoch Extraction From Speech Signals K. Sri Rama Murty and B. Yegnanarayana, Senior Member, IEEE Abstract

More information

Signal Processing Toolbox

Signal Processing Toolbox Signal Processing Toolbox Perform signal processing, analysis, and algorithm development Signal Processing Toolbox provides industry-standard algorithms for analog and digital signal processing (DSP).

More information

Narrow-Band Interference Rejection in DS/CDMA Systems Using Adaptive (QRD-LSL)-Based Nonlinear ACM Interpolators

Narrow-Band Interference Rejection in DS/CDMA Systems Using Adaptive (QRD-LSL)-Based Nonlinear ACM Interpolators 374 IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. 52, NO. 2, MARCH 2003 Narrow-Band Interference Rejection in DS/CDMA Systems Using Adaptive (QRD-LSL)-Based Nonlinear ACM Interpolators Jenq-Tay Yuan

More information

Linguistic Phonetics. Spectral Analysis

Linguistic Phonetics. Spectral Analysis 24.963 Linguistic Phonetics Spectral Analysis 4 4 Frequency (Hz) 1 Reading for next week: Liljencrants & Lindblom 1972. Assignment: Lip-rounding assignment, due 1/15. 2 Spectral analysis techniques There

More information

RECENTLY, there has been an increasing interest in noisy

RECENTLY, there has been an increasing interest in noisy IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 52, NO. 9, SEPTEMBER 2005 535 Warped Discrete Cosine Transform-Based Noisy Speech Enhancement Joon-Hyuk Chang, Member, IEEE Abstract In

More information

Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter

Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter Ching-Ta Lu, Kun-Fu Tseng 2, Chih-Tsung Chen 2 Department of Information Communication, Asia University, Taichung, Taiwan, ROC

More information

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING. Department of Signal Theory and Communications. c/ Gran Capitán s/n, Campus Nord, Edificio D5

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING. Department of Signal Theory and Communications. c/ Gran Capitán s/n, Campus Nord, Edificio D5 IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING Javier Hernando Department of Signal Theory and Communications Polytechnical University of Catalonia c/ Gran Capitán s/n, Campus Nord, Edificio D5 08034

More information

SOUND SOURCE RECOGNITION AND MODELING

SOUND SOURCE RECOGNITION AND MODELING SOUND SOURCE RECOGNITION AND MODELING CASA seminar, summer 2000 Antti Eronen antti.eronen@tut.fi Contents: Basics of human sound source recognition Timbre Voice recognition Recognition of environmental

More information

SIMULATION VOICE RECOGNITION SYSTEM FOR CONTROLING ROBOTIC APPLICATIONS

SIMULATION VOICE RECOGNITION SYSTEM FOR CONTROLING ROBOTIC APPLICATIONS SIMULATION VOICE RECOGNITION SYSTEM FOR CONTROLING ROBOTIC APPLICATIONS 1 WAHYU KUSUMA R., 2 PRINCE BRAVE GUHYAPATI V 1 Computer Laboratory Staff., Department of Information Systems, Gunadarma University,

More information

Robust Low-Resource Sound Localization in Correlated Noise

Robust Low-Resource Sound Localization in Correlated Noise INTERSPEECH 2014 Robust Low-Resource Sound Localization in Correlated Noise Lorin Netsch, Jacek Stachurski Texas Instruments, Inc. netsch@ti.com, jacek@ti.com Abstract In this paper we address the problem

More information

A Comparative Study of Formant Frequencies Estimation Techniques

A Comparative Study of Formant Frequencies Estimation Techniques A Comparative Study of Formant Frequencies Estimation Techniques DORRA GARGOURI, Med ALI KAMMOUN and AHMED BEN HAMIDA Unité de traitement de l information et électronique médicale, ENIS University of Sfax

More information

651 Analysis of LSF frame selection in voice conversion

651 Analysis of LSF frame selection in voice conversion 651 Analysis of LSF frame selection in voice conversion Elina Helander 1, Jani Nurminen 2, Moncef Gabbouj 1 1 Institute of Signal Processing, Tampere University of Technology, Finland 2 Noia Technology

More information

ARTIFICIAL BANDWIDTH EXTENSION OF NARROW-BAND SPEECH SIGNALS VIA HIGH-BAND ENERGY ESTIMATION

ARTIFICIAL BANDWIDTH EXTENSION OF NARROW-BAND SPEECH SIGNALS VIA HIGH-BAND ENERGY ESTIMATION ARTIFICIAL BANDWIDTH EXTENSION OF NARROW-BAND SPEECH SIGNALS VIA HIGH-BAND ENERGY ESTIMATION Tenkasi Ramabadran and Mark Jasiuk Motorola Labs, Motorola Inc., 1301 East Algonquin Road, Schaumburg, IL 60196,

More information

ON-LINE LABORATORIES FOR SPEECH AND IMAGE PROCESSING AND FOR COMMUNICATION SYSTEMS USING J-DSP

ON-LINE LABORATORIES FOR SPEECH AND IMAGE PROCESSING AND FOR COMMUNICATION SYSTEMS USING J-DSP ON-LINE LABORATORIES FOR SPEECH AND IMAGE PROCESSING AND FOR COMMUNICATION SYSTEMS USING J-DSP A. Spanias, V. Atti, Y. Ko, T. Thrasyvoulou, M.Yasin, M. Zaman, T. Duman, L. Karam, A. Papandreou, K. Tsakalis

More information

Advanced audio analysis. Martin Gasser

Advanced audio analysis. Martin Gasser Advanced audio analysis Martin Gasser Motivation Which methods are common in MIR research? How can we parameterize audio signals? Interesting dimensions of audio: Spectral/ time/melody structure, high

More information

Pattern Recognition. Part 6: Bandwidth Extension. Gerhard Schmidt

Pattern Recognition. Part 6: Bandwidth Extension. Gerhard Schmidt Pattern Recognition Part 6: Gerhard Schmidt Christian-Albrechts-Universität zu Kiel Faculty of Engineering Institute of Electrical and Information Engineering Digital Signal Processing and System Theory

More information

TRANSIENT NOISE REDUCTION BASED ON SPEECH RECONSTRUCTION

TRANSIENT NOISE REDUCTION BASED ON SPEECH RECONSTRUCTION TRANSIENT NOISE REDUCTION BASED ON SPEECH RECONSTRUCTION Jian Li 1,2, Shiwei Wang 1,2, Renhua Peng 1,2, Chengshi Zheng 1,2, Xiaodong Li 1,2 1. Communication Acoustics Laboratory, Institute of Acoustics,

More information

Auditory modelling for speech processing in the perceptual domain

Auditory modelling for speech processing in the perceptual domain ANZIAM J. 45 (E) ppc964 C980, 2004 C964 Auditory modelling for speech processing in the perceptual domain L. Lin E. Ambikairajah W. H. Holmes (Received 8 August 2003; revised 28 January 2004) Abstract

More information

Audio processing methods on marine mammal vocalizations

Audio processing methods on marine mammal vocalizations Audio processing methods on marine mammal vocalizations Xanadu Halkias Laboratory for the Recognition and Organization of Speech and Audio http://labrosa.ee.columbia.edu Sound to Signal sound is pressure

More information

Speech Synthesis; Pitch Detection and Vocoders

Speech Synthesis; Pitch Detection and Vocoders Speech Synthesis; Pitch Detection and Vocoders Tai-Shih Chi ( 冀泰石 ) Department of Communication Engineering National Chiao Tung University May. 29, 2008 Speech Synthesis Basic components of the text-to-speech

More information

Comparison of CELP speech coder with a wavelet method

Comparison of CELP speech coder with a wavelet method University of Kentucky UKnowledge University of Kentucky Master's Theses Graduate School 2006 Comparison of CELP speech coder with a wavelet method Sriram Nagaswamy University of Kentucky, sriramn@gmail.com

More information

Level I Signal Modeling and Adaptive Spectral Analysis

Level I Signal Modeling and Adaptive Spectral Analysis Level I Signal Modeling and Adaptive Spectral Analysis 1 Learning Objectives Students will learn about autoregressive signal modeling as a means to represent a stochastic signal. This differs from using

More information

Fundamental frequency estimation of speech signals using MUSIC algorithm

Fundamental frequency estimation of speech signals using MUSIC algorithm Acoust. Sci. & Tech. 22, 4 (2) TECHNICAL REPORT Fundamental frequency estimation of speech signals using MUSIC algorithm Takahiro Murakami and Yoshihisa Ishida School of Science and Technology, Meiji University,,

More information

EE 422G - Signals and Systems Laboratory

EE 422G - Signals and Systems Laboratory EE 422G - Signals and Systems Laboratory Lab 3 FIR Filters Written by Kevin D. Donohue Department of Electrical and Computer Engineering University of Kentucky Lexington, KY 40506 September 19, 2015 Objectives:

More information

KONKANI SPEECH RECOGNITION USING HILBERT-HUANG TRANSFORM

KONKANI SPEECH RECOGNITION USING HILBERT-HUANG TRANSFORM KONKANI SPEECH RECOGNITION USING HILBERT-HUANG TRANSFORM Shruthi S Prabhu 1, Nayana C G 2, Ashwini B N 3, Dr. Parameshachari B D 4 Assistant Professor, Department of Telecommunication Engineering, GSSSIETW,

More information

HUMAN speech is frequently encountered in several

HUMAN speech is frequently encountered in several 1948 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 20, NO. 7, SEPTEMBER 2012 Enhancement of Single-Channel Periodic Signals in the Time-Domain Jesper Rindom Jensen, Student Member,

More information

Speech Enhancement Using a Mixture-Maximum Model

Speech Enhancement Using a Mixture-Maximum Model IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 10, NO. 6, SEPTEMBER 2002 341 Speech Enhancement Using a Mixture-Maximum Model David Burshtein, Senior Member, IEEE, and Sharon Gannot, Member, IEEE

More information

MMSE STSA Based Techniques for Single channel Speech Enhancement Application Simit Shah 1, Roma Patel 2

MMSE STSA Based Techniques for Single channel Speech Enhancement Application Simit Shah 1, Roma Patel 2 MMSE STSA Based Techniques for Single channel Speech Enhancement Application Simit Shah 1, Roma Patel 2 1 Electronics and Communication Department, Parul institute of engineering and technology, Vadodara,

More information

Biosignal filtering and artifact rejection. Biosignal processing, S Autumn 2012

Biosignal filtering and artifact rejection. Biosignal processing, S Autumn 2012 Biosignal filtering and artifact rejection Biosignal processing, 521273S Autumn 2012 Motivation 1) Artifact removal: for example power line non-stationarity due to baseline variation muscle or eye movement

More information

Reading: Johnson Ch , Ch.5.5 (today); Liljencrants & Lindblom; Stevens (Tues) reminder: no class on Thursday.

Reading: Johnson Ch , Ch.5.5 (today); Liljencrants & Lindblom; Stevens (Tues) reminder: no class on Thursday. L105/205 Phonetics Scarborough Handout 7 10/18/05 Reading: Johnson Ch.2.3.3-2.3.6, Ch.5.5 (today); Liljencrants & Lindblom; Stevens (Tues) reminder: no class on Thursday Spectral Analysis 1. There are

More information

SPEECH ENHANCEMENT USING PITCH DETECTION APPROACH FOR NOISY ENVIRONMENT

SPEECH ENHANCEMENT USING PITCH DETECTION APPROACH FOR NOISY ENVIRONMENT SPEECH ENHANCEMENT USING PITCH DETECTION APPROACH FOR NOISY ENVIRONMENT RASHMI MAKHIJANI Department of CSE, G. H. R.C.E., Near CRPF Campus,Hingna Road, Nagpur, Maharashtra, India rashmi.makhijani2002@gmail.com

More information

I D I A P R E S E A R C H R E P O R T. June published in Interspeech 2008

I D I A P R E S E A R C H R E P O R T. June published in Interspeech 2008 R E S E A R C H R E P O R T I D I A P Spectral Noise Shaping: Improvements in Speech/Audio Codec Based on Linear Prediction in Spectral Domain Sriram Ganapathy a b Petr Motlicek a Hynek Hermansky a b Harinath

More information

Single Channel Speaker Segregation using Sinusoidal Residual Modeling

Single Channel Speaker Segregation using Sinusoidal Residual Modeling NCC 2009, January 16-18, IIT Guwahati 294 Single Channel Speaker Segregation using Sinusoidal Residual Modeling Rajesh M Hegde and A. Srinivas Dept. of Electrical Engineering Indian Institute of Technology

More information

WARPED FILTER DESIGN FOR THE BODY MODELING AND SOUND SYNTHESIS OF STRING INSTRUMENTS

WARPED FILTER DESIGN FOR THE BODY MODELING AND SOUND SYNTHESIS OF STRING INSTRUMENTS NORDIC ACOUSTICAL MEETING 12-14 JUNE 1996 HELSINKI WARPED FILTER DESIGN FOR THE BODY MODELING AND SOUND SYNTHESIS OF STRING INSTRUMENTS Helsinki University of Technology Laboratory of Acoustics and Audio

More information

An Improved Voice Activity Detection Based on Deep Belief Networks

An Improved Voice Activity Detection Based on Deep Belief Networks e-issn 2455 1392 Volume 2 Issue 4, April 2016 pp. 676-683 Scientific Journal Impact Factor : 3.468 http://www.ijcter.com An Improved Voice Activity Detection Based on Deep Belief Networks Shabeeba T. K.

More information

CG401 Advanced Signal Processing. Dr Stuart Lawson Room A330 Tel: January 2003

CG401 Advanced Signal Processing. Dr Stuart Lawson Room A330 Tel: January 2003 CG40 Advanced Dr Stuart Lawson Room A330 Tel: 23780 e-mail: ssl@eng.warwick.ac.uk 03 January 2003 Lecture : Overview INTRODUCTION What is a signal? An information-bearing quantity. Examples of -D and 2-D

More information

A Spectral Conversion Approach to Single- Channel Speech Enhancement

A Spectral Conversion Approach to Single- Channel Speech Enhancement University of Pennsylvania ScholarlyCommons Departmental Papers (ESE) Department of Electrical & Systems Engineering May 2007 A Spectral Conversion Approach to Single- Channel Speech Enhancement Athanasios

More information

Drum Transcription Based on Independent Subspace Analysis

Drum Transcription Based on Independent Subspace Analysis Report for EE 391 Special Studies and Reports for Electrical Engineering Drum Transcription Based on Independent Subspace Analysis Yinyi Guo Center for Computer Research in Music and Acoustics, Stanford,

More information