ROBUST PITCH TRACKING USING LINEAR REGRESSION OF THE PHASE

Ramon E. Prieto (1), Sora Kim (2)
(1) Electrical Engineering Department, Stanford University, rprieto@stanford.edu
(2) Electrical Engineering Department, Stanford University, kimsora@stanford.edu

ABSTRACT: A new robust algorithm for estimating the pitch period of a speech signal is depicted. The algorithm puts emphasis on the frequency components where noise and spectral leakage have less impact on the signal. At the same time, it uses smaller analysis windows to improve time resolution and avoid jitter and pitch doubling effects. As a result, experiments show lower fine pitch errors as well as better voiced/unvoiced segment detection.

INTRODUCTION

The robust estimation of the pitch period plays an important role in speech processing applications, and many methods to extract the pitch have been proposed. Some of the most widely accepted methods are the cepstrum method (1) and the SIFT method. In the first method, jitter and pitch doubling are shown to be a problem; the accuracy of the second depends on how stationary the speech signal is within the analysis interval. Other methods, like the autocorrelation function method and the average magnitude difference function method, keep the formant structure of the signal, making the pitch estimation hard when there are high energy, high frequency harmonics in the signal.

The objective of this paper is to develop a pitch tracking algorithm that overcomes the problems of the algorithms described above: on one hand, avoid the effect of high energy, high frequency harmonics; on the other hand, avoid the use of long analysis windows in order to avoid pitch doubling and jitter effects. These objectives would directly result in lower gross error counts and lower fine pitch errors (2). For this, we use a time delay estimation technique, described in section 2, for which we provide a new theoretical framework that explains why the method should work when spectral leakage is an issue. In sections 3 and 4 we add robustness to the method by improving the phase unwrapping process and by weighting the contribution of the phase of the different frequency bins to the pitch estimation. In section 5 we evaluate the performance of the method on both clean and noisy speech, and compare it with the cepstrum method (1) and the autocorrelation method.

LINEAR REGRESSION OF THE PHASE

Let us assume two different frames of a sampled voiced speech signal,

    x_1(n) = s(n T_s + t_0),    x_2(n) = s(n T_s + t_0 + D),    n = 0, ..., N-1,    (1), (2)

where s(t) is periodic with Fourier coefficients a_k, T is the pitch period in units of time (T is assumed to be a multiple of the sampling period T_s), and t_0 represents, in units of time, how far the beginning of frame x_1 is from the start of the pitch period nearest to it; t_0 is assumed to range from 0 to T. A very similar problem is stated in (6), where the aim is to find the time delay D between x_1 and x_2. The objective of this work is to find T, given that we know D, the time delay between x_1 and x_2.

Pitch Synchronous Frames

Assuming that both frames are pitch synchronous (i.e. the frame length N T_s is a multiple of T), then

    X_2(q) = X_1(q) e^{j 2\pi q D / (N T_s)},    (3)

where X_1(q) and X_2(q) are the DFT coefficients of x_1 and x_2 respectively. We can see that the unwrapped phase in formula (3) is bilinear in q and D. Then, if we know x_1 and x_2, we can do a linear regression of that unwrapped phase versus q; the result gives us the slope, which gives us the value of D. Although this is an unrealistic case (T is known before calculating formula (3)), it illustrates the basics of our pitch tracking method.
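As an illustration of the pitch synchronous case, the short Python sketch below (ours, not part of the paper; the function name, the bin-selection threshold and the toy signal are illustrative) estimates the delay between two frames from the slope of the unwrapped cross-spectrum phase, which is the relation in formula (3):

```python
import numpy as np

def delay_from_phase_slope(x1, x2, fs, rel_threshold=1e-3):
    """Estimate the time delay between two equal-length frames from the slope of
    the unwrapped cross-spectrum phase.  Minimal sketch of the pitch synchronous
    case only: the frames are assumed to span an integer number of pitch periods,
    and the delay is assumed small enough for basic unwrapping to succeed."""
    N = len(x1)
    X1, X2 = np.fft.rfft(x1), np.fft.rfft(x2)
    cross = X2 * np.conj(X1)                    # angle(cross[q]) = 2*pi*q*d/N
    q = np.arange(len(cross))
    keep = np.abs(cross) > rel_threshold * np.abs(cross).max()  # ignore empty bins
    phase = np.unwrap(np.angle(cross[keep]))
    slope = np.polyfit(q[keep], phase, 1)[0]    # radians per frequency bin
    return slope * N / (2 * np.pi * fs)         # delay in seconds

# Toy usage: a 125 Hz signal (64-sample period at fs = 8 kHz), frames of 512
# samples (8 full periods, hence pitch synchronous), second frame 10 samples later.
fs, f0, N = 8000, 125.0, 512
t = np.arange(4 * N) / fs
s = np.sin(2 * np.pi * f0 * t) + 0.3 * np.sin(2 * np.pi * 3 * f0 * t)
x1, x2 = s[:N], s[10:10 + N]
print(delay_from_phase_slope(x1, x2, fs))       # ~10/8000 s = 1.25 ms
```

In a real pitch tracker the frame positions have to be iterated so that the frames become (approximately) pitch synchronous, which is what the rest of the paper develops.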

Non Pitch Synchronous Frames

In our real problem we do not know T, and we wish to calculate it. In the present section we will see that applying a linear regression of the phase when the frames are not pitch synchronous still gives us the period, or a good estimate of it. We will also suggest the modifications that have to be applied to the basic method in order to make it more reliable.

We wish to calculate T by applying a linear regression to the unwrapped phase of

    X_2(q) X_1^*(q),    (4)

where q is the frequency bin. Formula (3) does not apply in this case; analyzing the DFTs of x_1 and x_2 now implies a spectral leakage effect. Given formulas (1) and (2), the DFTs of x_1 and x_2 take the form

    X_i(q) = \sum_k a_k \, W(q - N k T_s / T) \, e^{j \phi_{i,k}},    i = 1, 2,    (5), (6)

where W(.) is the DFT of the analysis window and \phi_{i,k} collects the phase terms that depend on t_0 and D.

Figure 1a shows the magnitude of the contribution of harmonic k to X(q); it is centered at N k T_s / T, and the location of the frequency bins q relative to it depends on the values of N T_s and T. The ideal case of no interference of harmonic k in the frequency bins far away from N k T_s / T happens when N T_s is a multiple of T (the pitch synchronous case). When the distance between a frequency bin and N k T_s / T is very small, the magnitude will be close to N for that bin and close to zero for the others. Figure 1b shows the phase of that contribution: making the same analysis as in Figure 1a, the phase added to the contribution of harmonic k at bin q will be close to zero for the frequency bins close to N k T_s / T, but it can be very high for bins that fall between harmonics. If the phase distortion of a harmonic in our non pitch synchronous analysis can be that high, what guarantees that the phase regression will work for non pitch synchronous frames? The answer is given by Phase Interpolation and Subtraction of Phases, described next.
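The leakage behaviour described around Figure 1 can be reproduced numerically. The sketch below (ours; the function name and the choice of N are illustrative) evaluates the contribution of a single complex harmonic to the DFT bins of an N-point rectangular window, i.e. a Dirichlet-like kernel centred at N k T_s / T, and shows that the leakage into neighbouring bins, together with its phase, vanishes only in the pitch synchronous case:

```python
import numpy as np

def harmonic_leakage(bin_center, N=512):
    """Contribution of a single complex harmonic located at the (generally
    non-integer) position `bin_center` = N*k*Ts/T to the first N/2 DFT bins of
    an N-point rectangular window: sum_{n=0}^{N-1} exp(j*2*pi*(c - q)*n/N)."""
    q = np.arange(N // 2)
    x = bin_center - q
    num = 1 - np.exp(2j * np.pi * x)
    den = 1 - np.exp(2j * np.pi * x / N)
    with np.errstate(invalid="ignore", divide="ignore"):
        kernel = np.where(np.isclose(den, 0), N, num / den)
    return q, np.abs(kernel), np.angle(kernel)

# Pitch synchronous case: the harmonic sits exactly on bin 40, so every other
# bin receives zero energy (and therefore no phase distortion).
q, mag, ph = harmonic_leakage(40.0)
print(np.round(mag[39:42], 3))                  # ~[0, 512, 0]

# Non pitch synchronous case: a harmonic at "bin" 40.3 leaks into neighbouring
# bins, and the leaked components carry phases that differ from the harmonic's
# own phase -- the distortion discussed in the text.
q, mag, ph = harmonic_leakage(40.3)
print(np.round(mag[38:44], 1))
print(np.round(ph[38:44], 2))
```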

Figure 1: (a) Magnitude of the contribution of harmonic k, centered at N k T_s / T; the circles show the magnitude at the frequency bins. (b) Phase of the frequency bins for the same case. (c) Magnitudes of two neighboring harmonics (dotted and dashed) and the DFT of their sum (solid, upscaled by 2), versus frequency bin q.

Figure 2: Comparison of the three unwrapping methods: phase of X(q) using (a) no unwrapping, (b) BU unwrapping, (c) and (e) SFU unwrapping, (d) and (f) LRSFU unwrapping, with two different parameter settings, versus frequency.

Phase Interpolation

Imagine we have two harmonics, k and k+1, and imagine that those two harmonics conflict in a frequency bin q lying between them. Given the characteristics of the side lobes of the window kernel in Figure 1a, the contribution of any other harmonic is assumed to be small. As such, from formulas (5) and (6), X(q) is the sum of the two neighboring harmonic contributions, a_k W(q - N k T_s / T) and a_{k+1} W(q - N (k+1) T_s / T), each with its own phase (formulas (7)-(9)). We can see from formula (7) that, if the phases of the two contributions are properly unwrapped, then the phase of X(q) is an interpolation of those two phases, with interpolation weights driven by the magnitudes |a_k W(q - N k T_s / T)| and |a_{k+1} W(q - N (k+1) T_s / T)|.

If the weights are the same for both phases, the interpolated phase eliminates the phase distortion, and the resulting phase of X_1(q) is the average of the two harmonic phases; the same happens to X_2(q). This is the perfect situation, since the frequency bin q then adds zero error to the linear regression. As long as the difference between the two weights is small, we will have a small phase distortion. However, if the difference is big enough that the resulting phase of X(q) tends towards the phase of harmonic k rather than towards the average (or the opposite), the solution to that problem is given by Subtraction of Phases.

Subtraction of Phases

This time let us assume that, in formula (7), the difference between the harmonic amplitudes or between the kernel magnitudes makes the resulting phase closer to the phase of harmonic k than to the average (the opposite case works just the same). Then the resulting phases of X_1(q) and X_2(q) are both dominated by the contribution of harmonic k (formulas (10) and (11)), allowing our regression method to work with some regression error, since the frequency bin q next to harmonics k and k+1 shows a phase that is proportional to k.

The analysis above uses a rectangular window. For the rest of this work we will use Hamming windows, since the amplitude of the worst-case side lobe is lower; the analysis done in this section can be generalized to Hamming windows. Even though Phase Interpolation and Subtraction of Phases solve the phase distortion problem for the frequency bins closest to a harmonic, there is still the problem of high energy harmonics contributing to the phase of frequency bins far away from the harmonic itself, which can seriously modify the result of the regression. This also tells us that the frequency bins with more energy have a more reliable phase. This problem is solved by using a weighted linear regression.
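The choice of the Hamming window can be checked numerically. The short sketch below (ours; the way the end of the main lobe is located is a simple heuristic, not taken from the paper) compares the worst-case side-lobe level of the two windows and reproduces the familiar figures of roughly -13 dB for the rectangular window versus roughly -43 dB for the Hamming window, which is why a strong harmonic leaks less phase distortion into distant bins when a Hamming window is used:

```python
import numpy as np

def worst_sidelobe_db(window, oversample=64):
    """Peak side-lobe level of a window's DFT, in dB relative to the main lobe.
    The main lobe is skipped by walking down to the first local minimum of the
    finely sampled magnitude spectrum (a simple heuristic, sufficient here)."""
    spectrum = np.abs(np.fft.rfft(window, oversample * len(window)))
    spectrum /= spectrum[0]                   # 0 dB at the centre of the main lobe
    k = 1
    while spectrum[k + 1] < spectrum[k]:      # descend the main lobe
        k += 1
    return 20 * np.log10(spectrum[k:].max())

N = 512
print(worst_sidelobe_db(np.ones(N)))          # rectangular: about -13.3 dB
print(worst_sidelobe_db(np.hamming(N)))       # Hamming:     about -42.7 dB
```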

PHASE UNWRAPPING

Work has been done in the field of phase unwrapping; in particular, phase unwrapping has been used to calculate the complex cepstrum. Several methods have been proposed to unwrap the phase of one-dimensional signals, among which we compare the following ones.

Basic Unwrapping (BU). If we consider the phase response as a continuous function of frequency, then unwrapping is meant to make the phase more continuous. As such, our Basic Unwrapping method (BU) adds 2\pi or -2\pi to the phase of all the frequency bins greater than or equal to q if the difference between the phases of frequency bins q and q-1 is lower than -\pi or greater than \pi, respectively.

Slope Forced Unwrapping (SFU). Given that the phases of the frequency bins closest to the first harmonic will not wrap unless the delay is big enough, we can consider those phases as good information with which to calculate an initial slope. Then, at frequency bin q, we calculate the slope of the line that goes from frequency bin zero to frequency bin q-1; an estimate of the phase at q is calculated using that slope, and the actual phase at frequency bin q is unwrapped around that estimate. Since we want only reliable frequency bins to modify the estimated slope, the slope is recalculated only at the frequency bins where the magnitude is greater than or equal to a fixed fraction of the maximum magnitude in the spectrum.

Linear Regression Slope Forced Unwrapping (LRSFU). The most widely used method for phase unwrapping is the one in (4), and less general versions of it have also been implemented. For the estimate at frequency bin q, the preceding frequency bins are used to perform a linear regression; the calculated slope is used to predict an estimate of the phase at frequency bin q, and the actual phase is unwrapped around that estimate. We call this method Linear Regression Slope Forced Unwrapping (LRSFU). The magnitude threshold is used in the same way as in SFU.

A comparison of the unwrapping methods is shown in figure 2. As an example we show the results for two frames separated by exactly one pitch period, which should therefore give a slope equal to zero. Parts b and c show that the BU and SFU methods are too sensitive to spectral leakage. For this specific example, LRSFU is the most robust method, since it is the only one that did not add 2\pi in any bin. Figure 2 also shows how dramatic the change in the slope can be if the method is not robust enough. We can also see, in all the unwrapping methods of figure 2, that there is no incorrect unwrap of the initial frequency bins with high magnitude in the DFT (speech usually has high energy up to 3-4 kHz). When doing the linear regression, if we put more weight on the frequencies with high amplitude, we reduce the effect of the spectral leakage not avoided by the unwrapping method.

WEIGHTED LINEAR REGRESSION

We want to apply a linear regression to the unwrapped phase of formula (4). The problem and its solution are stated as

    \Phi = \beta Q + \epsilon,    \hat{\beta} = (Q^T W Q)^{-1} Q^T W \Phi,    (12)

where W is an N x N diagonal matrix with the weights as the diagonal elements, Q is a vector containing the frequency bin indexes 1 to N, \Phi is the vector containing the unwrapped phase at each of the frequency bins of formula (4), and \epsilon is the regression error. The work in (6) uses the magnitude squared coherence function, as defined in (7), to define a weighting scheme. However, since x_1 and x_2 come from the same microphone, the magnitude squared coherence function will give a strong correlation of the noise. In (8), the signal is prefiltered to emphasize the frequencies where the signal-to-noise ratio is high. Following this reasoning, in this section we propose a weighting scheme in which the weight of frequency bin q is the spectral magnitude at that bin raised to a power \gamma (formula (13)), where \gamma is a real number greater than one, so as to emphasize the frequencies with high amplitude over the ones with low amplitude.
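The two ingredients of this section can be sketched in a few lines of Python. The code below is our reading of the LRSFU idea and of the weighted regression in formula (12), not the authors' implementation: the regression length L, the magnitude threshold and the through-the-origin model are illustrative assumptions, and the weights would typically be the spectral magnitudes raised to the exponent gamma proposed above.

```python
import numpy as np

def lrsfu_unwrap(phase, magnitude, L=8, rel_threshold=0.1):
    """Sketch of Linear Regression Slope Forced Unwrapping (LRSFU): the phase at
    bin q is unwrapped around the value predicted by a linear regression over the
    last L reliable bins, and a bin may update that regression only if its
    magnitude is at least rel_threshold times the spectral maximum (L and
    rel_threshold are illustrative values, not the paper's)."""
    out = np.asarray(phase, dtype=float).copy()
    reliable = magnitude >= rel_threshold * np.max(magnitude)
    ref_q, ref_phi = [0], [out[0]]              # start from bin 0
    for q in range(1, len(out)):
        if len(ref_q) >= 2:
            slope, intercept = np.polyfit(ref_q[-L:], ref_phi[-L:], 1)
            predicted = slope * q + intercept
        else:
            predicted = ref_phi[-1]
        # Force the measured phase onto the 2*pi branch closest to the prediction.
        out[q] += 2 * np.pi * np.round((predicted - out[q]) / (2 * np.pi))
        if reliable[q]:
            ref_q.append(q)
            ref_phi.append(out[q])
    return out

def weighted_phase_regression(unwrapped_phase, weights):
    """Weighted least-squares fit of the unwrapped phase against the bin index
    (formula (12) with a line through the origin); returns the slope and the
    weighted regression error used as a reliability indicator."""
    q = np.arange(len(unwrapped_phase), dtype=float)
    slope = np.sum(weights * q * unwrapped_phase) / np.sum(weights * q * q)
    residual = unwrapped_phase - slope * q
    return slope, np.sum(weights * residual ** 2) / np.sum(weights)
```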

Figure 3: Estimated pitch period and regression error versus T + delta for the different unwrapping methods and weighting schemes: (a)-(c) estimated period and (d)-(f) regression error for BU, SFU and LRSFU with no weighting; (g)-(i) estimated period and (j)-(l) regression error for BU, SFU and LRSFU with the proposed weighting.

In figure 3, several plots of the estimated pitch period versus T + delta and of the regression error versus T + delta are shown; the actual pitch period of the signal is 7 ms. If we use the proposed weighting, the estimate becomes reliable over a bigger region of T + delta, and the regression error becomes a discriminant between a good and a bad estimate of the pitch period.

RESULTS

We can see from parts i and l of figure 3 that we can perform several iterations of our method, fixing the position of frame x_1 and shifting x_2 to the last estimated time delay, until the residual delay and the regression error fall below certain thresholds. We call this method Iterative Linear Regression of the Phase (ILRP). To bring the method closer to the ideal pitch synchronous case, we implement a variation of ILRP in which we set the frame length of x_1 and x_2, and their time delay, equal to the last pitch period found at each iteration. This method is called Adaptive Frame Length Iterative Linear Regression of the Phase (AFLILRP), and it is applied only after the first pitch period has been successfully found by ILRP. This variation avoids jitter and pitch doubling effects and allows the use of a lower threshold value. A frame is labeled as voiced if the residual delay and the regression error went below the thresholds before a maximum number of iterations is reached; otherwise, the frame is labeled as unvoiced.

For the results in this section we used 64 seconds of speech from a set of male speakers and 96 seconds of speech from 2 female speakers. Table 1 shows one performance measure in each row for the different phase unwrapping methods in each column, where 1 stands for SFU and 2 stands for LRSFU; for example, method 2-1 means LRSFU in ILRP and SFU in AFLILRP. We also compared the performance of our method with the cepstrum pitch detection method (1) and the autocorrelation method. The same magnitude threshold was used for both SFU and LRSFU. The performance measures used are the gross pitch error (GPE), the voiced-to-unvoiced error rate (V-UV), the unvoiced-to-voiced error rate (UV-V), the gross error count (GEC), the fine pitch error average (FPEAV) and the fine pitch error standard deviation (FPESD), as defined in (2).
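For concreteness, a schematic sketch of the ILRP loop described above is given below. It is not the authors' code: the frame length, the thresholds, the initial period and the helper `estimate_delta` (standing for the unwrap-and-weighted-regression step sketched earlier, returning a residual delay in seconds and a regression error) are all illustrative assumptions.

```python
def ilrp_pitch(signal, fs, frame_start, estimate_delta,
               T_init=0.005, N=400, max_iter=10,
               delta_tol=1e-4, error_tol=0.5):
    """Schematic Iterative Linear Regression of the Phase (ILRP) loop: keep x1
    fixed, place x2 one current period estimate later, estimate the residual
    delay delta and a regression error from the phase regression, and refine the
    period until both fall below their thresholds.  A frame that never converges
    within max_iter iterations is labeled unvoiced."""
    T = T_init
    for _ in range(max_iter):
        x1 = signal[frame_start:frame_start + N]
        shift = int(round(T * fs))
        x2 = signal[frame_start + shift:frame_start + shift + N]
        if len(x1) < N or len(x2) < N:
            break                               # ran out of signal
        delta, reg_error = estimate_delta(x1, x2, fs)
        T += delta                              # refine the period estimate
        if abs(delta) < delta_tol and reg_error < error_tol:
            return T, True                      # voiced frame, period found
    return None, False                          # unvoiced (or no convergence)
```

AFLILRP would additionally reset N and the inter-frame shift to the last period found at each iteration, as described above.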

Table 1: Performance of the pitch estimation for the different combinations of SFU and LRSFU in ILRP and AFLILRP, the cepstrum method and the autocorrelation method, for male and female data, clean and noisy speech; the measures reported are GPE (%), V-UV (%), UV-V (%), GEC (%), FPEAV (ms) and FPESD (ms).

For clean speech, in terms of V-UV and UV-V, method 2-2 performs best for the male data, while it performs almost the same as 1-2 and the cepstrum method for the female data. In terms of GEC, FPEAV and FPESD, several of the methods are the best and perform almost the same; however, 2-2 is faster and more efficient in determining whether a segment is voiced or unvoiced. For noisy data, 2-2 performs clearly better than the cepstrum method, and considerably better than the autocorrelation method in the UV-V, GEC and FPESD measures. The high UV-V rate of the autocorrelation method makes a comparison in terms of V-UV and GPE difficult.

CONCLUSIONS

We have described a method that uses a time delay estimation technique and phase information to detect the pitch frequency of a speech signal. We have given a new theoretical explanation of why this method should work, and we have described different approaches to phase unwrapping in order to obtain a robust and fast estimate of the pitch. We have also proposed to eliminate the contribution of unreliable, low energy phase components by means of a weighted linear regression of the phase. As a result, compared to the cepstrum and autocorrelation methods, method 2-2 always performs better in terms of GEC and FPESD, and performs similarly or better in the rest of the measures, depending on whether the data is from male or female speakers, for both clean and noisy speech.

REFERENCES

[1] A. M. Noll, "Cepstrum Pitch Determination", J. Acoust. Soc. America, vol. 41, pp. 293-309, 1967.
[2] L. R. Rabiner, M. J. Cheng, A. E. Rosenberg, C. A. McGonegal, "A comparative performance study of several pitch detection algorithms", IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-24, pp. 399-418, 1976.
[3] B. G. Secrest, G. R. Doddington, "An integrated pitch tracking algorithm for speech systems", Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, 1983, pp. 1352-1355.
[4] J. M. Tribolet, "A new phase unwrapping algorithm", IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-25, pp. 170-177, 1977.
[5] M. S. Brandstein, J. E. Adcock, H. F. Silverman, "A Practical Time-Delay Estimator for Localizing Speech Sources with a Microphone Array", Computer Speech and Language, vol. 9, pp. 153-169, April 1995.
[6] Y. Chan, R. Hattin, J. Plant, "The Least Squares Estimation of Time Delay and Its Use in Signal Detection", IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-26, pp. 217-222, June 1978.
[7] G. C. Carter, C. H. Knapp, A. H. Nuttall, "Estimation of the Magnitude-Squared Coherence Function Via Overlapped Fast Fourier Transform Processing", IEEE Trans. Audio and Electroacoustics, vol. AU-21, pp. 337-344, Aug. 1973.
[8] C. H. Knapp, G. C. Carter, "The Generalized Correlation Method for Estimation of Time Delay", IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-24, pp. 320-327, Aug. 1976.
