E : Lecture 8 Source-Filter Processing. E : Lecture 8 Source-Filter Processing / 21

Size: px
Start display at page:

Download "E : Lecture 8 Source-Filter Processing. E : Lecture 8 Source-Filter Processing / 21"

Transcription

1 E85.267: Lecture 8 Source-Filter Processing E85.267: Lecture 8 Source-Filter Processing / 21

2 Source-filter analysis/synthesis n f Spectral envelope Spectral envelope Analysis Source signal n 1 n 2 n Transformation n Synthesis Separate Source/excitation fine time/frequency structure (e.g. pitch) Filter broad spectral shape (resonances) Similar to subtractive synthesis Satisfying physical interpretation for real-world signals Easier to make sense of than e.g. phase n E85.267: Lecture 8 Source-Filter Processing / 21

3 Human speech production Reasonable approximation to speech signals: Source is oscillation of vocal chords e.g. normal speech (varying pitches) vs whispering Filtered by vocal tract (throat + tongue + lips) e.g. oooh vs aaah resonances = formants Both are time-varying E85.267: Lecture 8 Source-Filter Processing / 21

4 Source filter model Excitation source Resonance filter t t f x 1 3 time signal of pred. error e(n) magnitude spectra X(f) and G" H(f) in db n! f/khz! E85.267: Lecture 8 Source-Filter Processing / 21

5 Formants in speech h ε z has e a w t cl c ^ θ I n I watch thin as a dime z I d a y m E85.267: Lecture 8 Source-Filter Processing / 21

6 How to separate the source and filter? Source Signal x(n) 1 H 1(z) e (n) 1 Source Signal Processing H (z) 2 y(n) Chan. Voc. LPC Cepstrum Spectral Envelope Estimation Spectral Envelope Transformation Short-time analysis For each frame, estimate spectral envelope (filter response) 1 Channel vocoder (frequency-domain) 2 Linear Predictive Coding (LPC) (time-domain) 3 Cepstral analysis Source signal is whats left over (residual) after whitening E85.267: Lecture 8 Source-Filter Processing / 21

7 Channel vocoder (a) BP 1 2 x BP1 (n) ( ) 2 LP x (n) RMS1 Wideband STFT filterbank but using relatively few filters Linearly spaced with equal bandwidth (STFT) Logarithmically spaced (constant-q filter bank) Take RMS energy in each frequency band x(n) (b) BP 1 2 x BP2 (n) BP 2 ( ) 2 LP x (n) RMS2 2 x BPk (n) BP k ( ) 2 LP x (n) RMSk Octave-spaced channel stacking BP 2 BP k Equally-spaced channel stacking f BP 1 BP 2 BP k f E85.267: Lecture 8 Source-Filter Processing / 21

8 Channel vocoder using FFT Short time spectrum and spectral envelope X(f)/dB f/hz! Lowpass filter magnitude of each STFT frame i.e. filter columns of the spectrogram E85.267: Lecture 8 Source-Filter Processing / 21

9 Linear predictive coding Predict next input sample as linear combination of previous samples _ e(n) x(n) z -1 z -1 z -1 a 1 a 2 a p Filter is described by a few filter coefficients for each frame p x m [n] ˆx[n] = a k x[n k] k=1 Excitation is whats left after filtering (residual aka prediction error) p e[n] = x[n] ˆx[n] = x[n] a k x[n k] k=1 ^x(n) E85.267: Lecture 8 Source-Filter Processing / 21

10 LPC analysis/synthesis x(n) e(n) ~ e(n) y(n) _ P(z) ^x(n) P(z) (a) (a) LPC analysis (b) (b) LPC synthesis P(z) is just an FIR filter: P(z) = p k=1 a kz k Excitation is still a filtered version of the input: E(x) = X (z) (1 P(z)) For synthesis, pass (approximate) excitation through the inverse filter: Y (z) = Ẽ(z)H(z) 1 H(z) = 1 P(z) all-pole autoregressive (AR) modeling E85.267: Lecture 8 Source-Filter Processing / 21

11 LPC - varying filter order LPC filter H(z) models the spectrum of x[n] Minimizing the energy of the residual e[n] gives optimal coefficients ( {a k } = argmin x[n] ) 2 a k x[m k] a k n k The approximation improves with increasing filter order p 1 X(f) /db spectra of original and LPC filters 5 5 p=1 p=2 p=4 p=6 p=8 p= f/khz! E85.267: Lecture 8 Source-Filter Processing / 21

12 Estimating LPC parameters Set derivative of n e2 [n] w.r.t. a k zero and solve for a k : e 2 [n] = a k End up with p linear equations involving autocorrelations of x: x[m]x[m k] = a k x[m i]x[m k] m i m Solve using Levinson-Durbin recursion n E85.267: Lecture 8 Source-Filter Processing / 21

13 LPC example windowed original -.2 LPC residual db original spectrum LPC spectrum -2 time / samp -4 residual spectrum freq / Hz Filter poles z-plane E85.267: Lecture 8 Source-Filter Processing / 21

14 Short-Time LP Analysis Short-time LPC analysis Solve LPC for each ~2 ms frame freq / khz Imaginary Part freq / khz Real Part E85.267: Lecture 8 Source-Filter Processing time / s 14 / 21

15 Cepstral analysis cepstrum = String.reverse( spec ) + trum Entire lexicon of funny anagrams Insight: source and filter add in the log spectral domain Makes them easy to separate X (z) = E(z)H(z) log X (z) = log E(z) + log H(z) Real Cepstrum Spectral Envelope y(n)=x(n) * h(n) FFT Y(k) log Y(k) Y^ (k) R IFFT c(n) c (n) h FFT C h(k)= log H(k) w(n) w LP(n) Source Envelope c (n) x FFT C x(k)= log X(k) w HP(n) E85.267: Lecture 8 Source-Filter Processing / 21

16 Liftering example By low-pass liftering the cepstrum we obtain the spectral envelope of the signal E85.267: Lecture 8 Source-Filter Processing / 21

17 Liftering example 2 Original waveform has excitation fine structure convolved with resonances DFT shows harmonics modulated by resonances Log DFT is sum of harmonic comb and resonant bumps IDFT separates out resonant bumps (low quefrency) and regular, fine structure ( pitch pulse ) Selecting low-n cepstrum separates resonance information (deconvolution / liftering ).2 Waveform and min. phase IR abs(dft) and liftered log(abs(dft)) and liftered db real cepstrum and lifter samps freq / Hz freq / Hz pitch pulse quefrency E85.267: Lecture 8 Source-Filter Processing / 21

18 prediction filter and residual Applications LP recombining analysis - Speech on them coding ~2ms should yield frames perfect gives s[n] prediction coding applications filter further A(z) and compress residual e[n] e[n] recombining Encoder Filter coefficients them {ai} should Decoder yield perfect s[n] 1 /A(e j! ) coding applications Represent further compress e[n] Input s[n] Input s[n] Encoder LPC f & encode analysis Filter coefficients {ai} Represent Residual & encode 1 /A(e e[n] ) t f LPC Represent & encode e[n] ^ Excitation Decoder generator All-pole filter H(z) = "a i z -i e.g. analysis simple pitch tracker! buzz-hiss encoding e[n] Low bitrate speech codec used Represent in cell phonesexcitation is based ^ All-pole on LPC Pitch period Residual values & encode16 ms frame boundaries generator filter e[n] Quantize LPC filter 1 parameters, use crude approximation to residual time / s E4896 Music Signal Processing Pitch period (Dan values Ellis) 16 ms frame boundaries /16 1 Output s[n] ^ t H(z) = 1 Many different 5 ways to represent filter params: 1 - "a i z -i -5 e.g. simple pitch tracker! buzz-hiss encoding Output s[n] ^ prediction coefficients {a k }, roots of 1 P(z), line spectral frequencies Switch between noise and pulse train for excitation time / s 896 Music Signal UseProcessing codebook(dan of excitations Ellis) (CELP: Code Excited Linear Prediction) E85.267: Lecture 8 Source-Filter Processing / 21

19 Applications - Cross-synthesis/Vocoding freq / Hz freq / Hz Original (mpgr1_sx419) Noise-excited LPC resynthesis with pole freqs time / s Reconstruct using excitation from one sound and filter from another Whisperization: replace excitation with white noise E85.267: Lecture 8 Source-Filter Processing / 21

20 arps Still more frequencies applications but not magnitudes αz +1 8 Original Frequency = ^ Frequency Time Warped LPC resynth, = Time Process formants independent of pitch Pitch-shifting while preserving formants Processing (Dan Ellis) /16 Shift formants while preserving pitch dpwe/resources/matlab/polewarp/ Voice transformation Pitch-analysis E85.267: Lecture 8 Source-Filter Processing / 21

21 Reading DAFX Source-Filter Processing E85.267: Lecture 8 Source-Filter Processing / 21

Lecture 6: Speech modeling and synthesis

Lecture 6: Speech modeling and synthesis EE E682: Speech & Audio Processing & Recognition Lecture 6: Speech modeling and synthesis 1 2 3 4 5 Modeling speech signals Spectral and cepstral models Linear Predictive models (LPC) Other signal models

More information

Lecture 5: Speech modeling. The speech signal

Lecture 5: Speech modeling. The speech signal EE E68: Speech & Audio Processing & Recognition Lecture 5: Speech modeling 1 3 4 5 Modeling speech signals Spectral and cepstral models Linear Predictive models (LPC) Other signal models Speech synthesis

More information

EE482: Digital Signal Processing Applications

EE482: Digital Signal Processing Applications Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu EE482: Digital Signal Processing Applications Spring 2014 TTh 14:30-15:45 CBC C222 Lecture 12 Speech Signal Processing 14/03/25 http://www.ee.unlv.edu/~b1morris/ee482/

More information

Advanced audio analysis. Martin Gasser

Advanced audio analysis. Martin Gasser Advanced audio analysis Martin Gasser Motivation Which methods are common in MIR research? How can we parameterize audio signals? Interesting dimensions of audio: Spectral/ time/melody structure, high

More information

Signal Processing for Speech Applications - Part 2-1. Signal Processing For Speech Applications - Part 2

Signal Processing for Speech Applications - Part 2-1. Signal Processing For Speech Applications - Part 2 Signal Processing for Speech Applications - Part 2-1 Signal Processing For Speech Applications - Part 2 May 14, 2013 Signal Processing for Speech Applications - Part 2-2 References Huang et al., Chapter

More information

Lecture 5: Speech modeling

Lecture 5: Speech modeling CSC 836: Speech & Audio Understanding Lecture 5: Speech modeling Dan Ellis CUNY Graduate Center, Computer Science Program http://mr-pc.org/t/csc836 With much content from Dan Ellis

More information

Overview of Code Excited Linear Predictive Coder

Overview of Code Excited Linear Predictive Coder Overview of Code Excited Linear Predictive Coder Minal Mulye 1, Sonal Jagtap 2 1 PG Student, 2 Assistant Professor, Department of E&TC, Smt. Kashibai Navale College of Engg, Pune, India Abstract Advances

More information

Speech Signal Analysis

Speech Signal Analysis Speech Signal Analysis Hiroshi Shimodaira and Steve Renals Automatic Speech Recognition ASR Lectures 2&3 14,18 January 216 ASR Lectures 2&3 Speech Signal Analysis 1 Overview Speech Signal Analysis for

More information

Speech Synthesis; Pitch Detection and Vocoders

Speech Synthesis; Pitch Detection and Vocoders Speech Synthesis; Pitch Detection and Vocoders Tai-Shih Chi ( 冀泰石 ) Department of Communication Engineering National Chiao Tung University May. 29, 2008 Speech Synthesis Basic components of the text-to-speech

More information

University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005

University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005 University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005 Lecture 5 Slides Jan 26 th, 2005 Outline of Today s Lecture Announcements Filter-bank analysis

More information

speech signal S(n). This involves a transformation of S(n) into another signal or a set of signals

speech signal S(n). This involves a transformation of S(n) into another signal or a set of signals 16 3. SPEECH ANALYSIS 3.1 INTRODUCTION TO SPEECH ANALYSIS Many speech processing [22] applications exploits speech production and perception to accomplish speech analysis. By speech analysis we extract

More information

Speech Synthesis using Mel-Cepstral Coefficient Feature

Speech Synthesis using Mel-Cepstral Coefficient Feature Speech Synthesis using Mel-Cepstral Coefficient Feature By Lu Wang Senior Thesis in Electrical Engineering University of Illinois at Urbana-Champaign Advisor: Professor Mark Hasegawa-Johnson May 2018 Abstract

More information

Cepstrum alanysis of speech signals

Cepstrum alanysis of speech signals Cepstrum alanysis of speech signals ELEC-E5520 Speech and language processing methods Spring 2016 Mikko Kurimo 1 /48 Contents Literature and other material Idea and history of cepstrum Cepstrum and LP

More information

Speech Compression Using Voice Excited Linear Predictive Coding

Speech Compression Using Voice Excited Linear Predictive Coding Speech Compression Using Voice Excited Linear Predictive Coding Ms.Tosha Sen, Ms.Kruti Jay Pancholi PG Student, Asst. Professor, L J I E T, Ahmedabad Abstract : The aim of the thesis is design good quality

More information

EC 6501 DIGITAL COMMUNICATION UNIT - II PART A

EC 6501 DIGITAL COMMUNICATION UNIT - II PART A EC 6501 DIGITAL COMMUNICATION 1.What is the need of prediction filtering? UNIT - II PART A [N/D-16] Prediction filtering is used mostly in audio signal processing and speech processing for representing

More information

Linguistic Phonetics. Spectral Analysis

Linguistic Phonetics. Spectral Analysis 24.963 Linguistic Phonetics Spectral Analysis 4 4 Frequency (Hz) 1 Reading for next week: Liljencrants & Lindblom 1972. Assignment: Lip-rounding assignment, due 1/15. 2 Spectral analysis techniques There

More information

The Channel Vocoder (analyzer):

The Channel Vocoder (analyzer): Vocoders 1 The Channel Vocoder (analyzer): The channel vocoder employs a bank of bandpass filters, Each having a bandwidth between 100 Hz and 300 Hz. Typically, 16-20 linear phase FIR filter are used.

More information

Signal Analysis. Peak Detection. Envelope Follower (Amplitude detection) Music 270a: Signal Analysis

Signal Analysis. Peak Detection. Envelope Follower (Amplitude detection) Music 270a: Signal Analysis Signal Analysis Music 27a: Signal Analysis Tamara Smyth, trsmyth@ucsd.edu Department of Music, University of California, San Diego (UCSD November 23, 215 Some tools we may want to use to automate analysis

More information

EE 225D LECTURE ON MEDIUM AND HIGH RATE CODING. University of California Berkeley

EE 225D LECTURE ON MEDIUM AND HIGH RATE CODING. University of California Berkeley University of California Berkeley College of Engineering Department of Electrical Engineering and Computer Sciences Professors : N.Morgan / B.Gold EE225D Spring,1999 Medium & High Rate Coding Lecture 26

More information

Quantification of glottal and voiced speech harmonicsto-noise ratios using cepstral-based estimation

Quantification of glottal and voiced speech harmonicsto-noise ratios using cepstral-based estimation Quantification of glottal and voiced speech harmonicsto-noise ratios using cepstral-based estimation Peter J. Murphy and Olatunji O. Akande, Department of Electronic and Computer Engineering University

More information

Vocoder (LPC) Analysis by Variation of Input Parameters and Signals

Vocoder (LPC) Analysis by Variation of Input Parameters and Signals ISCA Journal of Engineering Sciences ISCA J. Engineering Sci. Vocoder (LPC) Analysis by Variation of Input Parameters and Signals Abstract Gupta Rajani, Mehta Alok K. and Tiwari Vebhav Truba College of

More information

Lecture 5: Sinusoidal Modeling

Lecture 5: Sinusoidal Modeling ELEN E4896 MUSIC SIGNAL PROCESSING Lecture 5: Sinusoidal Modeling 1. Sinusoidal Modeling 2. Sinusoidal Analysis 3. Sinusoidal Synthesis & Modification 4. Noise Residual Dan Ellis Dept. Electrical Engineering,

More information

Digital Speech Processing and Coding

Digital Speech Processing and Coding ENEE408G Spring 2006 Lecture-2 Digital Speech Processing and Coding Spring 06 Instructor: Shihab Shamma Electrical & Computer Engineering University of Maryland, College Park http://www.ece.umd.edu/class/enee408g/

More information

Module 9: Multirate Digital Signal Processing Prof. Eliathamby Ambikairajah Dr. Tharmarajah Thiruvaran School of Electrical Engineering &

Module 9: Multirate Digital Signal Processing Prof. Eliathamby Ambikairajah Dr. Tharmarajah Thiruvaran School of Electrical Engineering & odule 9: ultirate Digital Signal Processing Prof. Eliathamby Ambikairajah Dr. Tharmarajah Thiruvaran School of Electrical Engineering & Telecommunications The University of New South Wales Australia ultirate

More information

SPEECH ANALYSIS-SYNTHESIS FOR SPEAKER CHARACTERISTIC MODIFICATION

SPEECH ANALYSIS-SYNTHESIS FOR SPEAKER CHARACTERISTIC MODIFICATION M.Tech. Credit Seminar Report, Electronic Systems Group, EE Dept, IIT Bombay, submitted November 04 SPEECH ANALYSIS-SYNTHESIS FOR SPEAKER CHARACTERISTIC MODIFICATION G. Gidda Reddy (Roll no. 04307046)

More information

Signal segmentation and waveform characterization. Biosignal processing, S Autumn 2012

Signal segmentation and waveform characterization. Biosignal processing, S Autumn 2012 Signal segmentation and waveform characterization Biosignal processing, 5173S Autumn 01 Short-time analysis of signals Signal statistics may vary in time: nonstationary how to compute signal characterizations?

More information

Topic. Spectrogram Chromagram Cesptrogram. Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio

Topic. Spectrogram Chromagram Cesptrogram. Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio Topic Spectrogram Chromagram Cesptrogram Short time Fourier Transform Break signal into windows Calculate DFT of each window The Spectrogram spectrogram(y,1024,512,1024,fs,'yaxis'); A series of short term

More information

Audio Signal Compression using DCT and LPC Techniques

Audio Signal Compression using DCT and LPC Techniques Audio Signal Compression using DCT and LPC Techniques P. Sandhya Rani#1, D.Nanaji#2, V.Ramesh#3,K.V.S. Kiran#4 #Student, Department of ECE, Lendi Institute Of Engineering And Technology, Vizianagaram,

More information

Digital Signal Processing

Digital Signal Processing COMP ENG 4TL4: Digital Signal Processing Notes for Lecture #27 Tuesday, November 11, 23 6. SPECTRAL ANALYSIS AND ESTIMATION 6.1 Introduction to Spectral Analysis and Estimation The discrete-time Fourier

More information

Chapter 7. Frequency-Domain Representations 语音信号的频域表征

Chapter 7. Frequency-Domain Representations 语音信号的频域表征 Chapter 7 Frequency-Domain Representations 语音信号的频域表征 1 General Discrete-Time Model of Speech Production Voiced Speech: A V P(z)G(z)V(z)R(z) Unvoiced Speech: A N N(z)V(z)R(z) 2 DTFT and DFT of Speech The

More information

Resonator Factoring. Julius Smith and Nelson Lee

Resonator Factoring. Julius Smith and Nelson Lee Resonator Factoring Julius Smith and Nelson Lee RealSimple Project Center for Computer Research in Music and Acoustics (CCRMA) Department of Music, Stanford University Stanford, California 9435 March 13,

More information

EE482: Digital Signal Processing Applications

EE482: Digital Signal Processing Applications Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu EE482: Digital Signal Processing Applications Spring 2014 TTh 14:30-15:45 CBC C222 Lecture 14 Quiz 04 Review 14/04/07 http://www.ee.unlv.edu/~b1morris/ee482/

More information

L19: Prosodic modification of speech

L19: Prosodic modification of speech L19: Prosodic modification of speech Time-domain pitch synchronous overlap add (TD-PSOLA) Linear-prediction PSOLA Frequency-domain PSOLA Sinusoidal models Harmonic + noise models STRAIGHT This lecture

More information

WaveSurfer. Basic acoustics part 2 Spectrograms, resonance, vowels. Spectrogram. See Rogers chapter 7 8

WaveSurfer. Basic acoustics part 2 Spectrograms, resonance, vowels. Spectrogram. See Rogers chapter 7 8 WaveSurfer. Basic acoustics part 2 Spectrograms, resonance, vowels See Rogers chapter 7 8 Allows us to see Waveform Spectrogram (color or gray) Spectral section short-time spectrum = spectrum of a brief

More information

Speech Coding Technique And Analysis Of Speech Codec Using CS-ACELP

Speech Coding Technique And Analysis Of Speech Codec Using CS-ACELP Speech Coding Technique And Analysis Of Speech Codec Using CS-ACELP Monika S.Yadav Vidarbha Institute of Technology Rashtrasant Tukdoji Maharaj Nagpur University, Nagpur, India monika.yadav@rediffmail.com

More information

Formant Synthesis of Haegeum: A Sound Analysis/Synthesis System using Cpestral Envelope

Formant Synthesis of Haegeum: A Sound Analysis/Synthesis System using Cpestral Envelope Formant Synthesis of Haegeum: A Sound Analysis/Synthesis System using Cpestral Envelope Myeongsu Kang School of Computer Engineering and Information Technology Ulsan, South Korea ilmareboy@ulsan.ac.kr

More information

Robust Algorithms For Speech Reconstruction On Mobile Devices

Robust Algorithms For Speech Reconstruction On Mobile Devices Robust Algorithms For Speech Reconstruction On Mobile Devices XU SHAO A Thesis presented for the degree of Doctor of Philosophy Speech Group School of Computing Sciences University of East Anglia England

More information

Lecture 6: Nonspeech and Music

Lecture 6: Nonspeech and Music EE E682: Speech & Audio Processing & Recognition Lecture 6: Nonspeech and Music 1 2 3 4 5 Music and nonspeech Environmental sounds Music synthesis techniques Sinewave synthesis Music analysis Dan Ellis

More information

Lecture 6: Nonspeech and Music. Music & nonspeech

Lecture 6: Nonspeech and Music. Music & nonspeech EE E682: Speech & Audio Processing & Recognition Lecture 6: Nonspeech and Music 2 3 4 5 Music and nonspeech Environmental sounds Music synthesis techniques Sinewave synthesis Music analysis Dan Ellis

More information

Rotating Machinery Fault Diagnosis Techniques Envelope and Cepstrum Analyses

Rotating Machinery Fault Diagnosis Techniques Envelope and Cepstrum Analyses Rotating Machinery Fault Diagnosis Techniques Envelope and Cepstrum Analyses Spectra Quest, Inc. 8205 Hermitage Road, Richmond, VA 23228, USA Tel: (804) 261-3300 www.spectraquest.com October 2006 ABSTRACT

More information

APPLICATIONS OF DSP OBJECTIVES

APPLICATIONS OF DSP OBJECTIVES APPLICATIONS OF DSP OBJECTIVES This lecture will discuss the following: Introduce analog and digital waveform coding Introduce Pulse Coded Modulation Consider speech-coding principles Introduce the channel

More information

techniques are means of reducing the bandwidth needed to represent the human voice. In mobile

techniques are means of reducing the bandwidth needed to represent the human voice. In mobile 8 2. LITERATURE SURVEY The available radio spectrum for the wireless radio communication is very limited hence to accommodate maximum number of users the speech is compressed. The speech compression techniques

More information

Comparison of CELP speech coder with a wavelet method

Comparison of CELP speech coder with a wavelet method University of Kentucky UKnowledge University of Kentucky Master's Theses Graduate School 2006 Comparison of CELP speech coder with a wavelet method Sriram Nagaswamy University of Kentucky, sriramn@gmail.com

More information

Sound Synthesis Methods

Sound Synthesis Methods Sound Synthesis Methods Matti Vihola, mvihola@cs.tut.fi 23rd August 2001 1 Objectives The objective of sound synthesis is to create sounds that are Musically interesting Preferably realistic (sounds like

More information

Analysis/synthesis coding

Analysis/synthesis coding TSBK06 speech coding p.1/32 Analysis/synthesis coding Many speech coders are based on a principle called analysis/synthesis coding. Instead of coding a waveform, as is normally done in general audio coders

More information

Applications of Music Processing

Applications of Music Processing Lecture Music Processing Applications of Music Processing Christian Dittmar International Audio Laboratories Erlangen christian.dittmar@audiolabs-erlangen.de Singing Voice Detection Important pre-requisite

More information

HST.582J / 6.555J / J Biomedical Signal and Image Processing Spring 2007

HST.582J / 6.555J / J Biomedical Signal and Image Processing Spring 2007 MIT OpenCourseWare http://ocw.mit.edu HST.582J / 6.555J / 16.456J Biomedical Signal and Image Processing Spring 2007 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

Adaptive Filters Application of Linear Prediction

Adaptive Filters Application of Linear Prediction Adaptive Filters Application of Linear Prediction Gerhard Schmidt Christian-Albrechts-Universität zu Kiel Faculty of Engineering Electrical Engineering and Information Technology Digital Signal Processing

More information

B.Tech III Year II Semester (R13) Regular & Supplementary Examinations May/June 2017 DIGITAL SIGNAL PROCESSING (Common to ECE and EIE)

B.Tech III Year II Semester (R13) Regular & Supplementary Examinations May/June 2017 DIGITAL SIGNAL PROCESSING (Common to ECE and EIE) Code: 13A04602 R13 B.Tech III Year II Semester (R13) Regular & Supplementary Examinations May/June 2017 (Common to ECE and EIE) PART A (Compulsory Question) 1 Answer the following: (10 X 02 = 20 Marks)

More information

Page 0 of 23. MELP Vocoder

Page 0 of 23. MELP Vocoder Page 0 of 23 MELP Vocoder Outline Introduction MELP Vocoder Features Algorithm Description Parameters & Comparison Page 1 of 23 Introduction Traditional pitched-excited LPC vocoders use either a periodic

More information

Analysis and Synthesis of Pathological Vowels

Analysis and Synthesis of Pathological Vowels Analysis and Synthesis of Pathological Vowels Prospectus Brian C. Gabelman 6/13/23 1 OVERVIEW OF PRESENTATION I. Background II. Analysis of pathological voices III. Synthesis of pathological voices IV.

More information

Pattern Recognition. Part 6: Bandwidth Extension. Gerhard Schmidt

Pattern Recognition. Part 6: Bandwidth Extension. Gerhard Schmidt Pattern Recognition Part 6: Gerhard Schmidt Christian-Albrechts-Universität zu Kiel Faculty of Engineering Institute of Electrical and Information Engineering Digital Signal Processing and System Theory

More information

Performance Analysis of MFCC and LPCC Techniques in Automatic Speech Recognition

Performance Analysis of MFCC and LPCC Techniques in Automatic Speech Recognition www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume - 3 Issue - 8 August, 2014 Page No. 7727-7732 Performance Analysis of MFCC and LPCC Techniques in Automatic

More information

Chapter 9. Chapter 9 275

Chapter 9. Chapter 9 275 Chapter 9 Chapter 9: Multirate Digital Signal Processing... 76 9. Decimation... 76 9. Interpolation... 8 9.. Linear Interpolation... 85 9.. Sampling rate conversion by Non-integer factors... 86 9.. Illustration

More information

Singing Voice Detection. Applications of Music Processing. Singing Voice Detection. Singing Voice Detection. Singing Voice Detection

Singing Voice Detection. Applications of Music Processing. Singing Voice Detection. Singing Voice Detection. Singing Voice Detection Detection Lecture usic Processing Applications of usic Processing Christian Dittmar International Audio Laboratories Erlangen christian.dittmar@audiolabs-erlangen.de Important pre-requisite for: usic segmentation

More information

Signal processing preliminaries

Signal processing preliminaries Signal processing preliminaries ISMIR Graduate School, October 4th-9th, 2004 Contents: Digital audio signals Fourier transform Spectrum estimation Filters Signal Proc. 2 1 Digital signals Advantages of

More information

PR No. 119 DIGITAL SIGNAL PROCESSING XVIII. Academic Research Staff. Prof. Alan V. Oppenheim Prof. James H. McClellan.

PR No. 119 DIGITAL SIGNAL PROCESSING XVIII. Academic Research Staff. Prof. Alan V. Oppenheim Prof. James H. McClellan. XVIII. DIGITAL SIGNAL PROCESSING Academic Research Staff Prof. Alan V. Oppenheim Prof. James H. McClellan Graduate Students Bir Bhanu Gary E. Kopec Thomas F. Quatieri, Jr. Patrick W. Bosshart Jae S. Lim

More information

Synthesis Techniques. Juan P Bello

Synthesis Techniques. Juan P Bello Synthesis Techniques Juan P Bello Synthesis It implies the artificial construction of a complex body by combining its elements. Complex body: acoustic signal (sound) Elements: parameters and/or basic signals

More information

Communications Theory and Engineering

Communications Theory and Engineering Communications Theory and Engineering Master's Degree in Electronic Engineering Sapienza University of Rome A.A. 2018-2019 Speech and telephone speech Based on a voice production model Parametric representation

More information

CS 188: Artificial Intelligence Spring Speech in an Hour

CS 188: Artificial Intelligence Spring Speech in an Hour CS 188: Artificial Intelligence Spring 2006 Lecture 19: Speech Recognition 3/23/2006 Dan Klein UC Berkeley Many slides from Dan Jurafsky Speech in an Hour Speech input is an acoustic wave form s p ee ch

More information

Speech/Non-speech detection Rule-based method using log energy and zero crossing rate

Speech/Non-speech detection Rule-based method using log energy and zero crossing rate Digital Speech Processing- Lecture 14A Algorithms for Speech Processing Speech Processing Algorithms Speech/Non-speech detection Rule-based method using log energy and zero crossing rate Single speech

More information

MUS421/EE367B Applications Lecture 9C: Time Scale Modification (TSM) and Frequency Scaling/Shifting

MUS421/EE367B Applications Lecture 9C: Time Scale Modification (TSM) and Frequency Scaling/Shifting MUS421/EE367B Applications Lecture 9C: Time Scale Modification (TSM) and Frequency Scaling/Shifting Julius O. Smith III (jos@ccrma.stanford.edu) Center for Computer Research in Music and Acoustics (CCRMA)

More information

ENEE408G Multimedia Signal Processing

ENEE408G Multimedia Signal Processing ENEE408G Multimedia Signal Processing Design Project on Digital Speech Processing Goals: 1. Learn how to use the linear predictive model for speech analysis and synthesis. 2. Implement a linear predictive

More information

Speech Coding using Linear Prediction

Speech Coding using Linear Prediction Speech Coding using Linear Prediction Jesper Kjær Nielsen Aalborg University and Bang & Olufsen jkn@es.aau.dk September 10, 2015 1 Background Speech is generated when air is pushed from the lungs through

More information

FX Basics. Filtering STOMPBOX DESIGN WORKSHOP. Esteban Maestre. CCRMA - Stanford University August 2013

FX Basics. Filtering STOMPBOX DESIGN WORKSHOP. Esteban Maestre. CCRMA - Stanford University August 2013 FX Basics STOMPBOX DESIGN WORKSHOP Esteban Maestre CCRMA - Stanford University August 2013 effects modify the frequency content of the audio signal, achieving boosting or weakening specific frequency bands

More information

Discrete Fourier Transform (DFT)

Discrete Fourier Transform (DFT) Amplitude Amplitude Discrete Fourier Transform (DFT) DFT transforms the time domain signal samples to the frequency domain components. DFT Signal Spectrum Time Frequency DFT is often used to do frequency

More information

FFT analysis in practice

FFT analysis in practice FFT analysis in practice Perception & Multimedia Computing Lecture 13 Rebecca Fiebrink Lecturer, Department of Computing Goldsmiths, University of London 1 Last Week Review of complex numbers: rectangular

More information

DSP Based Corrections of Analog Components in Digital Receivers

DSP Based Corrections of Analog Components in Digital Receivers fred harris DSP Based Corrections of Analog Components in Digital Receivers IEEE Communications, Signal Processing, and Vehicular Technology Chapters Coastal Los Angeles Section 24-April 2008 It s all

More information

Enhanced Waveform Interpolative Coding at 4 kbps

Enhanced Waveform Interpolative Coding at 4 kbps Enhanced Waveform Interpolative Coding at 4 kbps Oded Gottesman, and Allen Gersho Signal Compression Lab. University of California, Santa Barbara E-mail: [oded, gersho]@scl.ece.ucsb.edu Signal Compression

More information

Mel Spectrum Analysis of Speech Recognition using Single Microphone

Mel Spectrum Analysis of Speech Recognition using Single Microphone International Journal of Engineering Research in Electronics and Communication Mel Spectrum Analysis of Speech Recognition using Single Microphone [1] Lakshmi S.A, [2] Cholavendan M [1] PG Scholar, Sree

More information

Speech Production. Automatic Speech Recognition handout (1) Jan - Mar 2009 Revision : 1.1. Speech Communication. Spectrogram. Waveform.

Speech Production. Automatic Speech Recognition handout (1) Jan - Mar 2009 Revision : 1.1. Speech Communication. Spectrogram. Waveform. Speech Production Automatic Speech Recognition handout () Jan - Mar 29 Revision :. Speech Signal Processing and Feature Extraction lips teeth nasal cavity oral cavity tongue lang S( Ω) pharynx larynx vocal

More information

Voice Excited Lpc for Speech Compression by V/Uv Classification

Voice Excited Lpc for Speech Compression by V/Uv Classification IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 6, Issue 3, Ver. II (May. -Jun. 2016), PP 65-69 e-issn: 2319 4200, p-issn No. : 2319 4197 www.iosrjournals.org Voice Excited Lpc for Speech

More information

Acoustics, signals & systems for audiology. Week 4. Signals through Systems

Acoustics, signals & systems for audiology. Week 4. Signals through Systems Acoustics, signals & systems for audiology Week 4 Signals through Systems Crucial ideas Any signal can be constructed as a sum of sine waves In a linear time-invariant (LTI) system, the response to a sinusoid

More information

Lecture 6: Nonspeech and Music

Lecture 6: Nonspeech and Music EE E682: Speech & Audio Processing & Recognition Lecture 6: Nonspeech and Music 1 Music & nonspeech Dan Ellis Michael Mandel 2 Environmental Sounds Columbia

More information

Simulation of Conjugate Structure Algebraic Code Excited Linear Prediction Speech Coder

Simulation of Conjugate Structure Algebraic Code Excited Linear Prediction Speech Coder COMPUSOFT, An international journal of advanced computer technology, 3 (3), March-204 (Volume-III, Issue-III) ISSN:2320-0790 Simulation of Conjugate Structure Algebraic Code Excited Linear Prediction Speech

More information

EE 225D LECTURE ON SPEECH SYNTHESIS. University of California Berkeley

EE 225D LECTURE ON SPEECH SYNTHESIS. University of California Berkeley University of California Berkeley College of Engineering Department of Electrical Engineering and Computer Sciences Professors : N.Morgan / B.Gold EE225D Speech Synthesis Spring,1999 Lecture 23 N.MORGAN

More information

Project 0: Part 2 A second hands-on lab on Speech Processing Frequency-domain processing

Project 0: Part 2 A second hands-on lab on Speech Processing Frequency-domain processing Project : Part 2 A second hands-on lab on Speech Processing Frequency-domain processing February 24, 217 During this lab, you will have a first contact on frequency domain analysis of speech signals. You

More information

Audio processing methods on marine mammal vocalizations

Audio processing methods on marine mammal vocalizations Audio processing methods on marine mammal vocalizations Xanadu Halkias Laboratory for the Recognition and Organization of Speech and Audio http://labrosa.ee.columbia.edu Sound to Signal sound is pressure

More information

Converting Speaking Voice into Singing Voice

Converting Speaking Voice into Singing Voice Converting Speaking Voice into Singing Voice 1 st place of the Synthesis of Singing Challenge 2007: Vocal Conversion from Speaking to Singing Voice using STRAIGHT by Takeshi Saitou et al. 1 STRAIGHT Speech

More information

Lecture 9: Time & Pitch Scaling

Lecture 9: Time & Pitch Scaling ELEN E4896 MUSIC SIGNAL PROCESSING Lecture 9: Time & Pitch Scaling 1. Time Scale Modification (TSM) 2. Time-Domain Approaches 3. The Phase Vocoder 4. Sinusoidal Approach Dan Ellis Dept. Electrical Engineering,

More information

Fundamental Frequency Detection

Fundamental Frequency Detection Fundamental Frequency Detection Jan Černocký, Valentina Hubeika {cernocky ihubeika}@fit.vutbr.cz DCGM FIT BUT Brno Fundamental Frequency Detection Jan Černocký, Valentina Hubeika, DCGM FIT BUT Brno 1/37

More information

SPEech Feature Toolbox (SPEFT) Design and Emotional Speech Feature Extraction

SPEech Feature Toolbox (SPEFT) Design and Emotional Speech Feature Extraction SPEech Feature Toolbox (SPEFT) Design and Emotional Speech Feature Extraction by Xi Li A thesis submitted to the Faculty of Graduate School, Marquette University, in Partial Fulfillment of the Requirements

More information

A Comparative Study of Formant Frequencies Estimation Techniques

A Comparative Study of Formant Frequencies Estimation Techniques A Comparative Study of Formant Frequencies Estimation Techniques DORRA GARGOURI, Med ALI KAMMOUN and AHMED BEN HAMIDA Unité de traitement de l information et électronique médicale, ENIS University of Sfax

More information

-voiced. +voiced. /z/ /s/ Last Lecture. Digital Speech Processing. Overview of Speech Processing. Example on Sound Source Feature

-voiced. +voiced. /z/ /s/ Last Lecture. Digital Speech Processing. Overview of Speech Processing. Example on Sound Source Feature ENEE408G Lecture-6 Digital Speech rocessing URL: http://www.ece.umd.edu/class/enee408g/ Slides included here are based on Spring 005 offering in the order of introduction, image, video, speech, and audio.

More information

Discrete Fourier Transform, DFT Input: N time samples

Discrete Fourier Transform, DFT Input: N time samples EE445M/EE38L.6 Lecture. Lecture objectives are to: The Discrete Fourier Transform Windowing Use DFT to design a FIR digital filter Discrete Fourier Transform, DFT Input: time samples {a n = {a,a,a 2,,a

More information

EECS 452 Midterm Exam Winter 2012

EECS 452 Midterm Exam Winter 2012 EECS 452 Midterm Exam Winter 2012 Name: unique name: Sign the honor code: I have neither given nor received aid on this exam nor observed anyone else doing so. Scores: # Points Section I /40 Section II

More information

Linear Predictive Coding *

Linear Predictive Coding * OpenStax-CNX module: m45345 1 Linear Predictive Coding * Kiefer Forseth This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 3.0 1 LPC Implementation Linear

More information

High-Pitch Formant Estimation by Exploiting Temporal Change of Pitch

High-Pitch Formant Estimation by Exploiting Temporal Change of Pitch High-Pitch Formant Estimation by Exploiting Temporal Change of Pitch The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters. Citation As Published

More information

SPEECH AND SPECTRAL ANALYSIS

SPEECH AND SPECTRAL ANALYSIS SPEECH AND SPECTRAL ANALYSIS 1 Sound waves: production in general: acoustic interference vibration (carried by some propagation medium) variations in air pressure speech: actions of the articulatory organs

More information

Adaptive Filters Linear Prediction

Adaptive Filters Linear Prediction Adaptive Filters Gerhard Schmidt Christian-Albrechts-Universität zu Kiel Faculty of Engineering Institute of Electrical and Information Engineering Digital Signal Processing and System Theory Slide 1 Contents

More information

Vowel Enhancement in Early Stage Spanish Esophageal Speech Using Natural Glottal Flow Pulse and Vocal Tract Frequency Warping

Vowel Enhancement in Early Stage Spanish Esophageal Speech Using Natural Glottal Flow Pulse and Vocal Tract Frequency Warping Vowel Enhancement in Early Stage Spanish Esophageal Speech Using Natural Glottal Flow Pulse and Vocal Tract Frequency Warping Rizwan Ishaq 1, Dhananjaya Gowda 2, Paavo Alku 2, Begoña García Zapirain 1

More information

Final Exam Practice Questions for Music 421, with Solutions

Final Exam Practice Questions for Music 421, with Solutions Final Exam Practice Questions for Music 4, with Solutions Elementary Fourier Relationships. For the window w = [/,,/ ], what is (a) the dc magnitude of the window transform? + (b) the magnitude at half

More information

Subtractive Synthesis & Formant Synthesis

Subtractive Synthesis & Formant Synthesis Subtractive Synthesis & Formant Synthesis Prof Eduardo R Miranda Varèse-Gastprofessor eduardo.miranda@btinternet.com Electronic Music Studio TU Berlin Institute of Communications Research http://www.kgw.tu-berlin.de/

More information

Copyright S. K. Mitra

Copyright S. K. Mitra 1 In many applications, a discrete-time signal x[n] is split into a number of subband signals by means of an analysis filter bank The subband signals are then processed Finally, the processed subband signals

More information

Reading: Johnson Ch , Ch.5.5 (today); Liljencrants & Lindblom; Stevens (Tues) reminder: no class on Thursday.

Reading: Johnson Ch , Ch.5.5 (today); Liljencrants & Lindblom; Stevens (Tues) reminder: no class on Thursday. L105/205 Phonetics Scarborough Handout 7 10/18/05 Reading: Johnson Ch.2.3.3-2.3.6, Ch.5.5 (today); Liljencrants & Lindblom; Stevens (Tues) reminder: no class on Thursday Spectral Analysis 1. There are

More information

ADSP ADSP ADSP ADSP. Advanced Digital Signal Processing (18-792) Spring Fall Semester, Department of Electrical and Computer Engineering

ADSP ADSP ADSP ADSP. Advanced Digital Signal Processing (18-792) Spring Fall Semester, Department of Electrical and Computer Engineering ADSP ADSP ADSP ADSP Advanced Digital Signal Processing (18-792) Spring Fall Semester, 201 2012 Department of Electrical and Computer Engineering PROBLEM SET 5 Issued: 9/27/18 Due: 10/3/18 Reminder: Quiz

More information

Topic 2. Signal Processing Review. (Some slides are adapted from Bryan Pardo s course slides on Machine Perception of Music)

Topic 2. Signal Processing Review. (Some slides are adapted from Bryan Pardo s course slides on Machine Perception of Music) Topic 2 Signal Processing Review (Some slides are adapted from Bryan Pardo s course slides on Machine Perception of Music) Recording Sound Mechanical Vibration Pressure Waves Motion->Voltage Transducer

More information

Chapter 4 SPEECH ENHANCEMENT

Chapter 4 SPEECH ENHANCEMENT 44 Chapter 4 SPEECH ENHANCEMENT 4.1 INTRODUCTION: Enhancement is defined as improvement in the value or Quality of something. Speech enhancement is defined as the improvement in intelligibility and/or

More information

Digital Signal Processing

Digital Signal Processing Digital Signal Processing Fourth Edition John G. Proakis Department of Electrical and Computer Engineering Northeastern University Boston, Massachusetts Dimitris G. Manolakis MIT Lincoln Laboratory Lexington,

More information

Evaluation of MELP Quality and Principles Marcus Ek Lars Pääjärvi Martin Sehlstedt Lule_a Technical University in cooperation with Ericsson Erisoft AB

Evaluation of MELP Quality and Principles Marcus Ek Lars Pääjärvi Martin Sehlstedt Lule_a Technical University in cooperation with Ericsson Erisoft AB Evaluation of MELP Quality and Principles Marcus Ek Lars Pääjärvi Martin Sehlstedt Lule_a Technical University in cooperation with Ericsson Erisoft AB, T/RV 3th May 2 2 Abstract This report presents an

More information