Analysis and Synthesis of Pathological Vowels


1 Analysis and Synthesis of Pathological Vowels. Prospectus. Brian C. Gabelman, 6/13/23

2 OVERVIEW OF PRESENTATION
I. Background
II. Analysis of pathological voices
III. Synthesis of pathological voices
IV. Summary

3 BACKGROUND
What is a pathological vowel? It may be caused by physical or neural problems, and it is characterized by substantial and complex NONPERIODIC signal components.
What is this project? Methods for modeling, analysis, and synthesis of pathological vowels, incorporating novel approaches in:
- System identification
- Parameterization of nonperiodic components (AM, FM, and noise)
- Synthesizer designs for realtime and offline use

4 BACKGROUND
Why do it?
- Create a non-subjective basis to compare pathological voices for:
  1. Improved diagnosis
  2. Tracking changes in a patient's voice
- Generate voice samples with known levels of variation (noise, roughness, etc.) for:
  1. Evaluation of model parameters
  2. Evaluation of listener variability
  3. Evaluation of the importance of levels of pathological features
What has been done before?
- A well-established theory exists for NORMAL voices.
- Recent studies of pathological voices employ perturbation of normal features plus additive noise.
- ES (external source) stimulation of the vocal tract has been used to analyze formants (vowels) since 1942 for NORMAL voices.

5 BACKGROUND
For the theoretical/analytical aspect of the project, the hypothesis of the dissertation can be expressed in one sentence: By means of FM and AM demodulation techniques, estimation of nonperiodic features of pathological vowels may be improved.

6 ANALYSIS: SOURCE-FILTER MODEL OF SPEECH
[Diagram: glottal source waveform g(t), G(s) (time domain) -> vocal tract frequency response f(t), F(s) (frequency domain) -> resulting voice signal v(t), V(s) (time domain)]
Time domain: $g(t) * f(t) = v(t)$ (convolution)
Frequency domain: $G(s)\,F(s) = V(s)$
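The equivalence between convolution in time and multiplication in frequency can be checked numerically. The following is a minimal sketch, assuming NumPy and using the discrete Fourier transform as a stand-in for the Laplace-domain notation on the slide; the arrays `g` and `f` are arbitrary illustrative signals, not measured glottal or vocal tract data.

```python
import numpy as np

# Hypothetical stand-ins: g is a "source" sequence, f a short decaying
# resonance used as a "vocal tract" impulse response.
rng = np.random.default_rng(0)
n_idx = np.arange(32)
g = rng.standard_normal(64)
f = np.exp(-n_idx / 5.0) * np.cos(0.6 * n_idx)

# Time domain: v = g convolved with f.
v_time = np.convolve(g, f)

# Frequency domain: V = G x F, using zero-padding to the full output length
# so that the circular convolution implied by the DFT equals the linear one.
n_fft = len(g) + len(f) - 1
V = np.fft.rfft(g, n_fft) * np.fft.rfft(f, n_fft)
v_freq = np.fft.irfft(V, n_fft)

max_diff = float(np.max(np.abs(v_time - v_freq)))
print(max_diff)  # difference is at the level of floating-point rounding
```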

7 ANALYSIS
Steps for analysis:
PERIODIC ANALYSIS:
1. FORMANT DETERMINATION: Uses LP (linear prediction) to model the vocal tract as cascaded 2nd-order digital resonators. External source testing is shown to augment or replace LP for pathological vowels (Inv.).
2. SOURCE MODELING: Uses inverse filtering and least-squares optimization to fit the source waveform to a standard model (LF).
NONPERIODIC ANALYSIS:
3. ANALYSIS OF PITCH VARIATION: Uses high-resolution pitch tracking to measure detailed nonperiodic frequency variation. Variations are segmented into low- and high-frequency FM with Gaussian form.

8 ANALYSIS
4. FM DEMODULATION: Pitch variations are removed from the original voice to achieve accurate noise estimation (step 7) (Inv.).
5. ANALYSIS OF POWER VARIATION: Uses power tracking to measure detailed nonperiodic loudness variations. Variations are segmented into low- and high-frequency AM with Gaussian form.
6. AM DEMODULATION: Power variations are removed from the original voice to achieve accurate noise estimation (step 7) (Inv.).
7. ASPIRATION NOISE: Frequency-domain methods are used to separate the aspiration noise component. The noise is spectrally modeled [Guus de Krom].
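Step 1 of the analysis models the vocal tract as cascaded 2nd-order digital resonators. The sketch below illustrates one common way to build such a cascade, assuming the Klatt-style resonator difference equation y[n] = A x[n] + B y[n-1] + C y[n-2]; the formant frequencies and bandwidths in `FORMANTS` are hypothetical values chosen for illustration, and the presentation does not state which resonator form it uses.

```python
import numpy as np

def resonator_coeffs(freq_hz, bw_hz, fs):
    """Coefficients of a 2nd-order digital resonator with unit gain at DC."""
    period = 1.0 / fs
    c = -np.exp(-2.0 * np.pi * bw_hz * period)
    b = 2.0 * np.exp(-np.pi * bw_hz * period) * np.cos(2.0 * np.pi * freq_hz * period)
    a = 1.0 - b - c
    return a, b, c

def resonate(x, coeffs):
    """Run y[n] = a*x[n] + b*y[n-1] + c*y[n-2] over the input."""
    a, b, c = coeffs
    y = np.zeros(len(x))
    y1 = y2 = 0.0
    for n, xn in enumerate(x):
        y[n] = a * xn + b * y1 + c * y2
        y2, y1 = y1, y[n]
    return y

def cascade(x, formants, fs):
    """Pass x through one resonator per (frequency, bandwidth) pair, in series."""
    y = np.asarray(x, dtype=float)
    for freq_hz, bw_hz in formants:
        y = resonate(y, resonator_coeffs(freq_hz, bw_hz, fs))
    return y

FS = 10000                                                    # assumed sample rate (Hz)
FORMANTS = [(700.0, 80.0), (1200.0, 90.0), (2500.0, 120.0)]   # hypothetical values

impulse = np.zeros(FS)          # one second, so FFT bins are 1 Hz apart
impulse[0] = 1.0
h = cascade(impulse, FORMANTS, FS)
spectrum = np.abs(np.fft.rfft(h))
freqs = np.fft.rfftfreq(len(h), 1.0 / FS)

band = (freqs > 500.0) & (freqs < 900.0)
f1_peak = float(freqs[band][np.argmax(spectrum[band])])
print(f1_peak)  # spectral peak close to the first formant frequency
```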

9 ANALYSIS: ANALYSIS BY SYNTHESIS
[Diagram blocks: collection of voice samples; voice analysis; parameters (pitch, NSR, formants, ...); voice synthesis; perceptual comparisons and validation]

10 ANALYSIS: ANALYSIS/SYNTHESIS MODEL OVERVIEW
[Diagram labels: source waveform (periodic); FM modulation, AM modulation, and spectrally shaped Gaussian noise (nonperiodic); summing nodes; vocal tract; output voice; SYNTHESIS and ANALYSIS direction labels]

11 ANALYSIS: OVERVIEW OF PROJECT OPERATIONS
[Diagram blocks: mic signal; LPC formant analysis / manual ops (formants); pitch tracker (pitch time history, jitter estimate, tremor time history); power tracker (power time history, shimmer estimate, volume time history); inverse filtering (raw flow derivative); resampling to remove tremor (constant-pitch voice, constant-power voice); least-squares LF fit (fitted LF source pulse); cepstral noise analysis (source noise spectrum, NSR estimate); synthesizer]

12 PERIODIC ANALYSIS: LINEAR PREDICTION
Estimates the vocal tract as an all-pole filter by minimizing the error between actual and model-predicted signals.
Unknown system (source u(n) driving the vocal tract to give s(n)): $H(z) = \dfrac{G}{1 - \sum_{k=1}^{p} a_k z^{-k}}$
Predictor (model): $\hat{s}(n) = \sum_{k=1}^{p} \alpha_k\, s(n-k)$
Error: $e(n) = s(n) - \hat{s}(n)$
Estimated inverse system: $A(z) = 1 - \sum_{k=1}^{p} \alpha_k z^{-k} \approx G / H(z)$
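As a minimal sketch of the error minimization described on this slide, the covariance form of linear prediction can be written as an ordinary least-squares problem. The function name `lpc_covariance` and the test resonance (700 Hz, pole radius 0.97, 10 kHz sampling) are assumptions made for illustration; the signal is an idealized impulse response of a known all-pole system, so the predictor should recover the true coefficients almost exactly.

```python
import numpy as np

def lpc_covariance(s, p):
    """Least-squares predictor: s(n) ~ sum_k alpha_k * s(n-k), k = 1..p."""
    s = np.asarray(s, dtype=float)
    n = len(s)
    # Column k-1 holds s(n-k) for n = p .. N-1.
    past = np.column_stack([s[p - k:n - k] for k in range(1, p + 1)])
    target = s[p:]
    alpha, *_ = np.linalg.lstsq(past, target, rcond=None)
    error = target - past @ alpha       # e(n): the inverse-filtered output
    return alpha, error

# Known all-pole system with one resonance (hypothetical values).
r, theta = 0.97, 2.0 * np.pi * 700.0 / 10000.0
a_true = np.array([2.0 * r * np.cos(theta), -r * r])

# Impulse response: s(n) = u(n) + a1*s(n-1) + a2*s(n-2), u = unit impulse.
s = np.zeros(200)
for i in range(len(s)):
    s[i] = 1.0 if i == 0 else 0.0
    if i >= 1:
        s[i] += a_true[0] * s[i - 1]
    if i >= 2:
        s[i] += a_true[1] * s[i - 2]

alpha, error = lpc_covariance(s, 2)
print(alpha, a_true)
```

With real voice the excitation is not a single impulse, so the fit is only approximate and depends on where the analysis window falls within the glottal cycle.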

13 PERIODIC ANALYSIS: IDEALIZED LP RESULT
Requires a priori knowledge of the system. More difficult for pathological vowels.
[Figure: LPC covariance roots in the z-plane (real part versus imaginary part); O = true, X = LPC; pole reflected inside unit circle]
[Figure: LPC covariance predictor; line = actual output, dots = predicted]
[Figure: error signal of covariance LPA = inverse-filtered output]

14 PERIODIC ANALYSIS: SOURCE-FILTER AMBIGUITY
Source and filter are mixed in the final voice, so a unique LP solution may be difficult to obtain.
[Figure: three cases, each showing source time series, vocal tract transfer function, voice time series, and voice spectrum]
- Normal source and normal vocal tract
- Case 1: breathy source and normal vocal tract
- Case 2: normal source and abnormal vocal tract

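15 PERIODIC ANALYSIS: FITTING RAW SOURCE TO LF
Once the inverse-filtered source has been established, it is fit to the LF model [Qi].
[Figure: flow U(t) and flow derivative E(t) versus t (seconds), with $U(t) = \int E(t)\,dt$; marked points (tpk, epk), tp, (t2, ee/2), (te, ee), and tc; E1 = first segment, E2 = second segment]
[Equations: piecewise definition of E(t). The first segment E1(t), for t up to te, is built from the terms $e^{\alpha t}$ and $\sin \omega_g t$; the second segment E2(t), for te ≤ t ≤ tc, contains an exponential term in $\varepsilon (t - t_e)$ and a linear term $m(t - t_e)$.]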

16 NON-PERIODIC ANALYSIS: HIGH RESOLUTION PITCH TRACKING
Nonperiodic analysis begins with interpolating pitch tracking.
[Figure: voice signal with measured periods versus time (sec)]
[Figure: pitch track, frequency (Hz) versus time (sec)]
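The slide does not say how the interpolation is done. One widely used way to get sub-sample period resolution, shown here only as a sketch of the general idea, is to locate the autocorrelation peak and refine it with parabolic interpolation. The function `estimate_f0`, its search limits, and the test signal are all assumptions for illustration and are not taken from the presentation.

```python
import numpy as np

def estimate_f0(frame, fs, fmin=80.0, fmax=400.0):
    """Estimate pitch from the autocorrelation peak, refined to sub-sample lag."""
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()
    n = len(frame)
    # Autocorrelation at lags 0..n-1, normalized by the number of overlapping samples.
    ac = np.correlate(frame, frame, mode="full")[n - 1:]
    ac = ac / (n - np.arange(n))
    lo, hi = int(fs / fmax), int(fs / fmin)
    k = lo + int(np.argmax(ac[lo:hi + 1]))
    # Parabola through the peak and its two neighbours gives a fractional lag offset.
    y0, y1, y2 = ac[k - 1], ac[k], ac[k + 1]
    delta = 0.5 * (y0 - y2) / (y0 - 2.0 * y1 + y2)
    return fs / (k + delta)

FS = 10000                     # assumed sample rate (Hz)
TRUE_F0 = 123.4                # hypothetical pitch; period is not a whole number of samples
t = np.arange(2000) / FS
frame = (np.sin(2 * np.pi * TRUE_F0 * t)
         + 0.5 * np.sin(2 * np.pi * 2 * TRUE_F0 * t)
         + 0.25 * np.sin(2 * np.pi * 3 * TRUE_F0 * t))

f0_est = float(estimate_f0(frame, FS))
print(f0_est)
```

A per-cycle tracker, as implied by the "measured period" figure, would apply a similar refinement to individual cycle boundaries rather than to a whole frame.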

17 NON-PERIODIC ANALYSIS: FM DEVIATION SEGREGATED TO LOW AND HIGH FREQUENCY
The pitch track is low-/high-pass filtered to yield tremor and HFPV (High Frequency Pitch Variation).
[Figures: two plots of frequency (Hz) versus time (sec)]

18 NON-PERIODIC ANALYSIS: GAUSSIAN HFPV
Successful pitch tracking yields a Gaussian distribution in HFPV. The standard deviation is a convenient measure of HFPV.
[Figure: histogram of number of occurrences versus delta frequency (%); annotations TOTSKIP, FCUT, TOTMAN]
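The split of a pitch track into tremor and HFPV, and the use of the standard deviation as an HFPV measure, can be sketched as follows. This is an illustration under stated assumptions: the low-pass filter is a simple moving average, the pitch track is synthetic (150 Hz mean, 5 Hz tremor, Gaussian jitter of 1% of the mean), and the 9-point window is an arbitrary choice; the presentation does not specify its filter or cutoff.

```python
import numpy as np

def moving_average(x, width):
    """Zero-phase moving-average low-pass filter; width must be odd."""
    padded = np.pad(x, width // 2, mode="edge")
    return np.convolve(padded, np.ones(width) / width, mode="valid")

# Synthetic pitch track, one value per cycle (all values hypothetical).
rng = np.random.default_rng(1)
rate = 150.0                                   # pitch-track samples per second
t = np.arange(600) / rate
track = (150.0
         + 2.0 * np.sin(2.0 * np.pi * 5.0 * t)  # slow tremor component
         + rng.normal(0.0, 1.5, t.size))        # Gaussian cycle-to-cycle variation

tremor = moving_average(track, 9)               # low-pass part: tremor
hfpv_percent = 100.0 * (track - tremor) / tremor  # high-pass part, in percent
hfpv_sd = float(np.std(hfpv_percent))
print(hfpv_sd)  # close to the 1% variation that was put in
```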

19 NON-PERIODIC ANALYSIS: FM DEMODULATION
The pitch track may be used to demodulate the original voice to obtain a version with almost no pitch variation; re-tracking verifies constant pitch (< 0.1%).
[Figure: delta frequency (percent) versus time (sec)]
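One way to remove pitch variation with a known pitch track is to resample the signal on a warped time axis along which the accumulated number of pitch cycles grows uniformly. The sketch below assumes linear interpolation and a synthetic vibrato tone; `fm_demodulate` is a hypothetical helper, not the author's implementation.

```python
import numpy as np

def fm_demodulate(x, f_inst, fs):
    """Resample x so that pitch cycles advance uniformly (requires f_inst > 0)."""
    n = len(x)
    t = np.arange(n) / fs
    cycles = np.cumsum(f_inst) / fs                  # accumulated pitch cycles per sample
    uniform_cycles = np.linspace(cycles[0], cycles[-1], n)
    warped_t = np.interp(uniform_cycles, cycles, t)  # times where cycles are uniform
    y = np.interp(warped_t, t, x)                    # signal read at those times
    return y, uniform_cycles

FS = 8000                                            # assumed sample rate (Hz)
t = np.arange(FS) / FS
f_inst = 120.0 * (1.0 + 0.03 * np.sin(2.0 * np.pi * 5.0 * t))  # 3% vibrato at 5 Hz
voice = np.sin(2.0 * np.pi * np.cumsum(f_inst) / FS)

flat, uniform_cycles = fm_demodulate(voice, f_inst, FS)

# After demodulation the signal should match a tone whose phase grows linearly.
reference = np.sin(2.0 * np.pi * uniform_cycles)
max_err = float(np.max(np.abs(flat - reference)))
print(max_err)
```

Linear interpolation keeps the sketch short; a band-limited interpolator would introduce less distortion on real voice.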

20 NON-PERIODIC ANALYSIS: POWER TRACKING
Analogously to pitch tracking, voice power is tracked.
[Figure: smoothed absolute value of original voice with pitch-track features versus sample number; envelope minima marked]
[Figure: power track, power versus sample number]

21 NON-PERIODIC ANALYSIS: POWER SEGREGATED TO LOW AND HIGH FREQUENCY
The power track is low-/high-pass filtered to yield low-frequency power variations and high-frequency shimmer.
[Figure: original power and tremor (line), power versus time (sec)]
[Figure: SHIM% = 100*(POWER - LOWPASS POWER)/POWER, delta power (percent) versus time (sec)]

22 NON-PERIODIC ANALYSIS: GAUSSIAN POWER VARIATIONS
Shimmer also displays Gaussian variations.
[Figure: SHIM% histogram, number of occurrences versus delta power (%); standard deviation = 4.823%]

23 NON-PERIODIC ANALYSIS: AM DEMODULATION
The power track may be used to demodulate the original voice to obtain a version with almost no variation in strength; re-tracking verifies constant power.
[Figure: various measures of signal strength (power, sum energy, max amplitude) versus sample number]
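A minimal sketch of AM demodulation with a power track is given below, assuming the power track is a moving average of the squared signal over roughly two pitch periods and that demodulation divides out the square root of the normalized power. The helper names, the window length, and the synthetic test tone are assumptions for illustration only.

```python
import numpy as np

def power_track(x, width):
    """Short-time power: moving average of x**2 over an odd-length window."""
    padded = np.pad(x ** 2, width // 2, mode="reflect")
    return np.convolve(padded, np.ones(width) / width, mode="valid")

def am_demodulate(x, width):
    """Scale x sample by sample so that its short-time power becomes constant."""
    p = power_track(x, width)
    return x * np.sqrt(p.mean() / p)

def power_variation(x, width):
    """Relative spread (std/mean) of the power track, ignoring the edges."""
    p = power_track(x, width)[width:-width]
    return float(np.std(p) / np.mean(p))

FS = 8000                      # assumed sample rate (Hz)
WIDTH = 107                    # about two periods of a 150 Hz voice (hypothetical)
t = np.arange(2 * FS) / FS
voice = (1.0 + 0.3 * np.sin(2.0 * np.pi * 4.0 * t)) * np.sin(2.0 * np.pi * 150.0 * t)

before = power_variation(voice, WIDTH)
after = power_variation(am_demodulate(voice, WIDTH), WIDTH)
print(before, after)  # re-tracking shows a much flatter power track
```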

24 NON-PERIODIC ANALYSIS: ASPIRATION NOISE ANALYSIS
Aspiration noise is segregated via spectral techniques. Peaks in the FFT of the log of the FFT (cepstrum) represent periodic energy, and are filtered out with a comb filter (lifter). Results are used to calculate the noise-to-signal ratio (NSR).
[Figure panels; axes are log10 magnitude versus frequency (Hz) or time (sec)]
Figure 2.3a. PSD of original voice.
Figure 2.3b. Cepstrum of original voice.
Figure 2.3c. Cepstrum (expanded scale).
Figure 2.3d. Comb-liftered cepstrum of 14c.
Figure 2.3e. Orig PSD, aspiration PSD, and vocal tract.
Figure 2.3f. Source aspiration PSD with vocal tract removed and fitted to 25-point piecewise-linear model.
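The comb-liftering idea can be sketched in a few lines. This is a deliberately simplified illustration, not the procedure used in the presentation: it assumes the pitch is known, zeroes a few cepstral bins around each pitch-period multiple, and omits refinements such as baseline correction of the noise floor, so the absolute NSR values are only indicative. The function `cepstral_nsr_db` and the test signals are hypothetical.

```python
import numpy as np

def cepstral_nsr_db(x, fs, f0, half_width=3):
    """Rough noise-to-signal ratio (dB) from a comb-liftered cepstrum."""
    n = len(x)
    spec = np.fft.fft(x * np.hanning(n))
    log_mag = np.log(np.abs(spec) + 1e-12)
    ceps = np.fft.ifft(log_mag).real             # real cepstrum
    period = fs / f0                             # pitch period in samples
    liftered = ceps.copy()
    k = 1
    while k * period < n / 2:                    # zero each pitch-period multiple
        c = int(round(k * period))
        lo, hi = max(c - half_width, 1), min(c + half_width, n // 2)
        liftered[lo:hi + 1] = 0.0
        liftered[n - hi:n - lo + 1] = 0.0        # mirrored (negative-quefrency) bins
        k += 1
    noise_log = np.fft.fft(liftered).real        # log spectrum without harmonic ripple
    noise_pow = np.exp(2.0 * noise_log)
    total_pow = np.abs(spec) ** 2
    half = slice(1, n // 2)
    return float(10.0 * np.log10(noise_pow[half].sum() / total_pow[half].sum()))

FS, F0, N = 8000, 100.0, 4096                    # hypothetical test conditions
t = np.arange(N) / FS
harmonics = sum(np.sin(2.0 * np.pi * h * F0 * t) / h for h in range(1, 21))
rng = np.random.default_rng(0)
noise = rng.standard_normal(N)

nsr_clean = cepstral_nsr_db(harmonics + 0.01 * noise, FS, F0)
nsr_noisy = cepstral_nsr_db(harmonics + 0.3 * noise, FS, F0)
print(nsr_clean, nsr_noisy)  # the noisier signal gives the higher NSR estimate
```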

25 NON-PERIODIC ANALYSIS: FM DEMODULATION IMPROVES NSR ACCURACY
FM demodulation sharpens the spectral peaks of periodic components, allowing longer FFT windows and more accurate NSR determination.
[Figure: log10 magnitude versus frequency (Hz)]

26 NON-PERIODIC ANALYSIS: CHANGES IN NSR AFTER FM AND AM DEMODULATION
FM demodulation reduces NSR measures by up to 2 dB, yielding results closer to perceived levels. AM demodulation has much less effect.
[Figure: NSR (dB) by case number, sorted by ascending original NSR; series: original, tremor removed, all FM removed]

27 PERIODIC ANALYSIS: EXTERNAL SOURCE (ES) ANALYSIS
Source-filter ambiguity may be resolved by augmenting the glottal source with a known external stimulus [Epps].
[Diagram: PC #1 (stimulus generation): stimulus signal -> D/A -> amplifier -> transducer -> acoustic conduit -> vocal tract (/a/). PC #2 (DAQ, memory): microphone -> signal conditioner -> A/D, on channels 1 and 2.]
Tube of length L with closed end: formants at F = C/4L, 3C/4L, 5C/4L, ...
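The tube formula can be evaluated directly. In this short sketch C is taken to be the speed of sound, and the tube length of 0.17 m and speed of 340 m/s are illustrative values rather than the dimensions of the tube used in the experiment.

```python
def closed_tube_formants(length_m, count, speed_of_sound=343.0):
    """Resonances of a tube closed at one end: F_k = (2k - 1) * C / (4 * L)."""
    return [(2 * k - 1) * speed_of_sound / (4.0 * length_m)
            for k in range(1, count + 1)]

# Hypothetical example: a 17 cm tube with C = 340 m/s.
formants = closed_tube_formants(0.17, 3, speed_of_sound=340.0)
print(formants)  # approximately 500, 1500, and 2500 Hz
```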

28 PERIODIC ANALYSIS: ES VERIFICATION
A simple plastic tube model verified the ES experimental setup. Resonances occur at the expected frequencies.
[Figure: spectrum with rag, without rag, and magnitude of transfer function (yellow); magnitude (dB) versus frequency (Hz); curves HA(f), HB(f), V(f); tube formants marked]

29 PERIODIC ANALYSIS: ES: NORMAL /a/
LP and FFT analysis show results consistent with ES analysis for a normal vowel.
[Figure: spectra of voice LPA (dashed), voice FFT, and external-source transfer function; F1 to F4 marked]

30 PERIODIC ANALYSIS: ES: SIMULATED BREATHY /a/
LP and FFT analysis show poor resolution of F3 and F4 for a breathy /a/, while the ES resolution of F3 and F4 remains good.
[Figure: voice LPA and FFT (bottom) and external-source transfer function (top); magnitude (dB) versus frequency (Hz); F1 to F4 marked]

31 SYNTHESIS: SYNTHESIS OF PATHOLOGICAL VOWELS
Synthesis is a critical step in the study of pathological vowels. It provides evidence of the success of analysis and modeling steps via immediate comparisons of original and synthetic voice. Two synthesizers were implemented:
1. A realtime hardware-based synthesizer capable of providing instant response to changes in model parameters.
2. A software synthesizer implemented in MATLAB with extended features, a convenient graphical interface, and ease of modification.

32 SYNTHESIS: REALTIME SYNTHESIZER
- Implemented in native x86 assembly language
- Executes all code within 1us cycle
- Overrides the PC OS to achieve determinism
- Employs dedicated clock, I/O, and control hardware implemented on a wire-wrap PCB adapter card
[Diagram: current realtime synthesizer functional overview. Blocks: source selection (IMPL, KGLOT 88, LF1, LF2, GAH, ARB); source controls; arbitrary source file; jitter; shimmer; diplophonia; aspiration noise; 2 poles ... 4 zeroes; AGC; parameter time variation; store/recall of control parameters; CLK; x86 interrupt; D/A; low-pass filter; amp; speaker; record WAV; output signal file]

33 SYNTHESIS: SYNTHESIS VALIDATION
The current model, analysis tools, and synthesizers yield a high level of fidelity in the generation of synthetic pathological vowels. The system is currently employed at the UCLA Voicelab for NIH-funded perceptual studies. To objectively validate the analysis/synthesis process, the loop is closed by re-analyzing the synthetic time series to confirm parameter values. Re-analysis also provides an opportunity to observe interactions of nonperiodic components.

34 SYNTHESIS: SYNTHESIS VALIDATION
Re-measured synthetic aspiration noise level agrees with the level set in the synthesizer.
[Figure: measured NSR in synthetic voice (dB) versus A.N. set NSR in synthesizer; NPB21]

35 SYNTHESIS: SYNTHESIS VALIDATION
Aspiration noise adds about 0.2% to HFPV measurements.
[Figure: jitter measured in synthetic voice (%) versus jitter level set in synthesizer (%)]
HFPV adds about 4 dB to NSR measurements.
[Figure: measured NSR in synthetic voice (dB) versus A.N. set NSR in synthesizer (dB)]

36 SYNTHESIS: SYNTHESIS VALIDATION
Subjective analysis-by-synthesis experiments demonstrate the success of AM and FM demodulation in achieving accurate modeling of nonperiodic features. Listeners adjust synthetic aspiration noise to match the original. The match improves with demodulation.
[Figure: original voice (no demodulation): SABS aspiration noise (dB) versus cepstral NSR (dB); Pearson = .51]

37 SYNTHESIS: SYNTHESIS VALIDATION
[Figure: tremor removed: SABS aspiration noise (dB) versus cepstral NSR (dB), with Pearson coefficient]
[Figure: all AM and FM removed: SABS aspiration noise (dB) versus cepstral NSR (dB), with Pearson coefficient]

38 SUMMARY: CONCLUSION
This study has achieved improved automatic, objective analysis and synthesis of speech within the specialization of pathological vowels. Specific accomplishments include:
- A unique, symmetric model for nonperiodic components as AM, FM, and spectrally shaped aspiration noise
- Improved accuracy of noise analysis via AM and FM demodulation
- Application of ES formant identification for pathological vowels
- Implementation of realtime and offline specialized high-fidelity vowel synthesizers


More information

Non-stationary Analysis/Synthesis using Spectrum Peak Shape Distortion, Phase and Reassignment

Non-stationary Analysis/Synthesis using Spectrum Peak Shape Distortion, Phase and Reassignment Non-stationary Analysis/Synthesis using Spectrum Peak Shape Distortion, Phase Reassignment Geoffroy Peeters, Xavier Rodet Ircam - Centre Georges-Pompidou, Analysis/Synthesis Team, 1, pl. Igor Stravinsky,

More information

Chapter 2. Fourier Series & Fourier Transform. Updated:2/11/15

Chapter 2. Fourier Series & Fourier Transform. Updated:2/11/15 Chapter 2 Fourier Series & Fourier Transform Updated:2/11/15 Outline Systems and frequency domain representation Fourier Series and different representation of FS Fourier Transform and Spectra Power Spectral

More information

Epoch Extraction From Speech Signals K. Sri Rama Murty and B. Yegnanarayana, Senior Member, IEEE

Epoch Extraction From Speech Signals K. Sri Rama Murty and B. Yegnanarayana, Senior Member, IEEE 1602 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 16, NO. 8, NOVEMBER 2008 Epoch Extraction From Speech Signals K. Sri Rama Murty and B. Yegnanarayana, Senior Member, IEEE Abstract

More information

Accurate Delay Measurement of Coded Speech Signals with Subsample Resolution

Accurate Delay Measurement of Coded Speech Signals with Subsample Resolution PAGE 433 Accurate Delay Measurement of Coded Speech Signals with Subsample Resolution Wenliang Lu, D. Sen, and Shuai Wang School of Electrical Engineering & Telecommunications University of New South Wales,

More information

Comparison of CELP speech coder with a wavelet method

Comparison of CELP speech coder with a wavelet method University of Kentucky UKnowledge University of Kentucky Master's Theses Graduate School 2006 Comparison of CELP speech coder with a wavelet method Sriram Nagaswamy University of Kentucky, sriramn@gmail.com

More information

Pulsed S-Parameter Measurements using the ZVA network Analyzer

Pulsed S-Parameter Measurements using the ZVA network Analyzer Pulsed S-Parameter Measurements using the ZVA network Analyzer 1 Pulse Profile measurements ZVA Advanced Network Analyser 3 Motivation for Pulsed Measurements Typical Applications Avoid destruction of

More information

Analysis/synthesis coding

Analysis/synthesis coding TSBK06 speech coding p.1/32 Analysis/synthesis coding Many speech coders are based on a principle called analysis/synthesis coding. Instead of coding a waveform, as is normally done in general audio coders

More information

Lecture Schedule: Week Date Lecture Title

Lecture Schedule: Week Date Lecture Title http://elec3004.org Sampling & More 2014 School of Information Technology and Electrical Engineering at The University of Queensland Lecture Schedule: Week Date Lecture Title 1 2-Mar Introduction 3-Mar

More information

Sound Synthesis Methods

Sound Synthesis Methods Sound Synthesis Methods Matti Vihola, mvihola@cs.tut.fi 23rd August 2001 1 Objectives The objective of sound synthesis is to create sounds that are Musically interesting Preferably realistic (sounds like

More information

Cepstrum alanysis of speech signals

Cepstrum alanysis of speech signals Cepstrum alanysis of speech signals ELEC-E5520 Speech and language processing methods Spring 2016 Mikko Kurimo 1 /48 Contents Literature and other material Idea and history of cepstrum Cepstrum and LP

More information

Vocoder (LPC) Analysis by Variation of Input Parameters and Signals

Vocoder (LPC) Analysis by Variation of Input Parameters and Signals ISCA Journal of Engineering Sciences ISCA J. Engineering Sci. Vocoder (LPC) Analysis by Variation of Input Parameters and Signals Abstract Gupta Rajani, Mehta Alok K. and Tiwari Vebhav Truba College of

More information

ASPIRATION NOISE DURING PHONATION: SYNTHESIS, ANALYSIS, AND PITCH-SCALE MODIFICATION DARYUSH MEHTA

ASPIRATION NOISE DURING PHONATION: SYNTHESIS, ANALYSIS, AND PITCH-SCALE MODIFICATION DARYUSH MEHTA ASPIRATION NOISE DURING PHONATION: SYNTHESIS, ANALYSIS, AND PITCH-SCALE MODIFICATION by DARYUSH MEHTA B.S., Electrical Engineering (23) University of Florida SUBMITTED TO THE DEPARTMENT OF ELECTRICAL ENGINEERING

More information

Performance Analysis of MFCC and LPCC Techniques in Automatic Speech Recognition

Performance Analysis of MFCC and LPCC Techniques in Automatic Speech Recognition www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume - 3 Issue - 8 August, 2014 Page No. 7727-7732 Performance Analysis of MFCC and LPCC Techniques in Automatic

More information

Adaptive Filters Application of Linear Prediction

Adaptive Filters Application of Linear Prediction Adaptive Filters Application of Linear Prediction Gerhard Schmidt Christian-Albrechts-Universität zu Kiel Faculty of Engineering Electrical Engineering and Information Technology Digital Signal Processing

More information

Pitch-Scaled Estimation of Simultaneous Voiced and Turbulence-Noise Components in Speech

Pitch-Scaled Estimation of Simultaneous Voiced and Turbulence-Noise Components in Speech IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 9, NO. 7, OCTOBER 2001 713 Pitch-Scaled Estimation of Simultaneous Voiced and Turbulence-Noise Components in Speech Philip J. B. Jackson, Member,

More information

MAKE SOMETHING THAT TALKS?

MAKE SOMETHING THAT TALKS? MAKE SOMETHING THAT TALKS? Modeling the Human Vocal Tract pitch, timing, and formant control signals pitch, timing, and formant control signals lips, teeth, and tongue formant cavity 2 formant cavity 1

More information

Performance study of Text-independent Speaker identification system using MFCC & IMFCC for Telephone and Microphone Speeches

Performance study of Text-independent Speaker identification system using MFCC & IMFCC for Telephone and Microphone Speeches Performance study of Text-independent Speaker identification system using & I for Telephone and Microphone Speeches Ruchi Chaudhary, National Technical Research Organization Abstract: A state-of-the-art

More information

CHAPTER 3. ACOUSTIC MEASURES OF GLOTTAL CHARACTERISTICS 39 and from periodic glottal sources (Shadle, 1985; Stevens, 1993). The ratio of the amplitude of the harmonics at 3 khz to the noise amplitude in

More information

X. SPEECH ANALYSIS. Prof. M. Halle G. W. Hughes H. J. Jacobsen A. I. Engel F. Poza A. VOWEL IDENTIFIER

X. SPEECH ANALYSIS. Prof. M. Halle G. W. Hughes H. J. Jacobsen A. I. Engel F. Poza A. VOWEL IDENTIFIER X. SPEECH ANALYSIS Prof. M. Halle G. W. Hughes H. J. Jacobsen A. I. Engel F. Poza A. VOWEL IDENTIFIER Most vowel identifiers constructed in the past were designed on the principle of "pattern matching";

More information

Final Exam Study Guide: Introduction to Computer Music Course Staff April 24, 2015

Final Exam Study Guide: Introduction to Computer Music Course Staff April 24, 2015 Final Exam Study Guide: 15-322 Introduction to Computer Music Course Staff April 24, 2015 This document is intended to help you identify and master the main concepts of 15-322, which is also what we intend

More information

EE 225D LECTURE ON MEDIUM AND HIGH RATE CODING. University of California Berkeley

EE 225D LECTURE ON MEDIUM AND HIGH RATE CODING. University of California Berkeley University of California Berkeley College of Engineering Department of Electrical Engineering and Computer Sciences Professors : N.Morgan / B.Gold EE225D Spring,1999 Medium & High Rate Coding Lecture 26

More information

EC 6501 DIGITAL COMMUNICATION UNIT - II PART A

EC 6501 DIGITAL COMMUNICATION UNIT - II PART A EC 6501 DIGITAL COMMUNICATION 1.What is the need of prediction filtering? UNIT - II PART A [N/D-16] Prediction filtering is used mostly in audio signal processing and speech processing for representing

More information

Sound synthesis with Pure Data

Sound synthesis with Pure Data Sound synthesis with Pure Data 1. Start Pure Data from the programs menu in classroom TC307. You should get the following window: The DSP check box switches sound output on and off. Getting sound out First,

More information

Poles and Zeros of H(s), Analog Computers and Active Filters

Poles and Zeros of H(s), Analog Computers and Active Filters Poles and Zeros of H(s), Analog Computers and Active Filters Physics116A, Draft10/28/09 D. Pellett LRC Filter Poles and Zeros Pole structure same for all three functions (two poles) HR has two poles and

More information

Slovak University of Technology and Planned Research in Voice De-Identification. Anna Pribilova

Slovak University of Technology and Planned Research in Voice De-Identification. Anna Pribilova Slovak University of Technology and Planned Research in Voice De-Identification Anna Pribilova SLOVAK UNIVERSITY OF TECHNOLOGY IN BRATISLAVA the oldest and the largest university of technology in Slovakia

More information

Signal Characterization in terms of Sinusoidal and Non-Sinusoidal Components

Signal Characterization in terms of Sinusoidal and Non-Sinusoidal Components Signal Characterization in terms of Sinusoidal and Non-Sinusoidal Components Geoffroy Peeters, avier Rodet To cite this version: Geoffroy Peeters, avier Rodet. Signal Characterization in terms of Sinusoidal

More information

6.976 High Speed Communication Circuits and Systems Lecture 17 Advanced Frequency Synthesizers

6.976 High Speed Communication Circuits and Systems Lecture 17 Advanced Frequency Synthesizers 6.976 High Speed Communication Circuits and Systems Lecture 17 Advanced Frequency Synthesizers Michael Perrott Massachusetts Institute of Technology Copyright 2003 by Michael H. Perrott Bandwidth Constraints

More information

Audio Signal Compression using DCT and LPC Techniques

Audio Signal Compression using DCT and LPC Techniques Audio Signal Compression using DCT and LPC Techniques P. Sandhya Rani#1, D.Nanaji#2, V.Ramesh#3,K.V.S. Kiran#4 #Student, Department of ECE, Lendi Institute Of Engineering And Technology, Vizianagaram,

More information

A perceptually and physiologically motivated voice source model

A perceptually and physiologically motivated voice source model INTERSPEECH 23 A perceptually and physiologically motivated voice source model Gang Chen, Marc Garellek 2,3, Jody Kreiman 3, Bruce R. Gerratt 3, Abeer Alwan Department of Electrical Engineering, University

More information

Laboratory Assignment 2 Signal Sampling, Manipulation, and Playback

Laboratory Assignment 2 Signal Sampling, Manipulation, and Playback Laboratory Assignment 2 Signal Sampling, Manipulation, and Playback PURPOSE This lab will introduce you to the laboratory equipment and the software that allows you to link your computer to the hardware.

More information

Envelope Modulation Spectrum (EMS)

Envelope Modulation Spectrum (EMS) Envelope Modulation Spectrum (EMS) The Envelope Modulation Spectrum (EMS) is a representation of the slow amplitude modulations in a signal and the distribution of energy in the amplitude fluctuations

More information

Learning to Unlearn and Relearn Speech Signal Processing using Neural Networks: current and future perspectives

Learning to Unlearn and Relearn Speech Signal Processing using Neural Networks: current and future perspectives Learning to Unlearn and Relearn Speech Signal Processing using Neural Networks: current and future perspectives Mathew Magimai Doss Collaborators: Vinayak Abrol, Selen Hande Kabil, Hannah Muckenhirn, Dimitri

More information

INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET)

INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET) INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET) Proceedings of the 2 nd International Conference on Current Trends in Engineering and Management ICCTEM -214 ISSN

More information

Fundamental Frequency Detection

Fundamental Frequency Detection Fundamental Frequency Detection Jan Černocký, Valentina Hubeika {cernocky ihubeika}@fit.vutbr.cz DCGM FIT BUT Brno Fundamental Frequency Detection Jan Černocký, Valentina Hubeika, DCGM FIT BUT Brno 1/37

More information

3D Distortion Measurement (DIS)

3D Distortion Measurement (DIS) 3D Distortion Measurement (DIS) Module of the R&D SYSTEM S4 FEATURES Voltage and frequency sweep Steady-state measurement Single-tone or two-tone excitation signal DC-component, magnitude and phase of

More information

Hungarian Speech Synthesis Using a Phase Exact HNM Approach

Hungarian Speech Synthesis Using a Phase Exact HNM Approach Hungarian Speech Synthesis Using a Phase Exact HNM Approach Kornél Kovács 1, András Kocsor 2, and László Tóth 3 Research Group on Artificial Intelligence of the Hungarian Academy of Sciences and University

More information

VIBRATO DETECTING ALGORITHM IN REAL TIME. Minhao Zhang, Xinzhao Liu. University of Rochester Department of Electrical and Computer Engineering

VIBRATO DETECTING ALGORITHM IN REAL TIME. Minhao Zhang, Xinzhao Liu. University of Rochester Department of Electrical and Computer Engineering VIBRATO DETECTING ALGORITHM IN REAL TIME Minhao Zhang, Xinzhao Liu University of Rochester Department of Electrical and Computer Engineering ABSTRACT Vibrato is a fundamental expressive attribute in music,

More information

SGN Audio and Speech Processing

SGN Audio and Speech Processing SGN 14006 Audio and Speech Processing Introduction 1 Course goals Introduction 2! Learn basics of audio signal processing Basic operations and their underlying ideas and principles Give basic skills although

More information

Speech Compression Using Voice Excited Linear Predictive Coding

Speech Compression Using Voice Excited Linear Predictive Coding Speech Compression Using Voice Excited Linear Predictive Coding Ms.Tosha Sen, Ms.Kruti Jay Pancholi PG Student, Asst. Professor, L J I E T, Ahmedabad Abstract : The aim of the thesis is design good quality

More information

The source-filter model of speech production"

The source-filter model of speech production 24.915/24.963! Linguistic Phonetics! The source-filter model of speech production" Glottal airflow Output from lips 400 200 0.1 0.2 0.3 Time (in secs) 30 20 10 0 0 1000 2000 3000 Frequency (Hz) Source

More information

Subtractive Synthesis & Formant Synthesis

Subtractive Synthesis & Formant Synthesis Subtractive Synthesis & Formant Synthesis Prof Eduardo R Miranda Varèse-Gastprofessor eduardo.miranda@btinternet.com Electronic Music Studio TU Berlin Institute of Communications Research http://www.kgw.tu-berlin.de/

More information