Location of sound source and transfer functions


1 Location of sound source and transfer functions. Sounds produced with the source at the larynx, either voiced or voiceless (aspiration): the sound is filtered by the entire vocal tract. The transfer function is well modeled by multiple resonances in cascade: transfer function = product of the transfer functions of the individual resonances. The number of resonances (below the Nyquist frequency) is roughly the same regardless of tube shape; it is determined by the length of the tube.
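
As a rough illustration of the last point, the sketch below counts the resonances below the Nyquist frequency for a uniform tube closed at one end, using the quarter-wavelength formula F_n = (2n-1)c/(4L). The tube length, speed of sound, and sampling rate are assumed values chosen for illustration; they are not taken from the slides.

```matlab
% Sketch: resonances of a uniform tube closed at the glottis, open at the lips.
% All numeric values here are illustrative assumptions.
c = 35000;                        % speed of sound in cm/s (assumed)
L = 17.5;                         % tube length in cm (assumed)
srate = 10000;                    % sampling rate in Hz (assumed)
nyquist = srate / 2;

n = 1:20;                         % candidate resonance numbers
Fn = (2*n - 1) * c / (4 * L);     % quarter-wavelength resonances in Hz
Fn = Fn(Fn < nyquist);            % keep those below the Nyquist frequency

fprintf('%d resonances below %d Hz: %s\n', numel(Fn), nyquist, mat2str(Fn));
```

With these values the tube has five resonances below 5000 Hz, spaced 1000 Hz apart. Changing the shape of a real tube moves these resonances around, but their number stays tied to the length.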

2 Noise sources within the vocal tract (fricatives, stop releases) will effectively only be filtered by the cavity anterior to the source. There will be fewer resonances of that (smaller) tube below the Nyquist frequency than of the vocal tract as a whole. The main resonance of this cavity will often be similar to one of the full vocal tract resonances. This can be modeled by filtering the noise source through all of the vocal tract resonances in parallel, and setting specific amplitudes for each resonance. Vocal tract resonances that correspond to resonances of the cavity anterior to the noise source will have high amplitude; others will have zero amplitude.
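
The parallel arrangement described above can be sketched as follows. The resonator is the two-pole digital resonator given in Klatt (1980); that the course's own resonance() helper is built the same way is an assumption, and the helper name res, the formant values, and the amplitudes are all hypothetical.

```matlab
% Sketch: noise filtered through formant resonators in parallel.
% res(F, BW) returns coefficients [A B C] of the Klatt (1980) resonator
%   y(n) = A*x(n) + B*y(n-1) + C*y(n-2)
% (hypothetical helper; not the course's resonance() function).
srate = 10000;                     % sampling rate in Hz (assumed)
T = 1 / srate;
res = @(F, BW) [1 - 2*exp(-pi*BW*T)*cos(2*pi*F*T) + exp(-2*pi*BW*T), ...
                2*exp(-pi*BW*T)*cos(2*pi*F*T), ...
                -exp(-2*pi*BW*T)];

F   = [500 1500 2500 3500 4500];   % formant frequencies in Hz (illustrative)
BW  = [ 60   90  150  200  200];   % bandwidths in Hz (illustrative)
amp = [  0    0    0    1    1];   % only front-cavity resonances are excited

noise = randn(1, 2000);
out = zeros(size(noise));
for k = 1:numel(F)
    co = res(F(k), BW(k));
    y = filter(co(1), [1 -co(2) -co(3)], noise);
    out = out + amp(k) * y;        % zero amplitude removes that resonance
end
```

Because the branches are summed rather than chained, each resonance can be switched on or off independently, which is not possible in the cascade configuration.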

3 Cascade vs. Parallel Formant synthesis

4 Klatt (1980) synthesizer [Block diagram; labels: F0, voicing source, noise source, parallel vocal tract transfer function, first-difference radiation characteristic]

5 syn4 New Parameters: AF Amplitude of noise; A2 Amplitude parallel F2; A3 Amplitude parallel F3; A4 Amplitude parallel F4; A5 Amplitude parallel F5; A6 Amplitude parallel F6 (=4900); AB Amplitude Bypass (no filtering)

6 syn4.m
function signal = syn4 (srate,frame_dur,nf,ftable)
% synthesize.m
% Louis Goldstein
% November 2009
% formant synthesizer
% usage:
% signal = syn4 (srate,frame_dur,nf,ftable)
%
% input arguments:
% srate sampling rate (in Hz)
% frame_dur duration of each frame in milliseconds
% nf number of formants
% ftable character string containing filename of F table
% Row 1: AV
% Row 2: f0
% Row 3: AH
% Row 4 to Row 4+nf-1: formant frequencies
% Row 4+nf to Row 4+2*nf-1: formant bandwidths
% Row 4+2*nf: AF
% then rows for A2-A6, then AB
%
% returned arguments:
% signal vector with synthesized waveform samples

7 syn4.m
% location of parameters in table
iav = 1;
if0 = 2;
iah = 3;
if1 = 4;
ib1 = if1+nf;
iaf = ib1+nf;
AV_gain = 100; % voiced gain factor
AH_gain = .01; % voiceless gain factor
AF_gain = .01;
Again = [ ];
ABgain = .1;
FBW = get_fbw(ftable);
nframes = size(FBW,2);
dur = nframes * (frame_dur / 1000); % duration in seconds
samps_per_frame = floor(srate * (frame_dur / 1000));

8 syn4.m
% generate sources
% voiced source
f0 = FBW(if0,:);
AV = FBW(iav,:)*AV_gain;
[voiced, mod_pulse] = make_pulses(f0, srate, frame_dur, AV);
nframes = min ([floor(length(voiced)./ samps_per_frame) nframes]);
tot_samples = nframes * samps_per_frame;
voiced = voiced(1:tot_samples);
RG = 0; % RG is the frequency of the Glottal Resonator
BWG = 100; % BWG is the bandwidth of the Glottal Resonator
[b_glo,a_glo]=resonance(srate,RG,BWG);
% filter impulse train thru low-pass filter
% to get approximation to shape of glottal pulse
voiced=filter(b_glo, a_glo, voiced);

9 syn4.m
% noise source
AH = FBW(iah,1:nframes)*AH_gain;
noise = randn(1, tot_samples); % Gaussian noise
% calculate velocity source from pressure source
noise = filter ([.5 .5], 1, noise);
mod_pulse = mod_pulse (1: tot_samples);
noise = noise.* mod_pulse;
AH_int = interp(AH, samps_per_frame);
AH_int = AH_int(1:tot_samples);
% compute composite source
in = voiced + (noise.* AH_int);

10 syn4.m
% filter successive frames of source through VT cascade
for i = nf:-1:1
  beg_sample = 1;
  z = [];
  for iframe = 1:nframes
    F = FBW (if1:if1+nf-1, iframe);
    BW = FBW (ib1:ib1+nf-1, iframe);
    [b,a]=resonance(srate,F(i),BW(i));
    [out,z] = filter(b,a,in(beg_sample:beg_sample+samps_per_frame-1),z);
    in(beg_sample:beg_sample+samps_per_frame-1) = out;
    beg_sample = beg_sample+samps_per_frame;
  end
end
signal = in(1: nframes*samps_per_frame);
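
The cascade loop relies on passing the filter state z from one frame to the next. The sketch below, with an arbitrary fixed filter and made-up signal length, checks that filtering frame by frame with carried-over state gives the same result as filtering the whole signal in one pass. In the synthesizer the coefficients change per frame, so this only illustrates why the state must be carried.

```matlab
% Sketch: frame-by-frame filtering with carried-over state matches
% whole-signal filtering when the coefficients do not change.
x = randn(1, 200);                 % any input signal
b = 0.1;                           % arbitrary stable two-pole filter
a = [1 -1.2 0.5];                  % (illustrative values)
samps_per_frame = 50;

whole = filter(b, a, x);           % reference: one pass over the whole signal

framed = zeros(size(x));
z = [];                            % empty initial state
for iframe = 1:numel(x) / samps_per_frame
    idx = (iframe-1)*samps_per_frame + (1:samps_per_frame);
    [framed(idx), z] = filter(b, a, x(idx), z);   % z carries the filter memory
end

max_err = max(abs(whole - framed));
```

Without the state argument, each frame would start from silence and the output would click at every frame boundary.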

11 syn4.m
% parallel noise branch: do only if data found in appropriate rows in file
if size(FBW,1) >= iaf
  % filter thru formants in parallel
  % start with F2 and go to FN+1
  noise_in = AF_gain * noise;
  noise_out = zeros(1, length(signal));
  for i = 2:nf+1
    beg_sample = 1;
    z = [];
    f_out = [];
    for iframe = 1:nframes
      if i <= nf
        F = FBW (if1+i-1, iframe);
        BW = FBW (ib1+i-1, iframe);
      else
        F = (srate/2) - 100;
        BW = 100;
      end
      [b,a]=resonance(srate,F,BW);
      [out,z] = filter(b,a,noise_in(beg_sample:beg_sample+samps_per_frame-1),z);
      % Famp factor = A(i) * Again(i) * AF
      Famp = FBW(iaf+i-1, iframe).* Again(i).* FBW(iaf, iframe);
      f_out = [f_out out.*Famp];
      beg_sample = beg_sample+samps_per_frame;
    end
    noise_out = noise_out + f_out;
  end

12 syn4.m
  % Bypass the formant resonators for noise produced at the lips
  beg_sample = 1;
  By_out = [];
  for iframe = 1:nframes
    Famp = FBW(iaf+nf+1, iframe).* ABgain.* FBW(iaf, iframe);
    By_out = [By_out noise_in(beg_sample:beg_sample+samps_per_frame-1).*Famp];
    beg_sample = beg_sample+samps_per_frame;
  end
  % Add parallel noise output and Bypass output to cascade output
  signal = signal + noise_out + By_out;
end
% filter through high-pass radiation filter:
% this approximates sound pressure at a distance from the mouth
signal = filter([1 -1],1,signal);
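
The radiation filter is a first difference, y(n) = x(n) - x(n-1). The sketch below, using an assumed sampling rate and arbitrary test frequencies, shows that its gain rises by roughly 6 dB per octave at low frequencies, and applies it to a short sequence.

```matlab
% Sketch: the first-difference radiation filter y(n) = x(n) - x(n-1).
% Note the space in [1 -1]: [1-1] would evaluate to the scalar 0.
srate = 10000;                     % sampling rate in Hz (assumed)
f = [100 200 400 800];             % test frequencies in Hz, one octave apart
w = 2*pi*f/srate;                  % radians per sample
gain = abs(1 - exp(-1j*w));        % magnitude response of [1 -1]
gain_db = 20*log10(gain);
step_db = diff(gain_db);           % change in gain per octave

x = [0 1 4 9 16];
y = filter([1 -1], 1, x);          % -> [0 1 3 5 7]
```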

13 syn4.m
soundsc (signal, srate);
% plot the values of F1-F4 as function of frame in the upper panel
% plot the synthesized signal as a function of t in ms in the lower panel
figure (1)
frames = 1:nframes;
subplot (2,1,1), plot (frames, FBW(if1:if1+3, :),'-o')
xlabel ('Frame No.')
make_spect2(signal', srate,6);
%subplot (2,1,2), plot ([1:length(signal)]*1000/srate,signal);
xlabel ('Time in milliseconds');

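14 asa.txt (formant values from analysis of /ada/) [Parameter table with rows for AV, F0, AH, formant frequencies (F), formant bandwidths (B), AF, parallel amplitudes (A), and AB; numeric values not preserved]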

15 asa output

16 Voiced Fricatives: Employ both voicing (AV) and noise (AF) sources. The amplitude of the noise source should be modulated by the laryngeal source, since noise is mostly restricted to the open phase of the glottal cycle. Approximate this by setting the noise source to 0 during half of the glottal cycle.
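
The half-cycle approximation can be sketched as a gate that is 1 during the first half of each glottal cycle and 0 during the second. The sampling rate, f0, and duration below are illustrative assumptions, and f0 is held constant for simplicity.

```matlab
% Sketch: gate a noise source to the first half of each glottal cycle.
srate = 10000;                     % sampling rate in Hz (assumed)
f0 = 100;                          % constant fundamental in Hz (assumed)
n = 0:(srate/10 - 1);              % 100 ms of sample indices
phase = rem(n * f0 / srate, 1);    % position within the current cycle (0..1)
gate = double(phase < 0.5);        % 1 in first half of each cycle, else 0
noise = randn(size(n));
mod_noise = noise .* gate;         % noise is silenced in the other half
```

With a 100 Hz fundamental at 10 kHz, each cycle is 100 samples long, so the gate passes 50 samples of noise and blocks the next 50.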

17 make_pulses
function [pulses, mod] = make_pulses(f0, srate, frame_dur, av)
%
% Louis Goldstein
% November 2009
% calculate sequence of impulses based on an f0 vector
%
% Input parameters
% f0 vector of f0 values
% srate sampling rate (Hz)
% frame_dur duration of each f0 frame (corresponds to slide in get_f0)
% av vector of voicing amplitudes
frame_length = floor(frame_dur * srate / 1000); % frame length in samples
length_f0 = length(f0);
% interpolate f0 so it has a value for every sample and scale in cycles/sample
cont_freq = interp(f0/srate, frame_length);
cont_av = interp(av,frame_length);
% calculate elapsed cycles for every sample
elapsed_cycles = cumsum(cont_freq);
% calculate fraction of the way through current cycle
cycle_percent = rem(elapsed_cycles,1);
% set mod to 1 for first half of each voiced cycle
mod = double(cycle_percent<.5);
% set mod to 1 wherever AV == 0 (no voicing, so no modulation)
mod(cont_av == 0) = 1;
shift = [0 cycle_percent(1:end-1)];
% set pulses (1s) at cycle boundaries and 0s elsewhere
pulses = cycle_percent<shift; % will be true only when cycle boundary is crossed
pulses = cont_av.* double(pulses);
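
The boundary-crossing idea in make_pulses can be tried on its own. The sketch below skips the frame interpolation and starts from a per-sample f0 contour; the glide from 100 to 150 Hz and the sampling rate are made-up values.

```matlab
% Sketch: impulse train from a per-sample f0 contour via accumulated phase.
srate = 10000;                          % sampling rate in Hz (assumed)
f0 = linspace(100, 150, 5000);          % f0 glide over 0.5 s (illustrative)
cont_freq = f0 / srate;                 % cycles per sample
elapsed_cycles = cumsum(cont_freq);     % running cycle count
cycle_frac = rem(elapsed_cycles, 1);    % fraction of the current cycle
shift = [0 cycle_frac(1:end-1)];        % same vector delayed by one sample
pulses = cycle_frac < shift;            % true where the fraction wraps around
n_pulses = sum(pulses);
pulse_gaps = diff(find(pulses));        % pulse spacing in samples
```

The contour covers 62.5 cycles in total, so 62 cycle boundaries are crossed and 62 pulses result. The spacing between pulses shrinks as f0 rises.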

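18 aza.txt (formant values from analysis of /ada/) [Parameter table with rows for AV, F0, AH, formant frequencies (F), formant bandwidths (B), AF, parallel amplitudes (A), and AB; numeric values not preserved]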

19 aza output

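20 afa.txt (formant values from analysis of /aba/) [Parameter table with rows for AV, F0, AH, formant frequencies (F), formant bandwidths (B), AF, parallel amplitudes (A), and AB; numeric values not preserved]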

21 afa output

22 Stop Releases

23 Frication at release: a short fricative that excites the same resonators as the homorganic fricative.

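24 ata_burst.txt [Parameter table with rows for AV, F0, AH, formant frequencies (F), formant bandwidths (B), AF, parallel amplitudes (A), and AB; numeric values not preserved]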

25 ata_burst output

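26 ata.txt [Parameter table with rows for AV, F0, AH, formant frequencies (F), and formant bandwidths (B) only, with no AF, A, or AB rows; numeric values not preserved]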

27 ata output

28 new version of ftime2: plot short-time spectrum (6 ms) of each frame superimposed on LPC transfer function estimate.
plot (freq(1:L), 10*log(abs(h(1:L)))+170)
grid
xlabel ([' Frame number is:' num2str(iframe)])
% make F, BW vectors to display in title
for i = 1:5
  Fdisp(i) = fix(F(i, iframe));
  BWdisp(i) = fix(BW(i, iframe));
end
title (['Frame: ', num2str(iframe), ' F:' num2str(Fdisp) '; ' 'BW:' num2str(BWdisp) '; ' 'f0:' num2str(fix(f0(iframe)))]);
beg_sample = 1+(iframe-1)*samps_per_frame;
nsamps = floor((winsize/1000)*sr); % winsize = 6ms for spectrum
hold on
spectrum (signal(beg_sample:beg_sample+nsamps-1), sr, 1024);
grid
hold off
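
The spectrum() helper called above is not shown in the slides. As a hedged stand-in, the sketch below computes a short-time magnitude spectrum of a 6 ms window with a zero-padded 1024-point FFT. The Hann window, the dB scaling, and the 1000 Hz test sinusoid are assumptions made for illustration.

```matlab
% Sketch: magnitude spectrum (in dB) of a 6 ms window, zero-padded to 1024 points.
sr = 10000;                                  % sampling rate in Hz (assumed)
winsize = 6;                                 % window length in ms
nsamps = floor((winsize/1000) * sr);         % number of samples in the window
t = (0:nsamps-1) / sr;
seg = sin(2*pi*1000*t);                      % stand-in signal: 1000 Hz sinusoid
nfft = 1024;
w = 0.5 - 0.5*cos(2*pi*(0:nsamps-1)/(nsamps-1));   % Hann window, written out
S = fft(seg .* w, nfft);                     % fft zero-pads to nfft points
freq = (0:(nfft/2 - 1)) * sr / nfft;         % frequency axis in Hz
mag_db = 20*log10(abs(S(1:nfft/2)) + eps);   % eps avoids log of zero
[~, ipk] = max(mag_db);
peak_hz = freq(ipk);                         % location of the spectral peak
```

A 6 ms window is shorter than a typical glottal period, so the resulting spectrum shows the resonance envelope of that stretch of signal rather than individual harmonics.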

29 /a/ in ada release burst in ada

30 Stop Releases

31 Transfer Functions: For source at constriction

32 Labial Releases

33 Coronals Dorsals

34 Spectral shape and place of articulation. Stevens & Blumstein (1981): Overall spectral shape at stop release tends to be invariant across Vs, even though resonances are not. [Figure panels: Labial, Dorsal, Coronal]

35 Klatt (1987) [Figure panels: Labial, Coronal, Dorsal; vowel contexts: Front, Back, Back Rounded]

36 For synthesis, we need to set the amplitudes of particular frequencies for the burst. Labials: no anterior cavity, so only set AB. Coronals: mostly A5-A6. Dorsals: differ by vowel context. Klatt (1980), front V context: TABLE III. Parameter values for the synthesis of selected components of English consonants before front vowels (see text for source amplitude values). [Table columns: F1, F2, F3, B1, B2, B3, A2, A3, A4, A5, A6, AB; row groups: sonorants, fricatives, affricates, plosives; numeric values not preserved]



More information

Speech Signal Analysis

Speech Signal Analysis Speech Signal Analysis Hiroshi Shimodaira and Steve Renals Automatic Speech Recognition ASR Lectures 2&3 14,18 January 216 ASR Lectures 2&3 Speech Signal Analysis 1 Overview Speech Signal Analysis for

More information

Structure of Speech. Physical acoustics Time-domain representation Frequency domain representation Sound shaping

Structure of Speech. Physical acoustics Time-domain representation Frequency domain representation Sound shaping Structure of Speech Physical acoustics Time-domain representation Frequency domain representation Sound shaping Speech acoustics Source-Filter Theory Speech Source characteristics Speech Filter characteristics

More information

-voiced. +voiced. /z/ /s/ Last Lecture. Digital Speech Processing. Overview of Speech Processing. Example on Sound Source Feature

-voiced. +voiced. /z/ /s/ Last Lecture. Digital Speech Processing. Overview of Speech Processing. Example on Sound Source Feature ENEE408G Lecture-6 Digital Speech rocessing URL: http://www.ece.umd.edu/class/enee408g/ Slides included here are based on Spring 005 offering in the order of introduction, image, video, speech, and audio.

More information

A Look at Un-Electronic Musical Instruments

A Look at Un-Electronic Musical Instruments A Look at Un-Electronic Musical Instruments A little later in the course we will be looking at the problem of how to construct an electrical model, or analog, of an acoustical musical instrument. To prepare

More information

COMPRESSIVE SAMPLING OF SPEECH SIGNALS. Mona Hussein Ramadan. BS, Sebha University, Submitted to the Graduate Faculty of

COMPRESSIVE SAMPLING OF SPEECH SIGNALS. Mona Hussein Ramadan. BS, Sebha University, Submitted to the Graduate Faculty of COMPRESSIVE SAMPLING OF SPEECH SIGNALS by Mona Hussein Ramadan BS, Sebha University, 25 Submitted to the Graduate Faculty of Swanson School of Engineering in partial fulfillment of the requirements for

More information

Improving Sound Quality by Bandwidth Extension

Improving Sound Quality by Bandwidth Extension International Journal of Scientific & Engineering Research, Volume 3, Issue 9, September-212 Improving Sound Quality by Bandwidth Extension M. Pradeepa, M.Tech, Assistant Professor Abstract - In recent

More information

Epoch Extraction From Emotional Speech

Epoch Extraction From Emotional Speech Epoch Extraction From al Speech D Govind and S R M Prasanna Department of Electronics and Electrical Engineering Indian Institute of Technology Guwahati Email:{dgovind,prasanna}@iitg.ernet.in Abstract

More information

Pitch Period of Speech Signals Preface, Determination and Transformation

Pitch Period of Speech Signals Preface, Determination and Transformation Pitch Period of Speech Signals Preface, Determination and Transformation Mohammad Hossein Saeidinezhad 1, Bahareh Karamsichani 2, Ehsan Movahedi 3 1 Islamic Azad university, Najafabad Branch, Saidinezhad@yahoo.com

More information

Speech/Non-speech detection Rule-based method using log energy and zero crossing rate

Speech/Non-speech detection Rule-based method using log energy and zero crossing rate Digital Speech Processing- Lecture 14A Algorithms for Speech Processing Speech Processing Algorithms Speech/Non-speech detection Rule-based method using log energy and zero crossing rate Single speech

More information

Signal Processing for Speech Applications - Part 2-1. Signal Processing For Speech Applications - Part 2

Signal Processing for Speech Applications - Part 2-1. Signal Processing For Speech Applications - Part 2 Signal Processing for Speech Applications - Part 2-1 Signal Processing For Speech Applications - Part 2 May 14, 2013 Signal Processing for Speech Applications - Part 2-2 References Huang et al., Chapter

More information

HCS / ACN 6389 Speech Perception Lab

HCS / ACN 6389 Speech Perception Lab HCS / ACN 6389 Speech Perception Lab Course Requirements Matlab problems & lab assignments (40%) Oral presentations (10%) Term project paper (50%) Dr. Peter Assmann Fall 2017 2 Term project: important

More information

On the glottal flow derivative waveform and its properties

On the glottal flow derivative waveform and its properties COMPUTER SCIENCE DEPARTMENT UNIVERSITY OF CRETE On the glottal flow derivative waveform and its properties A time/frequency study George P. Kafentzis Bachelor s Dissertation 29/2/2008 Supervisor: Yannis

More information

Voiced/nonvoiced detection based on robustness of voiced epochs

Voiced/nonvoiced detection based on robustness of voiced epochs Voiced/nonvoiced detection based on robustness of voiced epochs by N. Dhananjaya, B.Yegnanarayana in IEEE Signal Processing Letters, 17, 3 : 273-276 Report No: IIIT/TR/2010/50 Centre for Language Technologies

More information

Quarterly Progress and Status Report. Acoustic properties of the Rothenberg mask

Quarterly Progress and Status Report. Acoustic properties of the Rothenberg mask Dept. for Speech, Music and Hearing Quarterly Progress and Status Report Acoustic properties of the Rothenberg mask Hertegård, S. and Gauffin, J. journal: STL-QPSR volume: 33 number: 2-3 year: 1992 pages:

More information

Airflow visualization in a model of human glottis near the self-oscillating vocal folds model

Airflow visualization in a model of human glottis near the self-oscillating vocal folds model Applied and Computational Mechanics 5 (2011) 21 28 Airflow visualization in a model of human glottis near the self-oscillating vocal folds model J. Horáček a,, V. Uruba a,v.radolf a, J. Veselý a,v.bula

More information

speech signal S(n). This involves a transformation of S(n) into another signal or a set of signals

speech signal S(n). This involves a transformation of S(n) into another signal or a set of signals 16 3. SPEECH ANALYSIS 3.1 INTRODUCTION TO SPEECH ANALYSIS Many speech processing [22] applications exploits speech production and perception to accomplish speech analysis. By speech analysis we extract

More information

Quarterly Progress and Status Report. Speech synthesizer control by smoothed step functions

Quarterly Progress and Status Report. Speech synthesizer control by smoothed step functions Dept. for Speech, Music and Hearing Quarterly Progress and Status Report Speech synthesizer control by smoothed step functions Liljencrants, J. journal: STL-QPSR volume: 10 number: 4 year: 1969 pages:

More information

CMPT 468: Frequency Modulation (FM) Synthesis

CMPT 468: Frequency Modulation (FM) Synthesis CMPT 468: Frequency Modulation (FM) Synthesis Tamara Smyth, tamaras@cs.sfu.ca School of Computing Science, Simon Fraser University October 6, 23 Linear Frequency Modulation (FM) Till now we ve seen signals

More information

Transforming High-Effort Voices Into Breathy Voices Using Adaptive Pre-Emphasis Linear Prediction

Transforming High-Effort Voices Into Breathy Voices Using Adaptive Pre-Emphasis Linear Prediction Transforming High-Effort Voices Into Breathy Voices Using Adaptive Pre-Emphasis Linear Prediction by Karl Ingram Nordstrom B.Eng., University of Victoria, 1995 M.A.Sc., University of Victoria, 2000 A Dissertation

More information

CHAPTER 3. ACOUSTIC MEASURES OF GLOTTAL CHARACTERISTICS 39 and from periodic glottal sources (Shadle, 1985; Stevens, 1993). The ratio of the amplitude of the harmonics at 3 khz to the noise amplitude in

More information

Wideband Speech Coding & Its Application

Wideband Speech Coding & Its Application Wideband Speech Coding & Its Application Apeksha B. landge. M.E. [student] Aditya Engineering College Beed Prof. Amir Lodhi. Guide & HOD, Aditya Engineering College Beed ABSTRACT: Increasing the bandwidth

More information

International Journal of Modern Trends in Engineering and Research e-issn No.: , Date: 2-4 July, 2015

International Journal of Modern Trends in Engineering and Research   e-issn No.: , Date: 2-4 July, 2015 International Journal of Modern Trends in Engineering and Research www.ijmter.com e-issn No.:2349-9745, Date: 2-4 July, 2015 Analysis of Speech Signal Using Graphic User Interface Solly Joy 1, Savitha

More information

A Physiologically Produced Impulsive UWB signal: Speech

A Physiologically Produced Impulsive UWB signal: Speech A Physiologically Produced Impulsive UWB signal: Speech Maria-Gabriella Di Benedetto University of Rome La Sapienza Faculty of Engineering Rome, Italy gaby@acts.ing.uniroma1.it http://acts.ing.uniroma1.it

More information

Epoch Extraction From Speech Signals K. Sri Rama Murty and B. Yegnanarayana, Senior Member, IEEE

Epoch Extraction From Speech Signals K. Sri Rama Murty and B. Yegnanarayana, Senior Member, IEEE 1602 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 16, NO. 8, NOVEMBER 2008 Epoch Extraction From Speech Signals K. Sri Rama Murty and B. Yegnanarayana, Senior Member, IEEE Abstract

More information

Determination of instants of significant excitation in speech using Hilbert envelope and group delay function

Determination of instants of significant excitation in speech using Hilbert envelope and group delay function Determination of instants of significant excitation in speech using Hilbert envelope and group delay function by K. Sreenivasa Rao, S. R. M. Prasanna, B.Yegnanarayana in IEEE Signal Processing Letters,

More information

Analysis/synthesis coding

Analysis/synthesis coding TSBK06 speech coding p.1/32 Analysis/synthesis coding Many speech coders are based on a principle called analysis/synthesis coding. Instead of coding a waveform, as is normally done in general audio coders

More information

the 99th Convention 1995 October 6-9 NewYork

the 99th Convention 1995 October 6-9 NewYork Tunable Bandpass Filters in Music Synthesis 4098 (L-2) Robert C. Maher University of Nebraska-Lincoln Lincoln, NE 68588-0511, USA Presented at the 99th Convention 1995 October 6-9 NewYork ^ ud,o Thispreprinthas

More information

University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005

University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005 University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005 Lecture 5 Slides Jan 26 th, 2005 Outline of Today s Lecture Announcements Filter-bank analysis

More information

SPEECH ANALYSIS* Prof. M. Halle G. W. Hughes A. R. Adolph

SPEECH ANALYSIS* Prof. M. Halle G. W. Hughes A. R. Adolph XII. SPEECH ANALYSIS* Prof. M. Halle G. W. Hughes A. R. Adolph A. STUDIES OF PITCH PERIODICITY In the past a number of devices have been built to extract pitch-period information from speech. These efforts

More information

DSP First. Laboratory Exercise #2. Introduction to Complex Exponentials

DSP First. Laboratory Exercise #2. Introduction to Complex Exponentials DSP First Laboratory Exercise #2 Introduction to Complex Exponentials The goal of this laboratory is gain familiarity with complex numbers and their use in representing sinusoidal signals as complex exponentials.

More information

HMM-based Speech Synthesis Using an Acoustic Glottal Source Model

HMM-based Speech Synthesis Using an Acoustic Glottal Source Model HMM-based Speech Synthesis Using an Acoustic Glottal Source Model João Paulo Serrasqueiro Robalo Cabral E H U N I V E R S I T Y T O H F R G E D I N B U Doctor of Philosophy The Centre for Speech Technology

More information

Waveshaping Synthesis. Indexing. Waveshaper. CMPT 468: Waveshaping Synthesis

Waveshaping Synthesis. Indexing. Waveshaper. CMPT 468: Waveshaping Synthesis Waveshaping Synthesis CMPT 468: Waveshaping Synthesis Tamara Smyth, tamaras@cs.sfu.ca School of Computing Science, Simon Fraser University October 8, 23 In waveshaping, it is possible to change the spectrum

More information