Nature of Noise Source


Noise Sources
Voiceless aspiration can be produced with a noise source at the glottis (also for voiceless sonorants, including vowels).
The noise source is filtered through the vocal tract (VT) cascade, so some resonance information is maintained in the output.
Breathy voice can be approximated by combining (adding) a noise source and the voiced source. This is not true breathy voice, which has a different vibrational cycle than modal voice; a different glottal filter would be required.
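As a rough illustration (not from the slides) of how noise keeps resonance information, the sketch below filters Gaussian noise through a single second-order resonator; the formant frequency, bandwidth, and sampling rate are assumed values.

% Minimal sketch: Gaussian noise filtered through one vocal-tract resonance,
% so the output keeps that formant's colouring.
srate = 10000;                      % sampling rate (Hz), assumed
noise = randn(1, srate);            % 1 s of Gaussian noise
F  = 650;  BW = 100;                % assumed formant frequency and bandwidth (Hz)
r     = exp(-pi*BW/srate);          % pole radius from the bandwidth
theta = 2*pi*F/srate;               % pole angle from the centre frequency
a = [1  -2*r*cos(theta)  r^2];      % second-order resonator denominator
b = sum(a);                         % numerator gain, unity response at DC
shaped = filter(b, a, noise);       % noise now has a resonance near F
soundsc(shaped, srate);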

Nature of Noise Source
Gaussian noise:

noise = randn(1,4000);
soundsc(noise, 10000);
hist(noise, 20)
spectrum(noise, 10000)
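Note that spectrum is an older Signal Processing Toolbox function that is no longer available in current MATLAB releases; a comparable look at the (flat) spectrum of the noise can be had with pwelch, as in the sketch below (the default window settings are an illustrative choice, not from the slides).

% If spectrum() is unavailable, pwelch gives a comparable Welch
% power-spectral-density plot:
noise = randn(1,4000);
pwelch(noise, [], [], [], 10000);   % flat spectrum, characteristic of white noise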

Combined Source Model
The voiced and noise sources are scaled and added:

combined source = (pulse .* AV_interp * AV_gain) + (noise .* AH_interp * AH_gain)

That is, the voiced pulse train scaled by the interpolated AV values, plus the noise scaled by the interpolated AH values.
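A self-contained sketch of this combination follows (the constant gains, impulse spacing, and sampling rate are assumed values; the full version with frame-by-frame control appears in syn3 below).

% Minimal sketch of the combined source model
srate = 10000;
pulse = zeros(1, srate);  pulse(1:100:end) = 1;   % 100 Hz impulse train (assumed)
noise = randn(1, srate);                          % Gaussian noise
AV_interp = ones(1, srate);                       % assumed constant voicing contour
AH_interp = 0.2 * ones(1, srate);                 % assumed constant aspiration contour
AV_gain = 100;  AH_gain = 0.05;
source = pulse .* AV_interp * AV_gain + noise .* AH_interp * AH_gain;
soundsc(source, srate);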

syn3

function signal = syn3(srate, frame_dur, nf, ftable)
% syn3.m
% Louis Goldstein
% November 2009
% formant synthesizer
%
% usage:
%   signal = syn3(srate, frame_dur, nf, ftable)
%
% input arguments:
%   srate       sampling rate (in Hz)
%   frame_dur   duration of each frame in milliseconds
%   nf          number of formants
%   ftable      character string containing filename of F table
%               Row 1: AV
%               Row 2: f0
%               Row 3: AH
%               Row 4 to Row 4+nf-1:      formant frequencies
%               Row 4+nf to Row 4+2*nf-1: formant bandwidths
%
% returned arguments:
%   signal      vector with synthesized waveform samples

% location of parameters in table
iAV = 1; iF0 = 2; iAH = 3; iF1 = 4; iB1 = iF1 + nf;

% location of parameters in table
iAV = 1; iF0 = 2; iAH = 3; iF1 = 4; iB1 = iF1 + nf;

AV_gain = 100;     % voiced gain factor
AH_gain = 0.05;    % voiceless gain factor

FBW = get_fbw(ftable);               % read parameter table
nframes = size(FBW,2);
dur = nframes * (frame_dur / 1000);  % duration in seconds
samps_per_frame = floor(srate * (frame_dur / 1000));

% generate sources

% voiced source
f0 = FBW(iF0,:);
AV = FBW(iAV,:) * AV_gain;
voiced = make_impulse_av(f0, srate, frame_dur, AV);
nframes = min([floor(length(voiced) ./ samps_per_frame) nframes]);

RG  = 0;     % RG is the frequency of the Glottal Resonator
BWG = 100;   % BWG is the bandwidth of the Glottal Resonator
[b_glo, a_glo] = resonance(srate, RG, BWG);
% filter impulse train thru low-pass filter
% to get approximation to shape of glottal pulse
voiced = filter(b_glo, a_glo, voiced);

% noise source
AH = FBW(iAH,1:nframes) * AH_gain;
noise = randn(1, length(voiced));     % Gaussian noise
% take derivative (calculate velocity source from pressure source)
noise = filter([.5 .5], 1, noise);
AH_int = interp(AH, samps_per_frame);

% compute composite source
in = voiced + (noise .* AH_int);

function pulses = make_impulse(f0, srate, frame_dur, av)
% Input parameters
%   f0          vector of f0 values
%   srate       sampling rate (Hz)
%   frame_dur   duration of each f0 frame (corresponds to slide in get_f0)
%   av          vector of voicing amplitudes (AV)

frame_length = floor(frame_dur * srate / 1000);   % frame length in samples
length_f0 = length(f0);

% interpolate f0 so it has a value for every sample, and scale in cycles/sample
cont_freq = interp(f0/srate, frame_length);
cont_av   = interp(av, frame_length);

% calculate elapsed cycles for every sample
elapsed_cycles = cumsum(cont_freq);

% calculate percentage of the way through the current cycle
cycle_percent = rem(elapsed_cycles, 1);
shift = [0 cycle_percent(1:end-1)];

% set pulses (1s) at cycle boundaries and 0s elsewhere
pulses = cycle_percent < shift;   % will be true only when a cycle boundary is crossed
pulses = cont_av .* double(pulses);
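A quick way to hear what make_impulse produces is a call along the following lines (an illustrative usage, not from the slides; the 10 kHz rate, 10 ms frames, and constant f0/AV values are assumptions):

% Hypothetical usage sketch for make_impulse
srate     = 10000;                 % sampling rate (Hz), assumed
frame_dur = 10;                    % frame duration (ms), assumed
f0 = 100 * ones(1, 60);            % 60 frames of a constant 100 Hz f0
AV = 100 * ones(1, 60);            % constant voicing amplitude
pulses = make_impulse(f0, srate, frame_dur, AV);
soundsc(pulses, srate);            % 0.6 s buzz: one impulse per glottal cycle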

New FBW file: aba3.txt

The first row (here "F 1 1 1 ...") is an arbitrary character followed by arbitrary numbers, and it must be the longest line in the file. Every other row is a parameter name followed by (frame, value) pairs. Make sure each line specifies a value for frame 1 and for the last frame.

F   1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
AV  1 100    20 100    22 10     33 10     35 100    60 100
F0  1 100    60 100
AH  1 0      60 0
F1  1 650    14 650    21 550    35 275    40 650    60 650
F2  1 1350   14 1350   21 1140   35 875    40 1350   60 1350
F3  1 2450   14 2450   21 2300   35 2500   40 2450   60 2450
F4  1 3100   60 3100
F5  1 4650   60 4650
B1  1 100    60 100
B2  1 150    60 150
B3  1 200    60 200
B4  1 200    60 200
B5  1 400    60 400
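The slides do not show get_fbw.m itself. The sketch below is a hypothetical illustration of how one such line of (frame, value) pairs could be expanded into a per-frame track by linear interpolation; the function name expand_pairs is invented here, and the interpolation scheme is an assumption rather than the course's actual implementation.

function track = expand_pairs(pairs, nframes)
% Hypothetical helper (not the course's get_fbw.m):
% expand (frame, value) breakpoint pairs into a per-frame parameter track.
%   pairs    vector [f1 v1 f2 v2 ...] of (frame, value) breakpoints
%   nframes  total number of frames in the utterance
frames = pairs(1:2:end);
values = pairs(2:2:end);
track  = interp1(frames, values, 1:nframes, 'linear', 'extrap');
end
% example: AV = expand_pairs([1 100  20 100  22 10  33 10  35 100  60 100], 60);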

Stages in modeling VCV
(1) Set vowel formants and bandwidths: use constant values for the entire sequence.
(2) Find the location of silence (or low-amplitude closure voicing) and set a low value of AV there.
(3) Find the values of F1-F3 at closure onset and release. Interpolate between the vowel F values and these values over the frames where you see the transition.
(4) For aspirated stops, add AH and remove AV during the release transitions. Also increase B1 during this interval.

aba

aba3.txt

F   1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
AV  1 100    20 100    22 10     33 10     35 100    60 100
F0  1 100    60 100
AH  1 0      60 0
F1  1 650    14 650    21 550    35 275    40 650    60 650
F2  1 1350   14 1350   21 1140   35 875    40 1350   60 1350
F3  1 2450   14 2450   21 2300   35 2500   40 2450   60 2450
F4  1 3100   60 3100
F5  1 4650   60 4650
B1  1 100    60 100
B2  1 150    60 150
B3  1 200    60 200
B4  1 200    60 200
B5  1 400    60 400

apa.txt

F   1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
AV  1 100    20 100    22 0      33 0      38 0      42 100    60 100
F0  1 100    60 100
AH  1 0      33 0      35 50     38 50     40 0      60 0
F1  1 650    14 650    21 550    35 275    40 650    60 650
F2  1 1350   14 1350   21 1140   35 875    40 1350   60 1350
F3  1 2450   14 2450   21 2300   35 2500   40 2450   60 2450
F4  1 3100   60 3100
F5  1 4650   60 4650
B1  1 100    60 100
B2  1 150    60 150
B3  1 200    60 200
B4  1 200    60 200
B5  1 400    60 400
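To synthesize and listen to one of these files, a call along the following lines should work, assuming syn3.m and its helpers (get_fbw, resonance, make_impulse_av) are on the MATLAB path and the text file is in the current folder; the 10 kHz sampling rate and 10 ms frame duration are assumed values, not given on the slides.

% Hedged usage sketch for syn3 (srate and frame_dur are assumed values)
srate     = 10000;    % sampling rate (Hz)
frame_dur = 10;       % frame duration (ms); 60 frames -> 0.6 s utterance
nf        = 5;        % five formants (F1-F5 in the table)
signal = syn3(srate, frame_dur, nf, 'apa.txt');
soundsc(signal, srate);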