A Physiologically Produced Impulsive UWB signal: Speech


Maria-Gabriella Di Benedetto
University of Rome La Sapienza, Faculty of Engineering, Rome, Italy
gaby@acts.ing.uniroma1.it
http://acts.ing.uniroma1.it

Observation: many physiologically produced signals are impulsive in nature.
Their waveforms have Impulse Radio wave shapes.
They are UWB, since their centre frequency is the zero frequency.
A coincidence?

Neuronal pulses. Much of neural computation involves processing these neuronal spike trains (Spikes: Exploring the Neural Code, Computational Neuroscience).
QRS-complex pulses.
Speech pulses.

Speech waveform: presence of periodic vs. noise-like portions.
Periodic portions correspond to voiced sounds: during production, the vocal folds vibrate.
Noise-like portions correspond to voiceless sounds: during production, the vocal folds do not vibrate.
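The periodic vs. noise-like distinction can be illustrated numerically. The sketch below is not from the slides: it uses an idealized impulse train as a stand-in for a voiced portion and white Gaussian noise as a stand-in for a voiceless portion, and compares their normalized autocorrelation at a lag of one pitch period. The sampling rate, the pitch period and the function name `normalized_autocorrelation` are illustrative assumptions.

```python
import math
import random


def normalized_autocorrelation(x, lag):
    """Correlation between x[n] and x[n + lag], normalized to [-1, 1]."""
    a = x[:-lag]
    b = x[lag:]
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den > 0 else 0.0


fs = 8000      # sampling rate in Hz (assumed)
period = 80    # samples per pitch period, i.e. F0 = 100 Hz (assumed)
n = 1600       # 200 ms of signal

# Voiced-like stand-in: one impulse per pitch period.
voiced_like = [1.0 if i % period == 0 else 0.0 for i in range(n)]

# Voiceless-like stand-in: white Gaussian noise with a fixed seed.
rng = random.Random(0)
voiceless_like = [rng.gauss(0.0, 1.0) for _ in range(n)]

r_voiced = normalized_autocorrelation(voiced_like, period)
r_voiceless = normalized_autocorrelation(voiceless_like, period)
print(f"periodic: {r_voiced:.3f}, noise-like: {r_voiceless:.3f}")
```

For the periodic stand-in the value at one pitch period is close to 1, while for the noise-like stand-in it stays near 0. Real speech sits between these two extremes.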

Speech production mechanism

Speech production model for voiced sounds
Block diagram: derivative of volume velocity U(t) -> linear system (effect of vocal tract) -> speech sound pressure p(t)

Speech production model for voiceless sounds
Block diagram: derivative of volume velocity U(t) -> linear system -> speech sound pressure p(t)
Diagram labels: constriction, turbulence

Spectrum of a voiced sound [figure]
By courtesy of Hari Arsikere, UCLA Speech Processing and Auditory Perception Laboratory, UCLA, USA (Director: Prof. Abeer Alwan)

Spectrum of a voiceless sound [figure]
By courtesy of Hari Arsikere, UCLA Speech Processing and Auditory Perception Laboratory, UCLA, USA (Director: Prof. Abeer Alwan)

The model in the VOice CODER (VOCODER)
Block diagram elements: pitch period (F0), voiced/voiceless switch, gain, noise source, vocal tract
Based on the analog vocoder, Homer W. Dudley, patent 1939

VOCODER strongest limitation: the model is way too simplistic in the case of sounds with a mixed voiced-voiceless nature.

Mixed-Excited VOCODER
Block diagram: the periodic excitation is scaled by a gain Gp and the noise excitation by a gain Gn.
This model is based on a linear combination of periodic and noise excitation.
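A minimal sketch of the linear combination in the mixed-excitation model, under stated assumptions: the periodic branch is an ideal impulse train, the noise branch is white Gaussian noise, and the function name `mixed_excitation` and all numeric values are hypothetical. The gains `g_p` and `g_n` play the role of Gp and Gn.

```python
import random


def mixed_excitation(n_samples, period, g_p, g_n, seed=0):
    """Return g_p * (impulse train) + g_n * (white noise), sample by sample."""
    rng = random.Random(seed)
    out = []
    for i in range(n_samples):
        periodic = 1.0 if i % period == 0 else 0.0   # periodic branch
        noise = rng.gauss(0.0, 1.0)                   # noise branch
        out.append(g_p * periodic + g_n * noise)
    return out


# Setting one gain to zero recovers the two positions of the
# voiced/voiceless switch of the plain vocoder model.
purely_voiced = mixed_excitation(400, 80, g_p=1.0, g_n=0.0)
purely_voiceless = mixed_excitation(400, 80, g_p=0.0, g_n=1.0)

# Intermediate gains give a mixed excitation, e.g. a breathy voice.
breathy = mixed_excitation(400, 80, g_p=1.0, g_n=0.1)
```

With `g_n = 0` or `g_p = 0` the model falls back to the binary voiced/voiceless choice, which is why it can be read as a generalization of the switch.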

CELP VOCODER
Used in GSM, UMTS and many others.
The best multi-pulse is selected from a set stored in a codebook.
But why best is best still remains to be understood.
Based on the multi-pulse model presented by Atal and Remde, ICASSP, 1982.
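The codebook selection can be sketched as an analysis-by-synthesis search: each candidate excitation is passed through the synthesis filter, and the candidate whose output is closest to the target is retained. The sketch below is a toy version under stated assumptions: a plain squared error (real coders use a perceptually weighted error and further stages), a one-pole synthesis filter, and a three-entry codebook. The names `synthesize` and `search_codebook` and all values are hypothetical.

```python
def synthesize(excitation, a):
    """All-pole synthesis filter: y[n] = x[n] - sum_k a[k] * y[n-1-k]."""
    y = []
    for n, x in enumerate(excitation):
        acc = x
        for k, coeff in enumerate(a):
            if n - 1 - k >= 0:
                acc -= coeff * y[n - 1 - k]
        y.append(acc)
    return y


def search_codebook(target, codebook, a):
    """Return (index, gain, error) of the entry with the lowest squared error."""
    best = None
    for index, code in enumerate(codebook):
        synth = synthesize(code, a)
        energy = sum(s * s for s in synth)
        if energy == 0.0:
            continue
        # Optimal gain for this entry in the least-squares sense.
        gain = sum(t * s for t, s in zip(target, synth)) / energy
        error = sum((t - gain * s) ** 2 for t, s in zip(target, synth))
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best


a = [-0.9]  # toy filter: y[n] = x[n] + 0.9 * y[n-1]
codebook = [
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, -1.0, 0.0],
    [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
]

# Build a target from the third entry so the expected winner is known.
target = [2.0 * s for s in synthesize(codebook[2], a)]
best_index, best_gain, best_error = search_codebook(target, codebook, a)
```

Here "best" only means "lowest error under the chosen criterion", which is exactly the point the slide questions.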

Spectrum of a mixed sound [figure]
Figure annotations: periodicity loss at low frequencies; aspirated sound [hiy]; tilt at high frequencies

Vocal folds
Lateral sections of vibrating vocal folds
Two-mass model of vocal folds
From Stevens, Acoustic Phonetics, The MIT Press, 2000

The LF model of the glottal source
Derivative of the glottal airflow
Looks like the transmitter antenna output: first derivative of a bell-shaped pulse
Introduced by G. Fant et al. in 1985, refined by G. Fant, "The LF-model revisited. Transformations and frequency domain analysis", STL-QPSR, vol. 36, pp. 119-156, 1995
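The "first derivative of a bell-shaped pulse" can be made concrete with a Gaussian pulse, a common choice in Impulse Radio. The sketch below is only an illustration of that analogy, not the LF model itself, whose waveform is asymmetric and defined piecewise. The value of `sigma` and the function names are assumptions.

```python
import math


def gaussian(t, sigma):
    """Bell-shaped pulse exp(-t^2 / (2 sigma^2))."""
    return math.exp(-t * t / (2.0 * sigma * sigma))


def gaussian_derivative(t, sigma):
    """First time derivative of the Gaussian pulse above."""
    return -t / (sigma * sigma) * gaussian(t, sigma)


sigma = 1.0e-3            # pulse width parameter in seconds (illustrative)
dt = sigma / 100          # time step
times = [k * dt for k in range(-500, 501)]
pulse = [gaussian_derivative(t, sigma) for t in times]

# The derivative has one positive and one negative lobe around t = 0.
peak_time = times[pulse.index(max(pulse))]
trough_time = times[pulse.index(min(pulse))]

# The two lobes cancel, so the net area of the derivative pulse is zero.
net_area = sum(pulse) * dt
```

The positive lobe peaks at `-sigma` and the negative lobe at `+sigma`, giving the single-cycle shape that the slide compares with the derivative of the glottal airflow.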

Excitation signal at the glottis [figure]

Excitation signal at the glottis [figure]

Impulse Radio UWB Pulse Position Modulation
Samples m(kTs) of an analog wave m(t), taken every Ts, determine pulse position.
From M.-G. Di Benedetto and G. Giancola, Understanding Ultra Wide Band Radio Fundamentals, Prentice Hall, 2004
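A minimal sketch of how samples of m(t) can set pulse positions. The slide only states that the samples determine the position. The specific rule used here (pulse k emitted at k*Ts + m(k*Ts)), the sinusoidal m(t) and all numeric values are assumptions made for illustration.

```python
import math


def ppm_pulse_times(m, ts, n_pulses):
    """Assumed rule: pulse k is emitted at k*Ts shifted by the sample m(k*Ts)."""
    return [k * ts + m(k * ts) for k in range(n_pulses)]


ts = 10.0e-3          # nominal pulse period Ts in seconds (assumed)
max_shift = 0.5e-3    # peak time shift, kept well below Ts (assumed)


def m(t):
    """Hypothetical modulating wave: a 5 Hz sinusoid scaled to a time shift."""
    return max_shift * math.sin(2.0 * math.pi * 5.0 * t)


pulse_times = ppm_pulse_times(m, ts, 20)

# Deviation of each pulse from its nominal slot k*Ts.
shifts = [t - k * ts for k, t in enumerate(pulse_times)]
```

Keeping the peak shift well below Ts ensures that the pulses stay in their original order.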

Impulse Radio UWB Pulse Position Modulation

$$P_{x_{PPM}}(f) = \frac{|P(f)|^2}{T_s}\left[\,1 - |W(f)|^2 + \frac{|W(f)|^2}{T_s}\sum_{n=-\infty}^{+\infty}\delta\!\left(f - \frac{n}{T_s}\right)\right]$$

where W(f) is the Fourier transform of the probability density w and coincides with the characteristic function of w computed in -2πf:

$$W(f) = \int_{-\infty}^{+\infty} w(s)\, e^{-j 2\pi f s}\, ds = \left\langle e^{-j 2\pi f s} \right\rangle = C(-2\pi f)$$

w(s) is the probability density function of the samples m(kTs) of a stationary continuous process m(t).
From M.-G. Di Benedetto and G. Giancola, Understanding Ultra Wide Band Radio Fundamentals, Prentice Hall, 2004
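In this power spectral density, the spectral lines at multiples of 1/Ts are weighted by |W(f)|^2 and the continuous part by 1 - |W(f)|^2. As a worked example under an assumed distribution, take time shifts uniform on [-a, a], for which W(f) = sin(2πfa) / (2πfa). The sketch below evaluates the line weight at the first five harmonics for several values of a, expressed as a fraction of Ts. The uniform density, the value of Ts and the function names are assumptions, not taken from the slides.

```python
import math


def w_uniform(f, a):
    """W(f) for time shifts uniformly distributed on [-a, a]."""
    x = 2.0 * math.pi * f * a
    return 1.0 if x == 0.0 else math.sin(x) / x


def line_weight(harmonic, ts, a):
    """|W(n/Ts)|^2: weight of the spectral line at the n-th multiple of 1/Ts."""
    return w_uniform(harmonic / ts, a) ** 2


ts = 10.0e-3  # nominal pulse period Ts in seconds (assumed)

# Line weights at harmonics 1..5 for shifts of 0%, 5%, 10% and 30% of Ts.
line_table = {
    fraction: [line_weight(n, ts, fraction * ts) for n in range(1, 6)]
    for fraction in (0.0, 0.05, 0.10, 0.30)
}

for fraction, weights in line_table.items():
    print(f"{fraction:4.0%}: " + " ".join(f"{w:.3f}" for w in weights))
```

With no shift every line keeps its full weight. As the shift range grows, the first harmonic's weight falls and the higher harmonics fade sooner, which means power moves from the spectral lines into the continuous part of the spectrum.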

Impulse Radio UWB Pulse Position Modulation

Experimental evidence
Synthesis of a vowel produced by one male and one female speaker
Regular pulses -> H(z) -> synthetic vowel
Increasing % of pulse jitter: irregular pulses -> H(z) -> synthetic vowel
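The experiment can be sketched as a pulse train, regular or jittered, fed to a vocal tract filter H(z). The slides do not give H(z), the jitter distribution or the pitch, so everything specific below is an assumption: H(z) is a cascade of three two-pole resonators with hypothetical formant values, jitter is uniform within plus or minus the stated fraction of the pitch period, and F0 is 100 Hz at an 8 kHz sampling rate.

```python
import math
import random


def jittered_pulse_train(n_samples, period, jitter, seed=0):
    """Unit pulses at k*period, each shifted uniformly by up to +/- jitter*period."""
    rng = random.Random(seed)
    x = [0.0] * n_samples
    for k in range(n_samples // period):
        shift = rng.uniform(-jitter, jitter) * period
        pos = int(round(k * period + shift))
        if 0 <= pos < n_samples:
            x[pos] += 1.0
    return x


def resonator(x, freq, bandwidth, fs):
    """Two-pole resonator: y[n] = x[n] + c1*y[n-1] + c2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth / fs)
    c1 = 2.0 * r * math.cos(2.0 * math.pi * freq / fs)
    c2 = -r * r
    y, y1, y2 = [], 0.0, 0.0
    for s in x:
        out = s + c1 * y1 + c2 * y2
        y.append(out)
        y2, y1 = y1, out
    return y


def synthesize_vowel(jitter, fs=8000, f0=100, duration=0.5,
                     formants=((500, 80), (1500, 100), (2500, 120))):
    """Pulse train -> cascade of resonators standing in for H(z)."""
    period = fs // f0
    signal = jittered_pulse_train(int(duration * fs), period, jitter)
    for freq, bandwidth in formants:  # hypothetical (frequency, bandwidth) in Hz
        signal = resonator(signal, freq, bandwidth, fs)
    return signal


# Same conditions as the slides: no jitter, then 5%, 10% and 30% jitter.
vowels = {j: synthesize_vowel(j) for j in (0.0, 0.05, 0.10, 0.30)}
```

Each list in `vowels` can be written to a sound file for listening. Without jitter the output repeats exactly every pitch period once the start-up transient has died out, and the jittered versions lose that regularity.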

Experimental results: synthesis of vowel [e], male speaker
Synthetic vowel, no jitter
Synthetic vowel, 5% jitter
Synthetic vowel, 10% jitter
Synthetic vowel, 30% jitter

Experimental results: synthesis of vowel [a], female speaker
Synthetic vowel, no jitter
Synthetic vowel, 5% jitter
Synthetic vowel, 10% jitter
Synthetic vowel, 30% jitter

Conclusion
An example of how UWB theory can help us understand the structure of impulsive physiologically produced signals.
Interesting insights can be derived from what we know about the properties of non-linear modulation in UWB.
Modeling production mechanisms in order to understand basic properties of physiologically produced signals.

Challenging framework
COST Action IC0902: Cognitive Radio and Networking for Cooperative Coexistence of Heterogeneous Wireless Networks
Chair: Maria-Gabriella Di Benedetto
http://newyork.ing.uniroma1.it/ic0902

Economic dimension [map figure]
Countries labelled on the figure: Cyprus, Czech Rep., Ireland, Israel, Latvia, Norway, Romania, Slovenia, Sweden, Turkey
Estimated economic dimension: 44 million euro for the total duration of the Action
Participation of over 30 countries, including COST and non-COST countries
GTTI Meeting 2010, 23 June 2010, Brescia

Challenging framework
EU FP7 Network of Excellence ACROPOLIS: Advanced coexistence technologies for Radio OPtimisatiOn in Licensed and unlicensed Spectrum
October 1, 2010