Slovak University of Technology and Planned Research in Voice De-Identification. Anna Pribilova

Size: px
Start display at page:

Download "Slovak University of Technology and Planned Research in Voice De-Identification. Anna Pribilova"

Transcription

1 Slovak University of Technology and Planned Research in Voice De-Identification Anna Pribilova

2 SLOVAK UNIVERSITY OF TECHNOLOGY IN BRATISLAVA the oldest and the largest university of technology in Slovakia founded: 1938

3 SLOVAK UNIVERSITY OF TECHNOLOGY IN BRATISLAVA Academic year 2012/ study programs students graduates 1 st DEGREE 3 years Bachelor study 52 study programs students graduates 2 nd DEGREE 2 years Master study 65 study programs students graduates 3 rd DEGREE 3 years Doctoral study 69 study programs students 299 graduates 3

4 SLOVAK UNIVERSITY OF TECHNOLOGY IN BRATISLAVA Faculty of Civil Engineering Faculty of Mechanical Engineering Faculty of Electrical Engineering and Information Technology Faculty of Chemical and Food Technology Faculty of Architecture Faculty of Material Sciences and Technology Faculty of Informatics and Information Technologies

5 FACULTY OF ELECTRICAL ENGINEERING AND INFORMATION TECHNOLOGY 1940 Department of Mechanical and Electrical Engineering 1941 Department of Electrical Engineering Faculty of Electrical Engineering 1994 Faculty of Electrical Engineering & Information Technology

6 FACULTY OF ELECTRICAL ENGINEERING AND INFORMATION TECHNOLOGY Institute of Automobile Mechatronics Institute of Computer Science and Mathematics Institute of Control And Industrial Informatics Institute of Electrical Engineering Institute of Electronics and Photonics Institute of Nuclear and Physical Engineering Institute of Power and Applied Electrical Engineering Institute of Telecommunications

7 INSTITUTE OF ELECTRONICS AND PHOTONICS Department of Electronic Devices Department of Electronic Systems Department of Integrated Circuit Design and Test Department of Optoelectronics and Laser technique Department of Sensors And Microsystem Technique Department of Surface, Interface, and Nanostructure Analysis Department of Wireless Communication and Multimedia

8 Department of Wireless Communication and Multimedia MRI of the vocal tract during phonation cooperation with: Institute of Measurement Science, Slovak Academy of Sciences - design, realization, and testing of a new head scanning coil for MRI Esaote E-scan Opera for 3-D modelling of the human vocal tract Simulation of a field of a coil z Practical realization Bz r Results: MRI of the vocal tract

9 Department of Wireless Communication and Multimedia MRI of the vocal tract during phonation cooperation with: Institute of Thermomechanics Academy of Sciences of the Czech Republic involved in COST Action 2103 Advanced Voice Function Assessment 3-D model of the human vocal tract

10 Score [-] Score [-] Department of Wireless Communication and Multimedia Evaluation of Synthetic Speech Quality by the GMM Classifier cooperation with: Department of Cybernetics, Faculty of Applied Sciences, Male, N gmix =1 University of West Bohemia, Plzeň, Czech 5 Republic Type Model Specification 4 1 (h48) harmonic Spectral envelope smoothed by B-Splines 2 (i80) cepstral Impulse response of the real cepstrum 1 3 (k64) cepstral Minimum phase of real cepstrum, mixed excitation 4 (s64) cepstral Real cepstrum, excitation by Hilbert impulse 5 (o50) cepstral Optimized structure of cascade approximation filter 3 2 h48 i80 k64 s64 o50 Resynthesized speech with different types of synthesis and parameterization Speech features analysis Features vector F(1) F(8) F(16) Trained on the original speech GMM classifier T, i male/female GMM models N-level Score discriminator N output classes N N-1 N Male r, N gmix =1 h48 i80 k64 s64 o50

11 Occurence [%] C1 Shimmer Correctness [%] Occurence [%] SFM SHE Occurence [%] R0ener HNR Department of Wireless Communication and Multimedia Detection and Localization of Artifacts in Synthetic Speech based on ANOVA Statistics cooperation with: Department of Cybernetics, Faculty of Applied Sciences, University of West Bohemia, Plzeň, Czech Republic Synt Synt Synt1 Synt1 HNR Synt Synt Synt1 Synt1 SHE Synt Histograms Synt Statistics Synt1 Synt1 Artf Artf v1o v2o v3o v4o v5o v6o v7o v8o v9o v10o Artifact Classification per Speech Features Speech feature set: 2 => "" - female M R Cener R0ener RE SHE F0DIFF ZCR F0K Jitter Shimmer Accuracy of Artifact Detection per Speech Features Speech feature set: 2 => "Artf" - female Speech feature set: 2 M R - female M R Artf v1s v2s v3s v4s v5s v6s v7s v8s v9s v10s Synt1 Synt1 Shimmer 0 Synt1 Cener 0 R0ener RE SHE F0DIFF ZCR F0K Jitter Shimmer Cener R0ener RE SHE F0DIFF ZCR F0K Jitter Shimmer All

12 De-identification for privacy protection in multimedia content our area of interest: de-identification of voice and face de-identification of personal health information following on cooperation with the Czech group: COST 258 The Naturalness of Synthetic Speech ( ) COST 277 Non-Linear Speech Processing ( ) our participation: COST 2102 Cross-Modal Analysis of Verbal and Non-verbal Communication ( )

13 De-identification of voice as a behavioural and soft-biometric identifier (speaker de-identification by voice conversion) investigation of reversibility of voice conversion - inverse voice transformation of converted speech (using parameters reciprocal to voice conversion) - utilization of our former work (male voice conversion to female, young male, child) evaluation of identification of converted voice - speaker identification based on GMMs

14 De-identification of voice as a behavioural and soft-biometric identifier (age and gender de-identification by voice conversion) voice conversion with preserved naturalness and intelligibility - change of apparent age of a speaker - change of apparent gender of a speaker evaluation of degree of similarity between converted and original voice - choice of segmental and suprasegmental features - choice of a classifier

15 De-identification of voice as a behavioural and soft-biometric identifier (speaker de-identification by emotional voice conversion) emotional voice conversion with preserved naturalness and intellibility - change of apparent emotional state of a speaker (our involvement in the former COST Action 2102) - possible change of apparent speaker identity evaluation of speaker identification after emotional conversion of his/her voice - choice of segmental and suprasegmental features for GMM speaker identification

16 De-identification of voice as a behavioural and soft-biometric identifier (speaker de-identification by combination of gender conversion and emotional voice conversion) male to female voice transformation (and vice versa) - spectrum envelope modification - F0 modification at the same time: emotional voice conversion - spectral modification - prosodic modification

17 Current work: Emotion conversion and speaker re-identification Neutral-to-emotional voice conversion Input sentence (in Neutral style) Spectral properties modification {F n } SFM Male / female voice Target sentences Joyous style Speech analysis and parameterization Joy Surprise Sadness Speech resynthesis Surprised style Sad style Modification of prosodic parameters F0 En DUR Anger Cepstral / harmonic speech model Angry style

18 Emotion conversion and speaker re-identification Determination of feature vectors of emotional speech Input sentence Segmentation F0 determination Smoothed spect. envelope Power spectral density {PSD db } F0 E(f) Speech signal analysis Suprasegmental parameters Basic spectral properties Supplement. spectral properties {F0 DIFF } {Jitter} {Shimmer}, etc. {F 1,2,3 } {F 12 ratio} {Tilt}, etc. {HNR} {SFM} {SC}, etc. Statistical processing Mean Median Min Std Rel. min Rel. max Skewness Kurtosis Output feature vector F ( 1) F ( 8) F ( N F ) N feat

19 Emotion conversion and speaker re-identification GMM emotion classification of emotionally converted speech trained on real emotional speech Tested sentence with transformed emotional speech styles Resynthetized sentence in neutral speech style for comparison only Speech features analysis Features vector F(1) F(8) F(16) GMM emotion classifier GMM models for male/female Trained data of the original speech in neutral and emotional styles Output emotion classes Neutral Joy Surprise Sadness Anger

20 Emotion conversion and speaker re-identification GMM speaker identification of emotionally converted speech trained on real neutral speech Tested sentence with transformed emotional speech styles Features vector F( 1) N gmix N-level N output speaker classes Speaker N Speech features analysis F ( 8) GMM speaker recognizer T, i Score discriminator Speaker N-1 F ( N F ) Trained data of the original speech in neutral style Male / female GMM speaker models Speaker 2 Speaker 1

21 Thank you for your attention.

Introducing COVAREP: A collaborative voice analysis repository for speech technologies

Introducing COVAREP: A collaborative voice analysis repository for speech technologies Introducing COVAREP: A collaborative voice analysis repository for speech technologies John Kane Wednesday November 27th, 2013 SIGMEDIA-group TCD COVAREP - Open-source speech processing repository 1 Introduction

More information

Influence of the human body mass in the open air MRI on acoustic noise spectrum

Influence of the human body mass in the open air MRI on acoustic noise spectrum ACTA IMEKO ISSN: 1 87X November 16, Volume, Number 3, 81 86 Influence of the human body mass in the open air MRI on acoustic noise spectrum Jiří Přibil 1, Anna Přibilová, Ivan Frollo 1 1 Institute of Measurement

More information

Applications of Music Processing

Applications of Music Processing Lecture Music Processing Applications of Music Processing Christian Dittmar International Audio Laboratories Erlangen christian.dittmar@audiolabs-erlangen.de Singing Voice Detection Important pre-requisite

More information

Quantification of glottal and voiced speech harmonicsto-noise ratios using cepstral-based estimation

Quantification of glottal and voiced speech harmonicsto-noise ratios using cepstral-based estimation Quantification of glottal and voiced speech harmonicsto-noise ratios using cepstral-based estimation Peter J. Murphy and Olatunji O. Akande, Department of Electronic and Computer Engineering University

More information

SOUND SOURCE RECOGNITION AND MODELING

SOUND SOURCE RECOGNITION AND MODELING SOUND SOURCE RECOGNITION AND MODELING CASA seminar, summer 2000 Antti Eronen antti.eronen@tut.fi Contents: Basics of human sound source recognition Timbre Voice recognition Recognition of environmental

More information

Aspiration Noise during Phonation: Synthesis, Analysis, and Pitch-Scale Modification. Daryush Mehta

Aspiration Noise during Phonation: Synthesis, Analysis, and Pitch-Scale Modification. Daryush Mehta Aspiration Noise during Phonation: Synthesis, Analysis, and Pitch-Scale Modification Daryush Mehta SHBT 03 Research Advisor: Thomas F. Quatieri Speech and Hearing Biosciences and Technology 1 Summary Studied

More information

Between physics and perception signal models for high level audio processing. Axel Röbel. Analysis / synthesis team, IRCAM. DAFx 2010 iem Graz

Between physics and perception signal models for high level audio processing. Axel Röbel. Analysis / synthesis team, IRCAM. DAFx 2010 iem Graz Between physics and perception signal models for high level audio processing Axel Röbel Analysis / synthesis team, IRCAM DAFx 2010 iem Graz Overview Introduction High level control of signal transformation

More information

Feasibility of Vocal Emotion Conversion on Modulation Spectrogram for Simulated Cochlear Implants

Feasibility of Vocal Emotion Conversion on Modulation Spectrogram for Simulated Cochlear Implants Feasibility of Vocal Emotion Conversion on Modulation Spectrogram for Simulated Cochlear Implants Zhi Zhu, Ryota Miyauchi, Yukiko Araki, and Masashi Unoki School of Information Science, Japan Advanced

More information

Envelope Modulation Spectrum (EMS)

Envelope Modulation Spectrum (EMS) Envelope Modulation Spectrum (EMS) The Envelope Modulation Spectrum (EMS) is a representation of the slow amplitude modulations in a signal and the distribution of energy in the amplitude fluctuations

More information

Performance study of Text-independent Speaker identification system using MFCC & IMFCC for Telephone and Microphone Speeches

Performance study of Text-independent Speaker identification system using MFCC & IMFCC for Telephone and Microphone Speeches Performance study of Text-independent Speaker identification system using & I for Telephone and Microphone Speeches Ruchi Chaudhary, National Technical Research Organization Abstract: A state-of-the-art

More information

Determination of instants of significant excitation in speech using Hilbert envelope and group delay function

Determination of instants of significant excitation in speech using Hilbert envelope and group delay function Determination of instants of significant excitation in speech using Hilbert envelope and group delay function by K. Sreenivasa Rao, S. R. M. Prasanna, B.Yegnanarayana in IEEE Signal Processing Letters,

More information

Epoch Extraction From Emotional Speech

Epoch Extraction From Emotional Speech Epoch Extraction From al Speech D Govind and S R M Prasanna Department of Electronics and Electrical Engineering Indian Institute of Technology Guwahati Email:{dgovind,prasanna}@iitg.ernet.in Abstract

More information

Singing Voice Detection. Applications of Music Processing. Singing Voice Detection. Singing Voice Detection. Singing Voice Detection

Singing Voice Detection. Applications of Music Processing. Singing Voice Detection. Singing Voice Detection. Singing Voice Detection Detection Lecture usic Processing Applications of usic Processing Christian Dittmar International Audio Laboratories Erlangen christian.dittmar@audiolabs-erlangen.de Important pre-requisite for: usic segmentation

More information

speech signal S(n). This involves a transformation of S(n) into another signal or a set of signals

speech signal S(n). This involves a transformation of S(n) into another signal or a set of signals 16 3. SPEECH ANALYSIS 3.1 INTRODUCTION TO SPEECH ANALYSIS Many speech processing [22] applications exploits speech production and perception to accomplish speech analysis. By speech analysis we extract

More information

Advanced audio analysis. Martin Gasser

Advanced audio analysis. Martin Gasser Advanced audio analysis Martin Gasser Motivation Which methods are common in MIR research? How can we parameterize audio signals? Interesting dimensions of audio: Spectral/ time/melody structure, high

More information

USING A WHITE NOISE SOURCE TO CHARACTERIZE A GLOTTAL SOURCE WAVEFORM FOR IMPLEMENTATION IN A SPEECH SYNTHESIS SYSTEM

USING A WHITE NOISE SOURCE TO CHARACTERIZE A GLOTTAL SOURCE WAVEFORM FOR IMPLEMENTATION IN A SPEECH SYNTHESIS SYSTEM USING A WHITE NOISE SOURCE TO CHARACTERIZE A GLOTTAL SOURCE WAVEFORM FOR IMPLEMENTATION IN A SPEECH SYNTHESIS SYSTEM by Brandon R. Graham A report submitted in partial fulfillment of the requirements for

More information

Recording and post-processing speech signals from magnetic resonance imaging experiments

Recording and post-processing speech signals from magnetic resonance imaging experiments Recording and post-processing speech signals from magnetic resonance imaging experiments Theoretical and practical approach Juha Kuortti and Jarmo Malinen November 28, 2017 Aalto University juha.kuortti@aalto.fi,

More information

Speech/Music Discrimination via Energy Density Analysis

Speech/Music Discrimination via Energy Density Analysis Speech/Music Discrimination via Energy Density Analysis Stanis law Kacprzak and Mariusz Zió lko Department of Electronics, AGH University of Science and Technology al. Mickiewicza 30, Kraków, Poland {skacprza,

More information

Speech Synthesis; Pitch Detection and Vocoders

Speech Synthesis; Pitch Detection and Vocoders Speech Synthesis; Pitch Detection and Vocoders Tai-Shih Chi ( 冀泰石 ) Department of Communication Engineering National Chiao Tung University May. 29, 2008 Speech Synthesis Basic components of the text-to-speech

More information

Glottal source model selection for stationary singing-voice by low-band envelope matching

Glottal source model selection for stationary singing-voice by low-band envelope matching Glottal source model selection for stationary singing-voice by low-band envelope matching Fernando Villavicencio Yamaha Corporation, Corporate Research & Development Center, 3 Matsunokijima, Iwata, Shizuoka,

More information

Hungarian Speech Synthesis Using a Phase Exact HNM Approach

Hungarian Speech Synthesis Using a Phase Exact HNM Approach Hungarian Speech Synthesis Using a Phase Exact HNM Approach Kornél Kovács 1, András Kocsor 2, and László Tóth 3 Research Group on Artificial Intelligence of the Hungarian Academy of Sciences and University

More information

Analysis of Influence of Coil Gradient System on Vibration Properties and Acoustic Noise Level Generated by the Low Field MRI Device

Analysis of Influence of Coil Gradient System on Vibration Properties and Acoustic Noise Level Generated by the Low Field MRI Device Analysis of Influence of Coil Gradient System on Vibration Properties and Acoustic Noise Level Generated by the Low Field MRI Device Jiří Přibil 1, Anna Přibilová 2, Ivan Frollo 1 1 Institute of Measurement

More information

Mel Spectrum Analysis of Speech Recognition using Single Microphone

Mel Spectrum Analysis of Speech Recognition using Single Microphone International Journal of Engineering Research in Electronics and Communication Mel Spectrum Analysis of Speech Recognition using Single Microphone [1] Lakshmi S.A, [2] Cholavendan M [1] PG Scholar, Sree

More information

AN AUDIO SEPARATION SYSTEM BASED ON THE NEURAL ICA METHOD

AN AUDIO SEPARATION SYSTEM BASED ON THE NEURAL ICA METHOD AN AUDIO SEPARATION SYSTEM BASED ON THE NEURAL ICA METHOD MICHAL BRÁT, MIROSLAV ŠNOREK Czech Technical University in Prague Faculty of Electrical Engineering Department of Computer Science and Engineering

More information

Resonance and resonators

Resonance and resonators Resonance and resonators Dr. Christian DiCanio cdicanio@buffalo.edu University at Buffalo 10/13/15 DiCanio (UB) Resonance 10/13/15 1 / 27 Harmonics Harmonics and Resonance An example... Suppose you are

More information

Project 0: Part 2 A second hands-on lab on Speech Processing Frequency-domain processing

Project 0: Part 2 A second hands-on lab on Speech Processing Frequency-domain processing Project : Part 2 A second hands-on lab on Speech Processing Frequency-domain processing February 24, 217 During this lab, you will have a first contact on frequency domain analysis of speech signals. You

More information

Voice Conversion of Non-aligned Data using Unit Selection

Voice Conversion of Non-aligned Data using Unit Selection June 19 21, 2006 Barcelona, Spain TC-STAR Workshop on Speech-to-Speech Translation Voice Conversion of Non-aligned Data using Unit Selection Helenca Duxans, Daniel Erro, Javier Pérez, Ferran Diego, Antonio

More information

Effects of Reverberation on Pitch, Onset/Offset, and Binaural Cues

Effects of Reverberation on Pitch, Onset/Offset, and Binaural Cues Effects of Reverberation on Pitch, Onset/Offset, and Binaural Cues DeLiang Wang Perception & Neurodynamics Lab The Ohio State University Outline of presentation Introduction Human performance Reverberation

More information

Isolated Digit Recognition Using MFCC AND DTW

Isolated Digit Recognition Using MFCC AND DTW MarutiLimkar a, RamaRao b & VidyaSagvekar c a Terna collegeof Engineering, Department of Electronics Engineering, Mumbai University, India b Vidyalankar Institute of Technology, Department ofelectronics

More information

Adaptive Filters Application of Linear Prediction

Adaptive Filters Application of Linear Prediction Adaptive Filters Application of Linear Prediction Gerhard Schmidt Christian-Albrechts-Universität zu Kiel Faculty of Engineering Electrical Engineering and Information Technology Digital Signal Processing

More information

Fundamental Frequency Detection

Fundamental Frequency Detection Fundamental Frequency Detection Jan Černocký, Valentina Hubeika {cernocky ihubeika}@fit.vutbr.cz DCGM FIT BUT Brno Fundamental Frequency Detection Jan Černocký, Valentina Hubeika, DCGM FIT BUT Brno 1/37

More information

DERIVATION OF TRAPS IN AUDITORY DOMAIN

DERIVATION OF TRAPS IN AUDITORY DOMAIN DERIVATION OF TRAPS IN AUDITORY DOMAIN Petr Motlíček, Doctoral Degree Programme (4) Dept. of Computer Graphics and Multimedia, FIT, BUT E-mail: motlicek@fit.vutbr.cz Supervised by: Dr. Jan Černocký, Prof.

More information

Can binary masks improve intelligibility?

Can binary masks improve intelligibility? Can binary masks improve intelligibility? Mike Brookes (Imperial College London) & Mark Huckvale (University College London) Apparently so... 2 How does it work? 3 Time-frequency grid of local SNR + +

More information

Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter

Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter Ching-Ta Lu, Kun-Fu Tseng 2, Chih-Tsung Chen 2 Department of Information Communication, Asia University, Taichung, Taiwan, ROC

More information

ASPIRATION NOISE DURING PHONATION: SYNTHESIS, ANALYSIS, AND PITCH-SCALE MODIFICATION DARYUSH MEHTA

ASPIRATION NOISE DURING PHONATION: SYNTHESIS, ANALYSIS, AND PITCH-SCALE MODIFICATION DARYUSH MEHTA ASPIRATION NOISE DURING PHONATION: SYNTHESIS, ANALYSIS, AND PITCH-SCALE MODIFICATION by DARYUSH MEHTA B.S., Electrical Engineering (23) University of Florida SUBMITTED TO THE DEPARTMENT OF ELECTRICAL ENGINEERING

More information

L19: Prosodic modification of speech

L19: Prosodic modification of speech L19: Prosodic modification of speech Time-domain pitch synchronous overlap add (TD-PSOLA) Linear-prediction PSOLA Frequency-domain PSOLA Sinusoidal models Harmonic + noise models STRAIGHT This lecture

More information

Text Emotion Detection using Neural Network

Text Emotion Detection using Neural Network International Journal of Engineering Research and Technology. ISSN 0974-3154 Volume 7, Number 2 (2014), pp. 153-159 International Research Publication House http://www.irphouse.com Text Emotion Detection

More information

A multi-class method for detecting audio events in news broadcasts

A multi-class method for detecting audio events in news broadcasts A multi-class method for detecting audio events in news broadcasts Sergios Petridis, Theodoros Giannakopoulos, and Stavros Perantonis Computational Intelligence Laboratory, Institute of Informatics and

More information

E : Lecture 8 Source-Filter Processing. E : Lecture 8 Source-Filter Processing / 21

E : Lecture 8 Source-Filter Processing. E : Lecture 8 Source-Filter Processing / 21 E85.267: Lecture 8 Source-Filter Processing E85.267: Lecture 8 Source-Filter Processing 21-4-1 1 / 21 Source-filter analysis/synthesis n f Spectral envelope Spectral envelope Analysis Source signal n 1

More information

CHAPTER 3. ACOUSTIC MEASURES OF GLOTTAL CHARACTERISTICS 39 and from periodic glottal sources (Shadle, 1985; Stevens, 1993). The ratio of the amplitude of the harmonics at 3 khz to the noise amplitude in

More information

Voice Activity Detection

Voice Activity Detection Voice Activity Detection Speech Processing Tom Bäckström Aalto University October 2015 Introduction Voice activity detection (VAD) (or speech activity detection, or speech detection) refers to a class

More information

Is it possible to shape a university's research portfolio and to develop synergies between top down and bottom up approaches?

Is it possible to shape a university's research portfolio and to develop synergies between top down and bottom up approaches? Is it possible to shape a university's research portfolio and to develop synergies between top down and bottom up approaches? Peter Skalicky, Rector Vienna University of Technology Conference of Rectors

More information

INTRODUCTION TO ACOUSTIC PHONETICS 2 Hilary Term, week 6 22 February 2006

INTRODUCTION TO ACOUSTIC PHONETICS 2 Hilary Term, week 6 22 February 2006 1. Resonators and Filters INTRODUCTION TO ACOUSTIC PHONETICS 2 Hilary Term, week 6 22 February 2006 Different vibrating objects are tuned to specific frequencies; these frequencies at which a particular

More information

Design and Analysis of New Digital Modulation classification method

Design and Analysis of New Digital Modulation classification method Design and Analysis of New Digital Modulation classification method ANNA KUBANKOVA Department of Telecommunications Brno University of Technology Purkynova 118, 612 00 Brno CZECH REPUBLIC shklya@feec.vutbr.cz

More information

Speech Signal Analysis

Speech Signal Analysis Speech Signal Analysis Hiroshi Shimodaira and Steve Renals Automatic Speech Recognition ASR Lectures 2&3 14,18 January 216 ASR Lectures 2&3 Speech Signal Analysis 1 Overview Speech Signal Analysis for

More information

X. SPEECH ANALYSIS. Prof. M. Halle G. W. Hughes H. J. Jacobsen A. I. Engel F. Poza A. VOWEL IDENTIFIER

X. SPEECH ANALYSIS. Prof. M. Halle G. W. Hughes H. J. Jacobsen A. I. Engel F. Poza A. VOWEL IDENTIFIER X. SPEECH ANALYSIS Prof. M. Halle G. W. Hughes H. J. Jacobsen A. I. Engel F. Poza A. VOWEL IDENTIFIER Most vowel identifiers constructed in the past were designed on the principle of "pattern matching";

More information

PR No. 119 DIGITAL SIGNAL PROCESSING XVIII. Academic Research Staff. Prof. Alan V. Oppenheim Prof. James H. McClellan.

PR No. 119 DIGITAL SIGNAL PROCESSING XVIII. Academic Research Staff. Prof. Alan V. Oppenheim Prof. James H. McClellan. XVIII. DIGITAL SIGNAL PROCESSING Academic Research Staff Prof. Alan V. Oppenheim Prof. James H. McClellan Graduate Students Bir Bhanu Gary E. Kopec Thomas F. Quatieri, Jr. Patrick W. Bosshart Jae S. Lim

More information

The Channel Vocoder (analyzer):

The Channel Vocoder (analyzer): Vocoders 1 The Channel Vocoder (analyzer): The channel vocoder employs a bank of bandpass filters, Each having a bandwidth between 100 Hz and 300 Hz. Typically, 16-20 linear phase FIR filter are used.

More information

SPEECH ENHANCEMENT USING PITCH DETECTION APPROACH FOR NOISY ENVIRONMENT

SPEECH ENHANCEMENT USING PITCH DETECTION APPROACH FOR NOISY ENVIRONMENT SPEECH ENHANCEMENT USING PITCH DETECTION APPROACH FOR NOISY ENVIRONMENT RASHMI MAKHIJANI Department of CSE, G. H. R.C.E., Near CRPF Campus,Hingna Road, Nagpur, Maharashtra, India rashmi.makhijani2002@gmail.com

More information

A Method for Voiced/Unvoiced Classification of Noisy Speech by Analyzing Time-Domain Features of Spectrogram Image

A Method for Voiced/Unvoiced Classification of Noisy Speech by Analyzing Time-Domain Features of Spectrogram Image Science Journal of Circuits, Systems and Signal Processing 2017; 6(2): 11-17 http://www.sciencepublishinggroup.com/j/cssp doi: 10.11648/j.cssp.20170602.12 ISSN: 2326-9065 (Print); ISSN: 2326-9073 (Online)

More information

International Journal of Modern Trends in Engineering and Research e-issn No.: , Date: 2-4 July, 2015

International Journal of Modern Trends in Engineering and Research   e-issn No.: , Date: 2-4 July, 2015 International Journal of Modern Trends in Engineering and Research www.ijmter.com e-issn No.:2349-9745, Date: 2-4 July, 2015 Analysis of Speech Signal Using Graphic User Interface Solly Joy 1, Savitha

More information

A Physiologically Produced Impulsive UWB signal: Speech

A Physiologically Produced Impulsive UWB signal: Speech A Physiologically Produced Impulsive UWB signal: Speech Maria-Gabriella Di Benedetto University of Rome La Sapienza Faculty of Engineering Rome, Italy gaby@acts.ing.uniroma1.it http://acts.ing.uniroma1.it

More information

Associated Emotion and its Expression in an Entertainment Robot QRIO

Associated Emotion and its Expression in an Entertainment Robot QRIO Associated Emotion and its Expression in an Entertainment Robot QRIO Fumihide Tanaka 1. Kuniaki Noda 1. Tsutomu Sawada 2. Masahiro Fujita 1.2. 1. Life Dynamics Laboratory Preparatory Office, Sony Corporation,

More information

Dimension Reduction of the Modulation Spectrogram for Speaker Verification

Dimension Reduction of the Modulation Spectrogram for Speaker Verification Dimension Reduction of the Modulation Spectrogram for Speaker Verification Tomi Kinnunen Speech and Image Processing Unit Department of Computer Science University of Joensuu, Finland Kong Aik Lee and

More information

SPEECH AND SPECTRAL ANALYSIS

SPEECH AND SPECTRAL ANALYSIS SPEECH AND SPECTRAL ANALYSIS 1 Sound waves: production in general: acoustic interference vibration (carried by some propagation medium) variations in air pressure speech: actions of the articulatory organs

More information

Separating Voiced Segments from Music File using MFCC, ZCR and GMM

Separating Voiced Segments from Music File using MFCC, ZCR and GMM Separating Voiced Segments from Music File using MFCC, ZCR and GMM Mr. Prashant P. Zirmite 1, Mr. Mahesh K. Patil 2, Mr. Santosh P. Salgar 3,Mr. Veeresh M. Metigoudar 4 1,2,3,4Assistant Professor, Dept.

More information

Psychology of Language

Psychology of Language PSYCH 150 / LIN 155 UCI COGNITIVE SCIENCES syn lab Psychology of Language Prof. Jon Sprouse 01.10.13: The Mental Representation of Speech Sounds 1 A logical organization For clarity s sake, we ll organize

More information

Digital Signal Processing

Digital Signal Processing COMP ENG 4TL4: Digital Signal Processing Notes for Lecture #27 Tuesday, November 11, 23 6. SPECTRAL ANALYSIS AND ESTIMATION 6.1 Introduction to Spectral Analysis and Estimation The discrete-time Fourier

More information

THE TELECOMMUNICATIONS industry is going

THE TELECOMMUNICATIONS industry is going IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 14, NO. 6, NOVEMBER 2006 1935 Single-Ended Speech Quality Measurement Using Machine Learning Methods Tiago H. Falk, Student Member, IEEE,

More information

Speech Synthesis using Mel-Cepstral Coefficient Feature

Speech Synthesis using Mel-Cepstral Coefficient Feature Speech Synthesis using Mel-Cepstral Coefficient Feature By Lu Wang Senior Thesis in Electrical Engineering University of Illinois at Urbana-Champaign Advisor: Professor Mark Hasegawa-Johnson May 2018 Abstract

More information

Overview of Code Excited Linear Predictive Coder

Overview of Code Excited Linear Predictive Coder Overview of Code Excited Linear Predictive Coder Minal Mulye 1, Sonal Jagtap 2 1 PG Student, 2 Assistant Professor, Department of E&TC, Smt. Kashibai Navale College of Engg, Pune, India Abstract Advances

More information

SIMULATION VOICE RECOGNITION SYSTEM FOR CONTROLING ROBOTIC APPLICATIONS

SIMULATION VOICE RECOGNITION SYSTEM FOR CONTROLING ROBOTIC APPLICATIONS SIMULATION VOICE RECOGNITION SYSTEM FOR CONTROLING ROBOTIC APPLICATIONS 1 WAHYU KUSUMA R., 2 PRINCE BRAVE GUHYAPATI V 1 Computer Laboratory Staff., Department of Information Systems, Gunadarma University,

More information

Subtractive Synthesis & Formant Synthesis

Subtractive Synthesis & Formant Synthesis Subtractive Synthesis & Formant Synthesis Prof Eduardo R Miranda Varèse-Gastprofessor eduardo.miranda@btinternet.com Electronic Music Studio TU Berlin Institute of Communications Research http://www.kgw.tu-berlin.de/

More information

SPEech Feature Toolbox (SPEFT) Design and Emotional Speech Feature Extraction

SPEech Feature Toolbox (SPEFT) Design and Emotional Speech Feature Extraction SPEech Feature Toolbox (SPEFT) Design and Emotional Speech Feature Extraction by Xi Li A thesis submitted to the Faculty of Graduate School, Marquette University, in Partial Fulfillment of the Requirements

More information

Visvesvaraya Technological University, Belagavi

Visvesvaraya Technological University, Belagavi Time Table for M.TECH. Examinations, June / July 2017 M. TECH. 2010 Scheme 2011 Scheme 2012 Scheme 2014 Scheme 2016 Scheme [CBCS] Semester I II III I II III I II III I II IV I II Time Date, Day 14/06/2017,

More information

Vocoder (LPC) Analysis by Variation of Input Parameters and Signals

Vocoder (LPC) Analysis by Variation of Input Parameters and Signals ISCA Journal of Engineering Sciences ISCA J. Engineering Sci. Vocoder (LPC) Analysis by Variation of Input Parameters and Signals Abstract Gupta Rajani, Mehta Alok K. and Tiwari Vebhav Truba College of

More information

Impact Impact Factor Impact Factor Impact Factor Bil Journal Title eissn NO. Factor Applied Physics Letter

Impact Impact Factor Impact Factor Impact Factor Bil Journal Title eissn NO. Factor Applied Physics Letter Langganan Pangkalan Data xplore Kegunaan: FKEE, FSKKP dan lainlain Fakulti serta Jabatan Berkaitan Subjek: Elektrik, Electronik, Computing Akses: 1913 terkini Impact Impact Factor Impact Factor Impact

More information

A Two-step Technique for MRI Audio Enhancement Using Dictionary Learning and Wavelet Packet Analysis

A Two-step Technique for MRI Audio Enhancement Using Dictionary Learning and Wavelet Packet Analysis A Two-step Technique for MRI Audio Enhancement Using Dictionary Learning and Wavelet Packet Analysis Colin Vaz, Vikram Ramanarayanan, and Shrikanth Narayanan USC SAIL Lab INTERSPEECH Articulatory Data

More information

Structure of Speech. Physical acoustics Time-domain representation Frequency domain representation Sound shaping

Structure of Speech. Physical acoustics Time-domain representation Frequency domain representation Sound shaping Structure of Speech Physical acoustics Time-domain representation Frequency domain representation Sound shaping Speech acoustics Source-Filter Theory Speech Source characteristics Speech Filter characteristics

More information

Pitch-Scaled Estimation of Simultaneous Voiced and Turbulence-Noise Components in Speech

Pitch-Scaled Estimation of Simultaneous Voiced and Turbulence-Noise Components in Speech IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 9, NO. 7, OCTOBER 2001 713 Pitch-Scaled Estimation of Simultaneous Voiced and Turbulence-Noise Components in Speech Philip J. B. Jackson, Member,

More information

ScienceDirect. Unsupervised Speech Segregation Using Pitch Information and Time Frequency Masking

ScienceDirect. Unsupervised Speech Segregation Using Pitch Information and Time Frequency Masking Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 46 (2015 ) 122 126 International Conference on Information and Communication Technologies (ICICT 2014) Unsupervised Speech

More information

A Comparative Performance of Various Speech Analysis-Synthesis Techniques

A Comparative Performance of Various Speech Analysis-Synthesis Techniques International Journal of Signal Processing Systems Vol. 2, No. 1 June 2014 A Comparative Performance of Various Speech Analysis-Synthesis Techniques Ankita N. Chadha, Jagannath H. Nirmal, and Pramod Kachare

More information

Sound Synthesis Methods

Sound Synthesis Methods Sound Synthesis Methods Matti Vihola, mvihola@cs.tut.fi 23rd August 2001 1 Objectives The objective of sound synthesis is to create sounds that are Musically interesting Preferably realistic (sounds like

More information

Bandwidth Extension for Speech Enhancement

Bandwidth Extension for Speech Enhancement Bandwidth Extension for Speech Enhancement F. Mustiere, M. Bouchard, M. Bolic University of Ottawa Tuesday, May 4 th 2010 CCECE 2010: Signal and Multimedia Processing 1 2 3 4 Current Topic 1 2 3 4 Context

More information

Pattern Recognition. Part 6: Bandwidth Extension. Gerhard Schmidt

Pattern Recognition. Part 6: Bandwidth Extension. Gerhard Schmidt Pattern Recognition Part 6: Gerhard Schmidt Christian-Albrechts-Universität zu Kiel Faculty of Engineering Institute of Electrical and Information Engineering Digital Signal Processing and System Theory

More information

REpeating Pattern Extraction Technique (REPET)

REpeating Pattern Extraction Technique (REPET) REpeating Pattern Extraction Technique (REPET) EECS 32: Machine Perception of Music & Audio Zafar RAFII, Spring 22 Repetition Repetition is a fundamental element in generating and perceiving structure

More information

Learning Human Context through Unobtrusive Methods

Learning Human Context through Unobtrusive Methods Learning Human Context through Unobtrusive Methods WINLAB, Rutgers University We care about our contexts Glasses Meeting Vigo: your first energy meter Watch Necklace Wristband Fitbit: Get Fit, Sleep Better,

More information

Book Chapters. Refereed Journal Publications J11

Book Chapters. Refereed Journal Publications J11 Book Chapters B2 B1 A. Mouchtaris and P. Tsakalides, Low Bitrate Coding of Spot Audio Signals for Interactive and Immersive Audio Applications, in New Directions in Intelligent Interactive Multimedia,

More information

Correspondence. Cepstrum-Based Pitch Detection Using a New Statistical V/UV Classification Algorithm

Correspondence. Cepstrum-Based Pitch Detection Using a New Statistical V/UV Classification Algorithm IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 7, NO. 3, MAY 1999 333 Correspondence Cepstrum-Based Pitch Detection Using a New Statistical V/UV Classification Algorithm Sassan Ahmadi and Andreas

More information

Phoneme Recognizer Based Verification of the Voice Segment Type Determination

Phoneme Recognizer Based Verification of the Voice Segment Type Determination honeme Recognizer Based Verification of the Voice Segment Type Determination OLDŘICH HORÁK Faculty of Economics and Administration, Institute of System Engineering and Informatics University of ardubice

More information

Electronics and TELECOMMUNICATIONS- AUTOMATION & CONTROL SYSTEMS GENERAL

Electronics and TELECOMMUNICATIONS- AUTOMATION & CONTROL SYSTEMS GENERAL Electronics and TELECOMMUNICATIONS- AUTOMATION & CONTROL SYSTEMS Journals List " " GENERAL Title ISSN Impact Factor ISSU IEEE T PATTERN ANAL 0162-8828 3.579 IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS

More information

Parameterization of the glottal source with the phase plane plot

Parameterization of the glottal source with the phase plane plot INTERSPEECH 2014 Parameterization of the glottal source with the phase plane plot Manu Airaksinen, Paavo Alku Department of Signal Processing and Acoustics, Aalto University, Finland manu.airaksinen@aalto.fi,

More information

Audio Signal Compression using DCT and LPC Techniques

Audio Signal Compression using DCT and LPC Techniques Audio Signal Compression using DCT and LPC Techniques P. Sandhya Rani#1, D.Nanaji#2, V.Ramesh#3,K.V.S. Kiran#4 #Student, Department of ECE, Lendi Institute Of Engineering And Technology, Vizianagaram,

More information

VOICE QUALITY SYNTHESIS WITH THE BANDWIDTH ENHANCED SINUSOIDAL MODEL

VOICE QUALITY SYNTHESIS WITH THE BANDWIDTH ENHANCED SINUSOIDAL MODEL VOICE QUALITY SYNTHESIS WITH THE BANDWIDTH ENHANCED SINUSOIDAL MODEL Narsimh Kamath Vishweshwara Rao Preeti Rao NIT Karnataka EE Dept, IIT-Bombay EE Dept, IIT-Bombay narsimh@gmail.com vishu@ee.iitb.ac.in

More information

Section 1.5 Graphs and Describing Distributions

Section 1.5 Graphs and Describing Distributions Section 1.5 Graphs and Describing Distributions Data can be displayed using graphs. Some of the most common graphs used in statistics are: Bar graph Pie Chart Dot plot Histogram Stem and leaf plot Box

More information

MFCC AND GMM BASED TAMIL LANGUAGE SPEAKER IDENTIFICATION SYSTEM

MFCC AND GMM BASED TAMIL LANGUAGE SPEAKER IDENTIFICATION SYSTEM www.advancejournals.org Open Access Scientific Publisher MFCC AND GMM BASED TAMIL LANGUAGE SPEAKER IDENTIFICATION SYSTEM ABSTRACT- P. Santhiya 1, T. Jayasankar 1 1 AUT (BIT campus), Tiruchirappalli, India

More information

Robust Algorithms For Speech Reconstruction On Mobile Devices

Robust Algorithms For Speech Reconstruction On Mobile Devices Robust Algorithms For Speech Reconstruction On Mobile Devices XU SHAO A Thesis presented for the degree of Doctor of Philosophy Speech Group School of Computing Sciences University of East Anglia England

More information

Chapter IV THEORY OF CELP CODING

Chapter IV THEORY OF CELP CODING Chapter IV THEORY OF CELP CODING CHAPTER IV THEORY OF CELP CODING 4.1 Introduction Wavefonn coders fail to produce high quality speech at bit rate lower than 16 kbps. Source coders, such as LPC vocoders,

More information

IEEE Systems, Man, and Cybernetics Society s Perspectives and Brain-Related Technical Activities

IEEE Systems, Man, and Cybernetics Society s Perspectives and Brain-Related Technical Activities IEEE, Man, and Cybernetics Society s Perspectives and Brain-Related Technical Activities Michael H. Smith IEEE Brain Initiative New York City Three Broad Categories that Span IEEE Development of: novel

More information

The Effects of Noise on Acoustic Parameters

The Effects of Noise on Acoustic Parameters The Effects of Noise on Acoustic Parameters * 1 Turgut Özseven and 2 Muharrem Düğenci 1 Turhal Vocational School, Gaziosmanpaşa University, Turkey * 2 Faculty of Engineering, Department of Industrial Engineering

More information

Learning to Unlearn and Relearn Speech Signal Processing using Neural Networks: current and future perspectives

Learning to Unlearn and Relearn Speech Signal Processing using Neural Networks: current and future perspectives Learning to Unlearn and Relearn Speech Signal Processing using Neural Networks: current and future perspectives Mathew Magimai Doss Collaborators: Vinayak Abrol, Selen Hande Kabil, Hannah Muckenhirn, Dimitri

More information

Speech/Non-speech detection Rule-based method using log energy and zero crossing rate

Speech/Non-speech detection Rule-based method using log energy and zero crossing rate Digital Speech Processing- Lecture 14A Algorithms for Speech Processing Speech Processing Algorithms Speech/Non-speech detection Rule-based method using log energy and zero crossing rate Single speech

More information

Biometric: EEG brainwaves

Biometric: EEG brainwaves Biometric: EEG brainwaves Jeovane Honório Alves 1 1 Department of Computer Science Federal University of Parana Curitiba December 5, 2016 Jeovane Honório Alves (UFPR) Biometric: EEG brainwaves Curitiba

More information

RECENTLY, there has been an increasing interest in noisy

RECENTLY, there has been an increasing interest in noisy IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 52, NO. 9, SEPTEMBER 2005 535 Warped Discrete Cosine Transform-Based Noisy Speech Enhancement Joon-Hyuk Chang, Member, IEEE Abstract In

More information

Analysis and Synthesis of Pathological Vowels

Analysis and Synthesis of Pathological Vowels Analysis and Synthesis of Pathological Vowels Prospectus Brian C. Gabelman 6/13/23 1 OVERVIEW OF PRESENTATION I. Background II. Analysis of pathological voices III. Synthesis of pathological voices IV.

More information

The Partly Preserved Natural Phases in the Concatenative Speech Synthesis Based on the Harmonic/Noise Approach

The Partly Preserved Natural Phases in the Concatenative Speech Synthesis Based on the Harmonic/Noise Approach The Partly Preserved Natural Phases in the Concatenative Speech Synthesis Based on the Harmonic/Noise Approach ZBYNĚ K TYCHTL Department of Cybernetics University of West Bohemia Univerzitní 8, 306 14

More information

Converting Speaking Voice into Singing Voice

Converting Speaking Voice into Singing Voice Converting Speaking Voice into Singing Voice 1 st place of the Synthesis of Singing Challenge 2007: Vocal Conversion from Speaking to Singing Voice using STRAIGHT by Takeshi Saitou et al. 1 STRAIGHT Speech

More information

LASER DIODE MODULATION AND NOISE

LASER DIODE MODULATION AND NOISE > 5' O ft I o Vi LASER DIODE MODULATION AND NOISE K. Petermann lnstitutfiir Hochfrequenztechnik, Technische Universitdt Berlin Kluwer Academic Publishers i Dordrecht / Boston / London KTK Scientific Publishers

More information

AN ANALYSIS OF ITERATIVE ALGORITHM FOR ESTIMATION OF HARMONICS-TO-NOISE RATIO IN SPEECH

AN ANALYSIS OF ITERATIVE ALGORITHM FOR ESTIMATION OF HARMONICS-TO-NOISE RATIO IN SPEECH AN ANALYSIS OF ITERATIVE ALGORITHM FOR ESTIMATION OF HARMONICS-TO-NOISE RATIO IN SPEECH A. Stráník, R. Čmejla Department of Circuit Theory, Faculty of Electrical Engineering, CTU in Prague Abstract Acoustic

More information

STRAIGHT, exploitation of the other aspect of VOCODER: Perceptually isomorphic decomposition of speech sounds

STRAIGHT, exploitation of the other aspect of VOCODER: Perceptually isomorphic decomposition of speech sounds INVITED REVIEW STRAIGHT, exploitation of the other aspect of VOCODER: Perceptually isomorphic decomposition of speech sounds Hideki Kawahara Faculty of Systems Engineering, Wakayama University, 930 Sakaedani,

More information