A Full-Band Adaptive Harmonic Representation of Speech
1 A Full-Band Adaptive Harmonic Representation of Speech
Gilles Degottex and Yannis Stylianou
University of Crete - FORTH - Swiss National Science Foundation
G. Degottex & Y. Stylianou (UOC/FORTH/SNSF) A Full-Band Adaptive HM of Speech September the 10th / 11
2 The Sinusoidal and Harmonic Models
[Figure: amplitude spectrum in dB, showing the DFT and the harmonics]
- Can fit any monophonic signal; we use it for speech
- The sinusoids can be harmonic, quasi-harmonic, or adaptive...
3 Time-Frequency Representations
DFT: constant frequency basis
$s(t) = \sum_{k=0}^{K} a_k e^{j\phi_k(t)}$, with $\phi_k(t) = k \frac{2\pi}{K} t$
FChT [1]: linear frequency basis
$s(t) = \sum_{k=0}^{K} a_k e^{j\phi_k(t)}$, with $\phi_k(t) = k \left(\frac{2\pi}{K} + \alpha t\right) t$
[Figures: frequency basis of each transform over time, time axis in seconds]
[1] M. Kepesi and L. Weruaga, "Adaptive Chirp-based time-frequency analysis of speech signals", Speech Communication.
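The difference between the two bases can be checked numerically. The following is a minimal Python sketch, not code from the slides: `basis_phase` and `instantaneous_frequency` are hypothetical helper names, and the numerical derivative of the phase is used only to illustrate that the first basis has constant frequency while the second has frequency varying linearly in time.

```python
import numpy as np

def basis_phase(k, t, K, alpha=0.0):
    """Phase phi_k(t) = k * (2*pi/K + alpha*t) * t.

    alpha = 0 gives the constant-frequency (DFT-like) basis;
    alpha != 0 gives the linear-frequency (chirp) basis.
    """
    return k * (2.0 * np.pi / K + alpha * t) * t

def instantaneous_frequency(phase, t):
    """Numerical derivative of the phase with respect to time (rad per time unit)."""
    return np.gradient(phase, t)

t = np.linspace(0.0, 1.0, 101)
w_const = instantaneous_frequency(basis_phase(3, t, K=64), t)
w_chirp = instantaneous_frequency(basis_phase(3, t, K=64, alpha=0.5), t)
print(w_const[:3], w_chirp[:3])
```

With `alpha = 0` the instantaneous frequency stays at `k * 2*pi/K`; with a nonzero `alpha` it grows as `k * (2*pi/K + 2*alpha*t)`, so each basis function is a linear chirp whose slope scales with `k`.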
4 The Adaptive Quasi-Harmonic + Noise Model (aQHNM)
We can adapt the frequency basis to follow the frequency tracks:
- Adaptive Quasi-Harmonic Model (aQHM) [1]
  $\phi_k(t) = \frac{2\pi}{f_s} \int_0^t f_k(\tau)\,d\tau$
For speech representation in the high frequencies:
- Amplitude-modulated noise (aQHNM) [2]
[1] Y. Pantazis, O. Rosec and Y. Stylianou, "Adaptive AM-FM Signal Decomposition With Application to Speech Analysis", IEEE Trans. on Audio, Speech, and Language Processing.
[2] Y. Pantazis, G. Tzedakis, O. Rosec and Y. Stylianou, "Analysis/Synthesis of Speech based on an Adaptive Quasi-Harmonic plus Noise Model", ICASSP.
6 The new ideas
1) From FChT, harmonics exist in high frequencies
   Use a full-band representation
2) Quasi-harmonicity can be useful for analysis but may not be necessary for encoding/decoding
   Use strict harmonicity and keep the adaptivity (from aQHNM to aHM)
7 The Adaptive Harmonic Model (aHM)
aHM:
$s(t) = \sum_{k=-K}^{K} a_k(t)\, e^{j\phi_k(t)}$, with $\phi_k(t) = k \frac{2\pi}{f_s} \int_0^t f_0(\tau)\,d\tau$
- $a_k(t)$: amplitude and phase (complex-valued function), interpolated from $a_k^{t_i}$ at time $t_i$
- $f_0(t)$: fundamental frequency, interpolated from $f_0^{t_i}$ at time $t_i$
Parameters at a time $t_i$: $\{f_0^{t_i}, a_k^{t_i}\}$, $k \in \{0, \ldots, K_i\}$
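To make the model concrete, here is a minimal resynthesis sketch in Python. It is not the authors' implementation. The function name `ahm_synthesize` is hypothetical, linear interpolation of the parameters between anchor times is an assumption (the slide only says "interpolated"), the integral is approximated by a cumulative sum over samples, the DC term is omitted, and the signal is assumed real so that the sum over negative `k` is the complex conjugate of the sum over positive `k`.

```python
import numpy as np

def ahm_synthesize(anchor_times, f0_anchors, amp_anchors, fs, duration):
    """Resynthesize a real signal from aHM-style parameters.

    anchor_times: (I,) increasing analysis instants t_i in seconds
    f0_anchors:   (I,) fundamental frequency at each t_i in Hz
    amp_anchors:  (I, K) complex amplitudes of harmonics 1..K at each t_i
    """
    n = int(round(duration * fs))
    t = np.arange(n) / fs
    # Assumption: linear interpolation of f0 between anchors.
    f0 = np.interp(t, anchor_times, f0_anchors)
    # Discrete version of (2*pi/fs) * integral of f0, starting at phase 0.
    phi0 = 2.0 * np.pi / fs * np.concatenate(([0.0], np.cumsum(f0[:-1])))
    s = np.zeros(n)
    for k in range(1, amp_anchors.shape[1] + 1):
        # Assumption: real and imaginary parts are interpolated linearly.
        a_k = (np.interp(t, anchor_times, amp_anchors[:, k - 1].real)
               + 1j * np.interp(t, anchor_times, amp_anchors[:, k - 1].imag))
        # Silence a harmonic wherever it would reach or exceed Nyquist.
        a_k = np.where(k * f0 < fs / 2.0, a_k, 0.0)
        # Harmonics k and -k together give twice the real part.
        s += 2.0 * np.real(a_k * np.exp(1j * k * phi0))
    return t, s

t, s = ahm_synthesize(np.array([0.0, 0.1]), np.array([100.0, 100.0]),
                      np.array([[0.5 + 0j], [0.5 + 0j]]), fs=8000, duration=0.1)
```

In the usage line, a single harmonic with constant amplitude 0.5 and a constant 100 Hz fundamental reduces to a plain 100 Hz cosine, which is a quick sanity check on the phase integration.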
8 The problem of estimation for full-band representation
A small $f_0$ error propagates by multiplication: $f_k = k f_0$
[Figure: amplitude spectrum in dB]
Question: how to estimate harmonics up to Nyquist?
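A short numerical illustration of this propagation, as a sketch only: the function names are hypothetical and the values (100 Hz fundamental, 16 kHz sampling rate, 1 Hz error) are illustrative assumptions, not figures from the slides.

```python
import math

def harmonic_error_hz(f0_error_hz, k):
    """Frequency error of harmonic k when f0 is off by f0_error_hz (f_k = k * f0)."""
    return k * f0_error_hz

def highest_harmonic_below_nyquist(f0_hz, fs_hz):
    """Largest integer k such that k * f0 is strictly below fs / 2."""
    return math.ceil(fs_hz / (2.0 * f0_hz)) - 1

# Illustrative values, not taken from the slides.
f0, fs, f0_error = 100.0, 16000.0, 1.0
k_max = highest_harmonic_below_nyquist(f0, fs)
print(k_max, harmonic_error_hz(f0_error, k_max))
```

With these values the highest harmonic is k = 79, and a 1 Hz error on the fundamental becomes a 79 Hz error there, which is close to the 100 Hz spacing between harmonics. This is why an estimate that is good enough for the low harmonics can miss the harmonic structure near Nyquist.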
9-16 The Adaptive Iterative Refinement (AIR)
[Figure on each slide: amplitude spectrum in dB]
1. Assume first that the $f_0$ error is small for the low harmonics.
2. Then the frequency correction mechanism of QHM [1] can be used.
3. We can therefore increase the harmonic level.
4. Correct the frequencies.
5. Increase the harmonic level and correct the frequencies again, alternating these two steps (slides 13 to 16).
[1] Y. Pantazis, O. Rosec and Y. Stylianou, "Iterative Estimation of Sinusoidal Signal Parameters", IEEE Signal Processing Letters.
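The alternation between "correct the frequencies" and "increase the harmonic level" can be sketched on a single stationary frame. This is a simplified illustration, not the AIR algorithm as published. The function names, the Hann window, the doubling of the harmonic level, the half-harmonic guard below Nyquist, and the power-weighted averaging of the per-harmonic corrections into one $f_0$ update are all assumptions made for this example. The correction itself follows the QHM idea of fitting $(a_k + t\,b_k)e^{j 2\pi k f_0 t}$ by least squares and reading a frequency mismatch from $a_k$ and $b_k$.

```python
import numpy as np

def qhm_f0_correction(x, t, window, f0, K):
    """One QHM-style least-squares fit on harmonics -K..K of f0.

    Returns an f0 correction in Hz (power-weighted mean of eta_k / k,
    an aggregation rule assumed for this sketch).
    """
    k = np.arange(-K, K + 1)
    E = np.exp(2j * np.pi * np.outer(t, k * f0))           # (N, 2K+1)
    A = np.hstack([E, t[:, None] * E]) * window[:, None]   # columns for a_k and b_k
    coef, *_ = np.linalg.lstsq(A, (window * x).astype(complex), rcond=None)
    a, b = coef[:2 * K + 1], coef[2 * K + 1:]
    pos = k > 0
    a, b, kp = a[pos], b[pos], k[pos]
    power = np.abs(a) ** 2
    # Frequency mismatch of each component, in Hz.
    eta = (a.real * b.imag - a.imag * b.real) / (2.0 * np.pi * (power + 1e-12))
    return np.sum(power * eta / kp) / (np.sum(power) + 1e-12)

def air_refine_f0(x, fs, f0_init, K_start=3, n_corrections=2):
    """Alternate f0 correction and harmonic-level increase on one frame."""
    N = len(x)
    t = (np.arange(N) - (N - 1) / 2.0) / fs   # time axis centred on the frame
    window = np.hanning(N)
    f0, K = float(f0_init), K_start
    while True:
        for _ in range(n_corrections):        # correct the frequencies
            f0 += qhm_f0_correction(x, t, window, f0, K)
        # Assumed guard: keep harmonics at least f0/2 below Nyquist.
        K_max = int(np.floor(fs / (2.0 * f0) - 0.5))
        if K >= K_max:
            return f0
        K = min(2 * K, K_max)                 # increase the harmonic level
```

A typical check is to build a 30 ms frame of a harmonic signal with a known fundamental, start from an estimate a few hertz off, and compare the returned value with the true one. Starting with few harmonics keeps the mismatch $k\,\Delta f_0$ small enough for the correction to be reliable before higher harmonics are added.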
17 Evaluation: Listening test
[Figure: impairment ratings (total, male voices, female voices) for Original, aHM-AIR, aQHNM and SM]
- 6 languages to represent voice variability
- Female and male voices for each language
- 12 sounds
- 20 listeners answered
Conclusions
- Perceived quality of aHM-AIR is almost perfect
- Compared to SM: stable frequency tracks in aHM
- Compared to aQHNM: no noise model in aHM, also more stable
22 Conclusions
Points to remember
- Adaptive Harmonic Model (aHM)
  - Frequency tracks adapted to the $f_0$ curve
  - Simple harmonicity
- Dedicated algorithm, Adaptive Iterative Refinement (AIR), to localize the harmonic structures in the high frequencies
- Quasi-perfect perceived quality according to a listening test
- Fewer parameters than aQHNM and SM
Future work
- Forthcoming paper with more evaluations, parameter accuracy, etc.
- The good resynthesis quality is promising before starting to build higher-level models (e.g. spectral envelopes)
More informationTHE HUMANISATION OF STOCHASTIC PROCESSES FOR THE MODELLING OF F0 DRIFT IN SINGING
THE HUMANISATION OF STOCHASTIC PROCESSES FOR THE MODELLING OF F0 DRIFT IN SINGING Ryan Stables [1], Dr. Jamie Bullock [2], Dr. Cham Athwal [3] [1] Institute of Digital Experience, Birmingham City University,
More informationFriedrich-Alexander Universität Erlangen-Nürnberg. Lab Course. Pitch Estimation. International Audio Laboratories Erlangen. Prof. Dr.-Ing.
Friedrich-Alexander-Universität Erlangen-Nürnberg Lab Course Pitch Estimation International Audio Laboratories Erlangen Prof. Dr.-Ing. Bernd Edler Friedrich-Alexander Universität Erlangen-Nürnberg International
More informationAdvanced audio analysis. Martin Gasser
Advanced audio analysis Martin Gasser Motivation Which methods are common in MIR research? How can we parameterize audio signals? Interesting dimensions of audio: Spectral/ time/melody structure, high
More informationNOISE ESTIMATION IN A SINGLE CHANNEL
SPEECH ENHANCEMENT FOR CROSS-TALK INTERFERENCE by Levent M. Arslan and John H.L. Hansen Robust Speech Processing Laboratory Department of Electrical Engineering Box 99 Duke University Durham, North Carolina
More informationThe psychoacoustics of reverberation
The psychoacoustics of reverberation Steven van de Par Steven.van.de.Par@uni-oldenburg.de July 19, 2016 Thanks to Julian Grosse and Andreas Häußler 2016 AES International Conference on Sound Field Control
More informationPerception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 4: 7 Feb A. Faulkner.
Perception of pitch BSc Audiology/MSc SHS Psychoacoustics wk 4: 7 Feb 2008. A. Faulkner. See Moore, BCJ Introduction to the Psychology of Hearing, Chapter 5. Or Plack CJ The Sense of Hearing Lawrence Erlbaum,
More informationECE 556 BASICS OF DIGITAL SPEECH PROCESSING. Assıst.Prof.Dr. Selma ÖZAYDIN Spring Term-2017 Lecture 2
ECE 556 BASICS OF DIGITAL SPEECH PROCESSING Assıst.Prof.Dr. Selma ÖZAYDIN Spring Term-2017 Lecture 2 Analog Sound to Digital Sound Characteristics of Sound Amplitude Wavelength (w) Frequency ( ) Timbre
More informationTwo-Dimensional Wavelets with Complementary Filter Banks
Tendências em Matemática Aplicada e Computacional, 1, No. 1 (2000), 1-8. Sociedade Brasileira de Matemática Aplicada e Computacional. Two-Dimensional Wavelets with Complementary Filter Banks M.G. ALMEIDA
More informationCOMBINING ADVANCED SINUSOIDAL AND WAVEFORM MATCHING MODELS FOR PARAMETRIC AUDIO/SPEECH CODING
17th European Signal Processing Conference (EUSIPCO 29) Glasgow, Scotland, August 24-28, 29 COMBINING ADVANCED SINUSOIDAL AND WAVEFORM MATCHING MODELS FOR PARAMETRIC AUDIO/SPEECH CODING Alexey Petrovsky
More informationSignal Characterization in terms of Sinusoidal and Non-Sinusoidal Components
Signal Characterization in terms of Sinusoidal and Non-Sinusoidal Components Geoffroy Peeters, avier Rodet To cite this version: Geoffroy Peeters, avier Rodet. Signal Characterization in terms of Sinusoidal
More informationCS 591 S1 Midterm Exam
Name: CS 591 S1 Midterm Exam Spring 2017 You must complete 3 of problems 1 4, and then problem 5 is mandatory. Each problem is worth 25 points. Please leave blank, or draw an X through, or write Do Not
More information