A Full-Band Adaptive Harmonic Representation of Speech


1 A Full-Band Adaptive Harmonic Representation of Speech
Gilles Degottex and Yannis Stylianou, {degottex,yannis}@csd.uoc.
University of Crete - FORTH - Swiss National Science Foundation
G. Degottex & Y. Stylianou (UOC/FORTH/SNSF) A Full-Band Adaptive HM of Speech September the 10th / 11

2 The Sinusoidal and Harmonic Models
[Figure: DFT amplitude spectrum (dB) with the harmonics marked]
Can fit any monophonic signal; we use it for speech.
The sinusoids can be harmonic, quasi-harmonic, or adaptive...

3 Time-Frequency Representations
DFT (constant frequency basis):
$s(t) = \sum_{k=0}^{K} a_k e^{j\phi_k(t)}, \quad \phi_k(t) = k \, \frac{2\pi}{K} \, t$
FChT [1] (linear frequency basis):
$s(t) = \sum_{k=0}^{K} a_k e^{j\phi_k(t)}, \quad \phi_k(t) = k \left( \frac{2\pi}{K} + \alpha t \right) t$
[Figure: the two frequency bases plotted against time (s)]
[1] M. Kepesi and L. Weruaga, Adaptive Chirp-based time-frequency analysis of speech signals, Speech Communication.
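The difference between the two bases can be sketched numerically. The snippet below is only an illustration of the phase functions written on the slide, not the FChT algorithm of Kepesi and Weruaga; the function names `chirp_basis` and `project`, the choice of `t = 0..K-1` in samples and of `k = 0..K-1` are assumptions made for the example.

```python
import numpy as np

def chirp_basis(K, alpha=0.0):
    """Columns are exp(j*phi_k(t)) with phi_k(t) = k*(2*pi/K + alpha*t)*t.

    alpha = 0 gives the constant-frequency (DFT) basis; alpha != 0 turns
    every basis function into a linear chirp whose slope scales with k.
    Assumption: t = 0..K-1 (samples) and k = 0..K-1.
    """
    t = np.arange(K)[:, None]   # time index, one row per sample
    k = np.arange(K)[None, :]   # basis index, one column per function
    return np.exp(1j * k * (2 * np.pi / K + alpha * t) * t)

def project(signal, alpha=0.0):
    """Correlate the signal with every basis function."""
    B = chirp_basis(len(signal), alpha)
    return B.conj().T @ signal

# With alpha = 0 the projection coincides with the ordinary DFT.
rng = np.random.default_rng(0)
x = rng.standard_normal(64)
print(np.allclose(project(x, alpha=0.0), np.fft.fft(x)))
```

With a non-zero `alpha` the same projection follows frequency tracks that rise or fall linearly over the frame, which is the property the slide contrasts with the constant-frequency DFT basis.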

4 The Adaptive Quasi-Harmonic + Noise Model (aQHNM)
We can adapt the frequency basis to follow the frequency tracks: Adaptive Quasi-Harmonic Model (aQHM) [1]
$\phi_k(t) = \frac{2\pi}{f_s} \int_0^t f_k(\tau) \, d\tau$
For speech representation in the high frequencies: amplitude-modulated noise (aQHNM) [2]
[1] Y. Pantazis, O. Rosec and Y. Stylianou, Adaptive AM-FM Signal Decomposition With Application to Speech Analysis, IEEE Trans. on Audio, Speech, and Language Processing.
[2] Y. Pantazis, G. Tzedakis, O. Rosec, Y. Stylianou, Analysis/Synthesis of Speech based on an Adaptive Quasi-Harmonic plus Noise Model, ICASSP.


6 The new ideas
1) From FChT, harmonics exist in high frequencies. Use a full-band representation.
2) Quasi-harmonicity can be useful for analysis but maybe not necessary for encoding/decoding. Use the strict harmonicity and keep the adaptivity: aQHNM becomes aHM.

7 The Adaptive Harmonic Model (aHM)
$s(t) = \sum_{k=-K}^{K} a_k(t) \, e^{j\phi_k(t)}, \quad \phi_k(t) = k \, \frac{2\pi}{f_s} \int_0^t f_0(\tau) \, d\tau$
$a_k(t)$: amplitude and phase (complex-valued function), interpolated from $a_k^{t_i}$ at time $t_i$
$f_0(t)$: fundamental frequency, interpolated from $f_0^{t_i}$ at time $t_i$
Parameters at a time $t_i$: $\{f_0^{t_i}, a_k^{t_i}\}$, $k \in \{0, \dots, K_i\}$
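A minimal synthesis sketch of the model above, under stated assumptions: the function name `ahm_synthesize` is hypothetical, the interpolation is plain linear interpolation (the slide does not say which interpolator is used), the integral is approximated by a cumulative sum, the DC term is left out, and the signal is assumed real so that the negative-index terms are the conjugates of the positive ones.

```python
import numpy as np

def ahm_synthesize(t_anchor, f0_anchor, a_anchor, fs, n_samples):
    """Resynthesize a real signal from anchor-time aHM-style parameters.

    t_anchor  : anchor times t_i in seconds, increasing
    f0_anchor : f0 in Hz at each anchor
    a_anchor  : complex array (n_anchors, K), column k-1 holds harmonic k
    """
    t = np.arange(n_samples) / fs
    f0 = np.interp(t, t_anchor, f0_anchor)            # f0(t) between anchors
    # phi_1[n] ~ (2*pi/fs) * sum of f0 up to n, starting from zero phase
    phi1 = 2 * np.pi / fs * (np.cumsum(f0) - f0[0])
    a_anchor = np.asarray(a_anchor, dtype=complex)
    s = np.zeros(n_samples)
    for k in range(1, a_anchor.shape[1] + 1):
        col = a_anchor[:, k - 1]
        # interpolate real and imaginary parts separately (an assumption)
        a_k = (np.interp(t, t_anchor, col.real)
               + 1j * np.interp(t, t_anchor, col.imag))
        a_k = np.where(k * f0 < fs / 2, a_k, 0.0)     # mute harmonics above Nyquist
        # k and -k terms together give 2*Re(a_k * exp(j*k*phi1))
        s += 2 * np.real(a_k * np.exp(1j * k * phi1))
    return s

# One harmonic with constant f0 = 100 Hz and a_1 = 0.5 gives a unit cosine.
fs = 8000.0
s = ahm_synthesize([0.0, 1.0], [100.0, 100.0], [[0.5], [0.5]], fs, 800)
print(np.allclose(s, np.cos(2 * np.pi * 100.0 * np.arange(800) / fs)))
```

Because every harmonic phase is `k` times the same integrated f0 curve, the frequency tracks stay strictly harmonic while still following f0 over time.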

8 The problem of estimation for full-band representation
A small $f_0$ error propagates by multiplication: $f_k = k f_0$
[Figure: amplitude spectrum (dB)]
Question: how to estimate harmonics up to Nyquist?
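The multiplication effect is easy to quantify. The sketch below uses illustrative numbers that are not from the slides (a true f0 of 100 Hz, an estimate of 101 Hz, and a 16 kHz sampling rate); `harmonic_errors` is a hypothetical helper name.

```python
import numpy as np

def harmonic_errors(f0_true, f0_est, fs):
    """Frequency error of each harmonic k*f0_est below Nyquist, in Hz."""
    k_max = int((fs / 2) // f0_est)       # highest harmonic below Nyquist
    k = np.arange(1, k_max + 1)
    return k, k * (f0_est - f0_true)      # the error grows linearly with k

k, err = harmonic_errors(f0_true=100.0, f0_est=101.0, fs=16000.0)
print(k[-1], err[0], err[-1])
```

In this example a 1 Hz error on f0 becomes a 79 Hz error on the highest harmonic, which is most of the 100 Hz spacing between neighbouring harmonics.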

9 The Adaptive Iterative Refinement (AIR)
Assume first the $f_0$ error is small for low harmonics.

10 The Adaptive Iterative Refinement (AIR)
Then the frequency correction mechanism of QHM [1] can be used.
[1] Y. Pantazis, O. Rosec and Y. Stylianou, Iterative Estimation of Sinusoidal Signal Parameters, IEEE Signal Processing Letters.

11 The Adaptive Iterative Refinement (AIR)
We can therefore increase the harmonic level.

12-16 The Adaptive Iterative Refinement (AIR)
The two steps then alternate as the harmonic level rises: correct the frequencies, increase the harmonic level, correct the frequencies, increase the harmonic level, correct the frequencies.
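The alternation of the two steps can be sketched on a single stationary frame. This is a simplified illustration, not the authors' implementation: the full method works on time-varying parameters, whereas here f0 is constant over the frame. The correction term follows the QHM least-squares formulation of Pantazis et al.; the doubling schedule for the harmonic level, the power-weighted average used to turn per-harmonic corrections into one f0 update, and the names `qhm_f0_correction` and `air_refine_f0` are assumptions made for the example.

```python
import numpy as np

def qhm_f0_correction(frame, t, f0, n_harm):
    """One QHM-style least-squares step on harmonics 1..n_harm.

    Fits sum_k (a_k + t*b_k) * exp(j*2*pi*k*f0*t) under a Hann window and
    returns an f0 update in Hz derived from the frequency mismatch terms.
    """
    k = np.arange(1, n_harm + 1)
    k = np.concatenate((-k[::-1], k))                # real signal: -k and +k
    E = np.exp(2j * np.pi * np.outer(t, k * f0))     # shape (N, 2*n_harm)
    w = np.hanning(len(t))[:, None]
    A = np.hstack((E, t[:, None] * E)) * w
    y = (w[:, 0] * frame).astype(complex)
    x, *_ = np.linalg.lstsq(A, y, rcond=None)
    a, b = x[:2 * n_harm], x[2 * n_harm:]
    power = np.abs(a) ** 2 + 1e-20                   # guard against division by zero
    eta = (a.real * b.imag - a.imag * b.real) / (2 * np.pi * power)
    # eta_k is roughly k times the f0 error; average eta_k / k, weighted by power
    return np.sum(power * eta / k) / np.sum(power)

def air_refine_f0(frame, fs, f0_init, k_max, n_corrections=3):
    """Alternate 'correct the frequencies' and 'increase the harmonic level'."""
    n = len(frame)
    t = (np.arange(n) - (n - 1) / 2) / fs            # time axis centred on the frame
    f0, n_harm = f0_init, 1
    while True:
        for _ in range(n_corrections):               # correct the frequencies
            f0 += qhm_f0_correction(frame, t, f0, n_harm)
        if n_harm == k_max:
            return f0
        n_harm = min(2 * n_harm, k_max)              # increase the harmonic level

# Synthetic test frame: 10 harmonics of 200 Hz, initial f0 guess off by 3 Hz.
fs, f0_true, n = 8000.0, 200.0, 201
t = (np.arange(n) - (n - 1) / 2) / fs
frame = sum(np.cos(2 * np.pi * h * f0_true * t + 0.3 * h) / h for h in range(1, 11))
print(air_refine_f0(frame, fs, f0_init=203.0, k_max=10))
```

Starting from a few low harmonics keeps every frequency mismatch small enough for the correction to work; each corrected f0 then makes the next, higher set of harmonics close enough to be corrected in turn.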

17 Evaluation: Listening test
[Figure: impairment scores for Original, aHM-AIR, aQHNM and SM; panels: total, male voices, female voices]
6 languages to represent voice variability
Female and male voices for each language: 12 sounds
20 listeners answered
Conclusions
+ Perceived quality of aHM-AIR is almost perfect
Compared to SM: stable frequency tracks in aHM
Compared to aQHNM: no noise model in aHM, also more stable


22 Conclusions
Points to remember
Adaptive Harmonic Model (aHM): frequency tracks adapted to the $f_0$ curve; simple harmonicity
Dedicated algorithm, Adaptive Iterative Refinement (AIR), to localize the harmonic structures in the high frequencies
Quasi-perfect perceived quality according to a listening test
Fewer parameters than aQHNM and SM
Future work
Forthcoming paper with more evaluations, parameter accuracy, etc.
The good resynthesis quality is promising before starting to build higher-level models (e.g. spectral envelopes)

