Sound Modeling from the Analysis of Real Sounds

Sølvi Ystad, Philippe Guillemain, Richard Kronland-Martinet
CNRS, Laboratoire de Mécanique et d'Acoustique
31, Chemin Joseph Aiguier, 13402 Marseille cedex 20, France
{ystad, guillem, kronland}@alphalma.cnrs-mrs.fr

Abstract

This work addresses sound modeling using a combination of physical and signal models, with particular emphasis on the flute sound. For that purpose, analysis methods adapted to the non-stationary nature of sounds are developed, and parameters characterizing the sound from both a perceptive and a physical point of view are extracted. The synthesis process is then designed to reproduce a perceptive effect and to simulate the physical behavior of the sound generating system. The correspondence between analysis and synthesis parameters is crucial and can be achieved using both mathematical and perceptive criteria. Real-time control of such models makes it possible to use specially designed interfaces mirroring existing sound generators such as traditional musical instruments.

1 Introduction

The concept of sound modeling is introduced, followed by a brief discussion of analysis techniques taking into account the dynamic and the spectral behavior of sounds. Physical models simulating wave propagation in one-dimensional bounded media (strings and tubes) are used to simulate transient sounds. A so-called waveguide model, consisting of a delay line and a loop filter, is used for this purpose. The loop filter takes into account the dispersive and dissipative effects due to the medium in which the waves propagate. A description of the construction of a filter taking these important phenomena into account is given. For sustained sounds the source must be modeled separately. For this purpose, the source and the resonator have been separated by deconvolution.
By means of an adaptive filtering method, the LMS algorithm, the source signal is decomposed into two contributions: a deterministic component and a stochastic component. The modeling of the deterministic part, whose behavior generally is nonlinear, calls for global synthesis methods such as waveshaping and, in the flute case, perceptive criteria such as the tristimulus criterion. The stochastic component of the flute source is modeled by taking into account the probability density function and the power spectral density of the process. An example of real-time control of a flute model is presented: a flute equipped with sensors is used as an interface to control the proposed model. Possibilities of intimate sound manipulations obtained by acting on the parameters of the model are discussed.

2 The concept of sound modeling

Sound modeling consists in constructing synthesis models allowing resynthesis and transformation of natural sounds. For that purpose, we have to design analysis methods which give representations of real sounds. Parameters can then be extracted from these representations to feed the synthesis models.

2.1 Analysis

Since sounds generally are non-stationary, and since the evolution of a sound as a function of time is important from a perceptive point of view, time-frequency representations should be used. For that purpose we used linear representations such as the Gabor and wavelet transforms, which consist in decomposing the signal into elementary functions. These analysis methods give time-frequency representations which can be divided into two parts: one corresponding to the modulus of the representation and the other to its phase [5]. In order to extract parameters from this representation, dedicated methods such as the spectral line estimation method [3] or a matched analysis method [11] should be used. One can then associate one amplitude and one frequency modulation law with each spectral component of the sound.
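As an illustration, the extraction of one amplitude and one frequency law per spectral component can be sketched with a short-time Fourier (Gabor) transform. This is a simplified stand-in for the spectral line estimation method of [3], not the authors' implementation; the frequency law here only has bin accuracy, and all parameter values are illustrative:

```python
import numpy as np
from scipy.signal import stft

def partial_laws(x, fs, f0, n_partials=4, nperseg=2048):
    """Associate one amplitude and one frequency law with each spectral
    component, from the modulus of a Gabor (short-time Fourier) transform.
    Simplified stand-in for the spectral line estimation method of [3]."""
    f, t, Z = stft(x, fs=fs, nperseg=nperseg, noverlap=nperseg - 256)
    laws = []
    for k in range(1, n_partials + 1):
        bin_k = int(np.argmin(np.abs(f - k * f0)))   # bin nearest harmonic k
        band = Z[bin_k - 2:bin_k + 3, :]             # small band around it
        peak = np.argmax(np.abs(band), axis=0)       # per-frame peak bin
        amp = np.abs(band[peak, np.arange(band.shape[1])])
        freq = f[bin_k - 2 + peak]                   # coarse (bin-accurate) law
        laws.append((t, amp, freq))
    return laws

# usage: analyse a synthetic two-partial test tone
fs = 16000
time = np.arange(fs) / fs
sig = np.sin(2 * np.pi * 440 * time) + 0.5 * np.sin(2 * np.pi * 880 * time)
laws = partial_laws(sig, fs, 440.0, n_partials=2)
```

A refined estimator would interpolate between bins or use phase differences to obtain sub-bin frequency accuracy.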
2.2 Synthesis

Synthesis models can be divided into two main groups: signal models and physical models. Signal models consist in reproducing a perceptive effect using a mathematical description, while physical models consist in giving a physical description of the sound generating system. In this paper a combination of a signal model and a physical model has been used to model sustained sounds.

2.2.1 Modeling transient sounds

Physical models aim at simulating the behavior of existing or virtual sound sources. We have chosen the so-called waveguide synthesis models [8], which have the advantage of being easy to construct, with a behavior close to that of a real sound generator. We here give a brief description of the method applied to the tube case, based on the solution of the wave equation [11]. In order to build a model which takes into account the theoretical phenomena, the propagative model shown in Figure 1 was constructed. This model is a generalization of the so-called waveguide model first proposed by Karplus and Strong [4].

Figure 1. Waveguide model: input → delay line (2D samples) → filter (dispersion, attenuation and boundary conditions) → output, closed in a feedback loop.

The delay line in the model corresponds to the time the waves need to propagate back and forth in the resonator. The filter takes into account dispersion, attenuation and boundary conditions in the medium. Although algorithms already exist for constructing filters that account for dispersion and dissipation effects [10], they are based on approximations and are not precise enough for resynthesis purposes, where we want to reconstruct a sound resembling the original as closely as possible. This is why we designed a new filter construction method which consists in comparing the waveguide model with the theoretical response. The method is based on the minimization of the energy difference, in a neighborhood of the resonant peaks, between the response of the model and the response of the real system [11]. The corresponding impulse response is then constructed in a way similar to the inversion of a time-frequency representation.
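The waveguide structure of Figure 1 can be sketched in a few lines. The attenuated two-point average used below as the loop filter is only a crude stand-in for the dissipation/dispersion filter whose construction the paper describes; the delay and loss values are illustrative:

```python
import numpy as np

def waveguide(excitation, delay, loss=0.996, out_len=16000):
    """Delay line closed by a loop filter (Figure 1). The attenuated
    two-point average stands in for the dissipation/dispersion filter;
    it damps high frequencies faster than low ones."""
    y = np.zeros(out_len)
    y[:len(excitation)] = excitation             # inject the input
    for n in range(delay + 1, out_len):
        # loop filter applied to the wave one round trip (delay samples) ago
        y[n] += loss * 0.5 * (y[n - delay] + y[n - delay - 1])
    return y

# usage: a noise burst through a 100-sample round trip (~160 Hz at 16 kHz)
rng = np.random.default_rng(0)
tone = waveguide(rng.standard_normal(100), delay=100)
```

Because the loop gain stays below one at every frequency, the circulating wave decays, giving the characteristic plucked-string/impulsed-tube transient.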
The construction consists in summing up elementary functions adjusted and positioned in the time-frequency plane along a curve representing the group delay. This method makes it possible to construct filters which have the exact values of dissipation and dispersion at the resonance peaks, which gives good results for resynthesis.

2.2.2 Modeling sustained sounds

The model for transient sounds is not sufficient when we want to simulate sustained sounds like wind instrument sounds. We here give an example of how to model the source of a flute. Although several attempts have been made to propose physical models of such a source, this is still an open problem. Since our aim was to simulate its perceptive effects, we decided to use a signal model for this purpose. In order to extract the source from the rest of the signal, we assumed that the source and the resonator could be separated. Although this is not correct from a physical point of view, we shall see that it is a good assumption in this case, since the aim is to reconstruct a perceptive effect. The source signal was extracted by deconvolution and then divided into a deterministic and a stochastic part, which were modeled separately.

2.2.2.1 Source identification

As seen in the previous section, a physical model corresponding to the resonator of the instrument can be constructed from a filter and a delay line. The transfer function of the resonant system is of the type

H(ω) = 1 / (1 − F(ω) e^{−iωd})

and corresponds to an all-pole filter. This means that its inverse exists, and that the deconvolution between the real sound and the resonator is therefore a legal mathematical operation. The source signal x(t) is given by

x(t) = (y ∗ h⁻¹)(t)

Figure 2 shows the spectrum of a flute sound obtained this way.

Figure 2. Spectrum of the deconvoluted signal.

As can be seen, the source mainly consists of two contributions: a stochastic part corresponding to the noise in the signal, and a deterministic part which is a sum of spectral components.
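Because H is all-pole, its inverse is simply the FIR filter 1 − F(ω)e^{−iωd}, so the deconvolution can be sketched directly in the time domain. The one-tap loop filter below is hypothetical, not the filter constructed in Section 2.2.1:

```python
import numpy as np

def deconvolve_source(y, loop_fir, delay):
    """Recover the source x from the sound y through the FIR inverse of
    the all-pole resonator: x[n] = y[n] - sum_k f[k] * y[n - delay - k]."""
    x = y.copy()
    fb = np.convolve(y, loop_fir)[:len(y)]   # (f * y)(t)
    x[delay:] -= fb[:len(y) - delay]         # subtract the delayed feedback
    return x

# usage: build y from a known source with a hypothetical one-tap loop
# filter, then check that inverse filtering returns the source exactly
rng = np.random.default_rng(1)
src = rng.standard_normal(2000)
delay, f = 50, np.array([0.9])
y = np.zeros(2000)
for n in range(2000):                        # y[n] = src[n] + f[0]*y[n-delay]
    y[n] = src[n] + (f[0] * y[n - delay] if n >= delay else 0.0)
recovered = deconvolve_source(y, f, delay)
```

With the true loop filter and delay, the inversion is exact, which is why the separation works despite being physically questionable.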
To model the source signal we therefore decided to split it into a deterministic and a stochastic part and to model them independently. In order to separate the two contributions, we used the LMS (Least Mean Square) algorithm, which consists in using an adaptive filter whose aim is to remove from an input signal all the components correlated with a reference signal [9].

2.2.2.2 Modeling the deterministic part of the source

The evolution of the source spectra for different dynamic levels shows that the spectra do not evolve in a linear way, since an increase in the dynamic level does not correspond to a global amplification of the spectrum. This non-linear behavior of the source is common to a lot of musical instruments, and it is of great importance from a perceptive point of view, since it means that the timbre of the flute sound changes when the dynamic level changes. To model this non-linear behavior we used the waveshaping synthesis method developed by Arfib and Le Brun [1][6], which consists in constructing a signal by a non-linear function g whose argument is a monochromatic signal with amplitude I(t), called the index of distortion. This index has an important influence on the spectrum of the signal: the waveshaping synthesis method makes it possible to generate a wanted spectrum for a given index. The great challenge is then to find out how the waveshaping index should vary in order to obtain an evolution of the synthetic spectrum which corresponds to the evolution of the real spectrum for different dynamic levels. Since this spectral evolution cannot be modeled mathematically by waveshaping synthesis alone, we used perceptive criteria to find the variation range of the index. The best-known perceptive criterion is probably the spectral centroid criterion proposed by Beauchamp [2], which is directly related to the brightness of the sound. This method is satisfying when all the spectral components change with respect to the dynamic level.
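The LMS separation introduced above can be sketched as an adaptive line enhancer, in which the reference signal is a delayed copy of the input: periodic (deterministic) components survive the delay and are predicted, while the prediction error forms the stochastic part. The choice of reference and the parameter values are assumptions, since the paper does not detail its configuration:

```python
import numpy as np

def lms_separate(x, order=32, delay=16, mu=0.01):
    """Adaptive line enhancer built on the LMS update: the filter predicts
    x[n] from a delayed copy of x. Periodic content stays correlated
    across the delay and is predicted; the error is the stochastic part."""
    w = np.zeros(order)
    det = np.zeros(len(x))
    sto = np.zeros(len(x))
    for n in range(delay + order, len(x)):
        u = x[n - delay:n - delay - order:-1]   # delayed reference vector
        det[n] = w @ u                          # predicted deterministic part
        e = x[n] - det[n]
        sto[n] = e                              # stochastic residual
        w += 2 * mu * e * u                     # LMS weight update [9]
    return det, sto

# usage: a sine buried in noise; the sine should end up mostly in `det`
rng = np.random.default_rng(2)
idx = np.arange(8000)
clean = np.sin(2 * np.pi * 0.05 * idx)
noisy = clean + 0.3 * rng.standard_normal(8000)
det, sto = lms_separate(noisy)
```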
In the flute case, mainly the first four spectral components evolve with the dynamic level, which means that the spectral centroid changes very little even though there are great changes between these components. This is the reason why the spectral centroid criterion does not work on flute sounds. In order to find a criterion better adapted to the flute case, we used the tristimulus criterion proposed by Pollard and Jansson [7]. This criterion consists in dividing the total loudness of the sound into three contributions: one (N1) which takes into account the fundamental component, a second (N2) which takes into account the second to fourth components, and a third (N3) which takes into account the rest of the components. The tristimulus is then given by the three contributions normalized so that their sum is one. This means that the tristimulus can be plotted in a two-dimensional diagram where the x axis (corresponding to the normalized high-frequency contribution) is the abscissa and the y axis (corresponding to the mid-frequency partials) is the ordinate, the fundamental contribution being implicit. The tristimulus corresponding to the flute case and to a matched waveshaping sound is shown in Figure 3.

Figure 3. Tristimulus diagram with synthesized (*) and real flute sounds at different dynamic levels.

By fitting the two curves, we find a correspondence between the index and the driving pressure: the index varies linearly with the logarithm of the driving pressure.

2.2.2.3 Modeling the stochastic part of the source

To model the stochastic part of the source, we assume that the process is stationary and ergodic. This means that we can characterize the noise by its probability density function and its power spectral density. In the flute case, the probability density function follows an exponential law [11], while the power spectral density corresponds to a low-pass filtered noise.
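Both source components can then be sketched together: a Chebyshev-polynomial distortion function for the waveshaping of the deterministic part (so that each coefficient sets one harmonic when the index is one), and noise with exponentially distributed magnitudes passed through a one-pole low-pass for the stochastic part. Coefficients, cutoff and mixing level are illustrative, not measured flute values; following the fit above, the index would in practice vary with the logarithm of the driving pressure, whereas here it is fixed:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def waveshape(f0, fs, dur, index, coefs):
    """Deterministic part: g(I*cos) with g a Chebyshev-polynomial
    distortion function, so coefs[k] sets the level of harmonic k when
    the index I is 1 (waveshaping after [1][6])."""
    t = np.arange(int(dur * fs)) / fs
    g = C.Chebyshev(coefs)                   # g = sum_k coefs[k] * T_k
    return g(index * np.cos(2 * np.pi * f0 * t))

def source_noise(n, cutoff=0.1, seed=3):
    """Stochastic part: noise with exponentially distributed magnitudes,
    smoothed by a one-pole low-pass (illustrative cutoff)."""
    rng = np.random.default_rng(seed)
    w = rng.exponential(1.0, n) * rng.choice([-1.0, 1.0], n)
    y = np.zeros(n)
    for i in range(1, n):
        y[i] = (1 - cutoff) * y[i - 1] + cutoff * w[i]
    return y

# usage: harmonics at 440/880/1320 Hz plus a little breath-like noise
fs = 16000
harm = waveshape(440.0, fs, 0.5, index=1.0, coefs=[0.0, 1.0, 0.3, 0.1])
flute_src = harm + 0.05 * source_noise(len(harm))
```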
The combination of the deterministic and the stochastic parts of the source gives the complete source model.

3 A hybrid model

The signal model simulating the source, combined with the physical model simulating the resonator of the instrument, gives a general hybrid model which can be applied to several instruments. In this section we shall see how the hybrid model can be applied to the flute case. In order to make it possible to play with this model, a convenient interface must be found to pilot the real-time model.

Figure 4. Close view of a flute with magnets and sensors.

By choosing a traditional instrument connected to a computer through magnetic sensors detecting the finger positions and a microphone at the embouchure level detecting the pressure variations, musicians can make use of playing techniques they have already acquired. The interface is illustrated in Figure 4. As already mentioned, sound modeling does not only consist in resynthesizing sounds, but also in performing intimate transformations on the sound without being constrained by the mechanics of the instrument. This is the most interesting aspect of the digital instrument. Thus, in order to perform sound transformations on this instrument, we act on the different parameters of the model, as illustrated in Figure 5.

Figure 5. The hybrid flute model and the control of its parameters (non-linear distortion function driven, through the logarithm of the driving pressure, by the waveshaping index; vibrato estimated from finger position and speed of motion; external source and noise generation with adjustable characteristics; key-pad noise; delay line and loop filter). The bold and italic indications in the figure show the modification possibilities.

4 Conclusion

In this paper we have shown how to make a sound model using a combination of signal and physical models. The physical models take into account the most relevant physical characteristics of the sound generating system, while the signal models account for perceptive effects. A way of designing physical models simulating the resonator of the instrument has been described. This model takes into account both the dispersion and the dissipation phenomena occurring during propagation in the medium; such effects are important from a perceptive point of view. Signal models were then used to model the source of the instrument, which was extracted from the sound by a deconvolution method.
We further proposed to split the source signal into a deterministic and a stochastic part by means of the LMS algorithm, and to model these contributions independently. The deterministic part was modeled by a waveshaping synthesis method in order to take into account the nonlinearities of the source signal; perceptive criteria were then used to find the parameters to feed the synthesis model. The stochastic part was easy to implement thanks to the separation between the source and the resonator: it can be modeled by linear filtering of a white noise. In fact, the stochastic part of the flute sound is colored because the noise propagates in the resonator. As mentioned in the beginning, sound modeling consists in both the resynthesis and the transformation of natural sounds. We have therefore shown how sounds can be manipulated by the proposed model, and how these manipulations can be done in real time and piloted by an interface which we designed.

References

[1] Arfib, D. Digital synthesis of complex spectra by means of multiplication of non-linear distorted sine waves. Journal of the Audio Engineering Society, 1979, Vol. 27, pp. 757-768.
[2] Beauchamp, J. W. Synthesis by Spectral Amplitude and Brightness Matching of Analyzed Musical Instrument Tones. Journal of the Audio Engineering Society, 1982, Vol. 30, No. 6.
[3] Guillemain, P. Analyse et modélisation de signaux sonores par des représentations temps-fréquence linéaires. PhD thesis, Université Aix-Marseille II, June 1994.
[4] Karplus, K. & Strong, A. Digital Synthesis of Plucked String and Drum Timbres. Computer Music Journal, 1983, Vol. 7, No. 2, pp. 43-55.
[5] Kronland-Martinet, R., Morlet, J., & Grossmann, A. Analysis of sound patterns through wavelet transforms. International Journal of Pattern Recognition and Artificial Intelligence, 1987, Vol. 1, No. 2, pp. 97-126.
[6] Le Brun, M. Digital waveshaping synthesis. Journal of the Audio Engineering Society, 1979, Vol. 27, pp. 250-266.

[7] Pollard, H. F. & Jansson, E. V. A Tristimulus Method for the Specification of Musical Timbre. Acustica, 1982, Vol. 51.
[8] Smith, J. O. Physical modeling using digital waveguides. Computer Music Journal, 1992, Vol. 16, No. 4, pp. 74-91.
[9] Widrow, B. & Stearns, S. D. Adaptive Signal Processing. Englewood Cliffs, Prentice-Hall, 1985.
[10] Yegnanarayana, B. Design of recursive group-delay filters by autoregressive modeling. IEEE Transactions on Acoustics, Speech, and Signal Processing, ASSP-30, pp. 632-637.
[11] Ystad, S. Sound Modeling using a combination of physical and signal models. PhD thesis, Université Aix-Marseille II, March 1998.