APPLICATION OF THE FAN-CHIRP TRANSFORM TO HYBRID SINUSOIDAL+NOISE MODELING OF POLYPHONIC AUDIO

16th European Signal Processing Conference (EUSIPCO 2008), Lausanne, Switzerland, August 25-29, 2008, copyright by EURASIP

Maciej Bartkowiak
Chair of Multimedia Telecommunications and Microelectronics, Poznan University of Technology
Polanka 3, 60-965, Poznan, Poland
phone: + (48-6) 665385, fax: + (48-6) 6653899, email: bartkow@multimedia.edu.pl
web: www.multimedia.edu.pl

ABSTRACT

Reliable classification of spectral peaks as tonal or noise-related is an important stage of hybrid sinusoidal+noise modeling. Spectral peaks of higher harmonics are often missed due to their wide frequency spread resulting from pitch variation. The recently introduced fan-chirp transform allows for compensating the changes of fundamental frequency in the process of spectral analysis of speech and harmonic sounds. In the case of polyphonic audio the fundamental is often not unique and/or is hard to estimate. We propose a simple technique for estimating chirp rates from already detected partials in order to improve the detection of higher harmonics through the application of frequency warping and fan-chirp analysis.

1. INTRODUCTION

Sinusoidal modeling is a well established signal processing framework applicable to speech and audio analysis, enhancement, restoration, source separation, automatic recognition, watermarking, compression, and synthesis [1]. Sinusoidal+noise (SN) modeling is an important member of the family of hybrid techniques that use different models to efficiently represent different classes of signal components. Within an SN model, a short segment of audio data is modeled as a sum of K quasi-sinusoids with continuously varying magnitudes and frequencies (called the deterministic component), and a stochastic component (noise), whose short-time power spectral envelope changes over time,

$$\hat{x}(t) = \underbrace{\sum_{k=1}^{K} A_k(t)\,\sin\!\Big(\varphi_k + 2\pi \int_0^t f_k(\tau)\,d\tau\Big)}_{\text{deterministic component}} + \underbrace{h_n(t) \ast \xi(t)}_{\text{noise component}}. \tag{1}$$

In fact, this distinction is not so much critical from the perceptual point of view as it is important for representation efficiency (in applications related to compression) and flexibility (in applications involving sound manipulations).

In general, the separation of the tonal (sinusoidal) and stochastic (noise) components is a difficult problem. First of all, the bulk of spectral components observed in natural audio exhibit only a certain degree of coherence in the time evolution of phase and instantaneous frequency. Consequently, most of them are neither purely sinusoidal nor purely random. A common approach to the separation problem is to model the greatest possible part of the signal energy by the deterministic component, under certain constraints (e.g. $f_k$ being a harmonic series [2], which strongly narrows the range of applications). A residual signal is obtained by plain (time-domain) or spectral subtraction of the reconstructed sinusoids from the original signal. It is subsequently modeled as the stochastic component.

A more flexible approach is to perform a classification of spectral peaks (lobes surrounding local maxima of the magnitude short-time spectrum) into tonal and non-tonal according to their shape. For example, Rodet [3] proposes a measure of sinusoidality based on complex cross-correlation of the short-time spectrum and the DFT of the analysis window. This approach is limited to stationary sinusoids, whereas time-varying components often exist in natural audio (fig. 1).

Figure 1: Narrowband spectrogram of an example music excerpt showing a significant frequency spread of energy related to higher harmonics due to pitch variations. An analysis window of 4096 samples is necessary here to resolve low-frequency partials. (Horizontal axis: Time [s].)

Lagrange et al. [4] estimate the degree of local amplitude and frequency modulation using the time-frequency reassignment method of Auger and Flandrin [5].
Subsequently, individual spectral peaks are cross-correlated with the DFT of a distorted window function, and the degree of sinusoidality is determined and used in peak classification. Zivanovic et al. [6,7] developed a peak classification system based on several local spectrum descriptors: normalized bandwidth (NBD), normalized duration (NDD), and frequency coherence (FCD). The distinction between sinusoidal peaks (main and side lobes) and noise is made by inspecting the combined descriptor values.

The fundamental problem with all the approaches mentioned above is that they work under the assumption that tonal energy manifests itself in the short-time spectrum as a distinct peak, allowing simple detection. In practice, this assumption hardly holds for instruments with free intonation (such as violin, trombone, etc.), as shown in fig. 1, because variations of pitch cause the energy of higher partials to be spread over a wide frequency range and to mutually overlap. The traditional DFT-based ML estimation method often fails at the task of musical spectrum analysis due to an inappropriate underlying model that assumes local stationarity of partials.

Musical scales of many bass instruments start at 7Hz, the commonly used range begins at about 45Hz. The high spectral resolution necessary for proper analysis of low-pitched sounds requires the use of long DFT windows (6- ms, i.e. - samples if $f_s$ = 44.1 kHz) in order to reliably resolve individual partials (cf. fig. 1). In a typical situation, the instantaneous frequencies of partials change significantly during such a long period, so they are no longer observable as narrow spectral lines.

Hence, it is reasonable to seek locally adaptive TF analysis methods [5,8,9], which commonly attempt to model the non-stationary spectral content on a chirp basis. Among the many chirp transforms and chirp estimation techniques proposed hitherto, which often exhibit high computational complexity, the fan-chirp transform (FChT) introduced by Kepesi and Weruaga [10,11] offers two fundamental advantages in the context of music analysis.
It allows for simultaneous adaptation to the pitch variations of all harmonics of a given sound, and its computational complexity is very low, enabling online processing. Developed primarily for the analysis of speech, the FChT computes the spectrum of a signal on a set of basis functions with fan-like geometry in the time-frequency plane. The short-time fan-chirp transform (STFChT) is defined as

$$X(k,\alpha) = \sum_{n=0}^{N-1} x(n)\,\phi'_\alpha(n)\,\exp\!\Big(-j\,\frac{2\pi k}{N}\,\phi_\alpha(n)\Big), \tag{2}$$

where $\phi_\alpha(n)$ is a time-frequency warping operator,

$$\phi_\alpha(n) = \big(1 + 0.5\,\alpha\,(n - N)\big)\,n, \tag{3}$$

and $\alpha$ is the skew parameter corresponding to the chirp rate. In fact, the STFChT of a given signal is equivalent to the DFT of the same signal sampled on a non-uniform grid obtained by inverting the warping operator (3). Therefore a fast implementation is possible which requires just a resampling step followed by an FFT [10,11]. Since the mapping (3) is bijective in [0..N], the transform is reversible, provided no aliasing terms are introduced in the process of resampling. These aliasing terms may be avoided by appropriate upsampling of the original signal prior to warping.

2. MODELING OF POLYPHONIC MUSIC

2.1 The problem of fundamental frequency

The STFChT is able to resolve harmonic partials whose frequency deviation within the analysis window is greater than the spacing between the corresponding mean frequencies. This is possible under the condition that an appropriate value of $\alpha$ is used, one that corresponds to the rate of change of the fundamental frequency, and that $|\alpha| < 2/N$. In the context of speech analysis, it may be approximated as

$$\alpha = \frac{f_0'(t)}{f_0(t)} \approx \frac{f_0(n+1) - f_0(n-1)}{f_0(n)}, \tag{4}$$

where $f_0(n)$ denotes the fundamental frequency estimated within a symmetric time window centered around n. Several techniques for the FChT-supported estimation of the fundamental, using either an inter-frame or an intra-frame approach, are described in [10,11].

In the context of polyphonic music, $f_0$ is not unique due to the presence of multiple sounds of different pitch, often generated by different instruments.
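To make the resample-then-FFT evaluation of equations (2) and (3) more concrete, a minimal NumPy sketch follows. It is an illustration under stated assumptions, not a reference implementation: the function names (`warp`, `warp_inverse`, `stfcht_direct`, `stfcht_fast`) are hypothetical, plain linear interpolation stands in for a band-limited resampler with upsampling, and the test chirp and its parameters are arbitrary.

```python
import numpy as np


def warp(n, alpha, N):
    """Warping operator of eq. (3): phi_alpha(n) = (1 + 0.5*alpha*(n - N)) * n."""
    return (1.0 + 0.5 * alpha * (n - N)) * n


def warp_inverse(u, alpha, N):
    """Solve phi_alpha(t) = u for t in [0, N]; assumes |alpha| < 2/N."""
    b = 1.0 - 0.5 * alpha * N
    # Root of 0.5*alpha*t^2 + b*t - u = 0, written so that alpha = 0 gives t = u.
    return 2.0 * u / (b + np.sqrt(b * b + 2.0 * alpha * u))


def stfcht_direct(x, alpha):
    """Literal O(N^2) evaluation of eq. (2) for bins k = 0..N/2."""
    N = len(x)
    n = np.arange(N)
    phi = warp(n, alpha, N)
    dphi = 1.0 + alpha * (n - 0.5 * N)           # derivative of phi_alpha
    k = np.arange(N // 2 + 1)[:, None]
    return (x * dphi * np.exp(-2j * np.pi * k * phi / N)).sum(axis=1)


def stfcht_fast(x, alpha):
    """Resample x on the grid phi_alpha^{-1}(m), m = 0..N-1, then take an FFT."""
    N = len(x)
    n = np.arange(N, dtype=float)
    t = warp_inverse(n, alpha, N)
    y = np.interp(t, n, x)                        # linear interpolation (assumption)
    return np.fft.rfft(y)


# Test signal: a single partial whose frequency follows the fan geometry exactly.
N, k0 = 1024, 40
alpha = 1.0 / N
n = np.arange(N)
x = np.cos(2.0 * np.pi * k0 * warp(n, alpha, N) / N)

dft_mag = np.abs(np.fft.rfft(x))
fast_mag = np.abs(stfcht_fast(x, alpha))
direct_mag = np.abs(stfcht_direct(x, alpha))

print("DFT peak magnitude:    ", dft_mag.max())
print("FChT peak bin (fast):  ", int(fast_mag.argmax()))
print("FChT peak bin (direct):", int(direct_mag.argmax()))
```

With a matching skew parameter the chirp collapses into a single narrow peak at bin `k0`, whereas the plain DFT spreads its energy over the range of bins swept by the instantaneous frequency. The direct and the fast variant agree up to the interpolation error of the resampler.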
The issue of multiple pitch estimation from polyphonic audio has been addressed by many researchers (e.g. [12,13]) and is generally considered a difficult task. Furthermore, some musical instruments (like bells, glockenspiel or Rhodes piano) exhibit strongly inharmonic spectra, therefore their fundamental is undefined. It is important to note, however, that even without a strictly defined fundamental all the sinusoidal partials of pitched sounds follow a similar pattern in the time-frequency plane. Considering the partials of a harmonically rich sound, their individual chirp rate estimates taken relative to their mean frequency estimates are strictly related to the pitch change rate. Therefore, instead of (4), $\alpha$ may be estimated from the statistics of the individual chirp rates $\alpha_k$ of some lower-frequency partials detected before calculating the FChT. This is a feasible solution, since low partials usually exhibit more stable frequencies and are relatively easy to detect.

2.2 Estimation of individual partials

Partials with a limited depth of frequency modulation may often (but not always) be modeled as linear chirps. It is possible to estimate their mean frequency and individual chirp rate by using one of several techniques developed for sinusoidal modeling. For example, Abe and Smith [14] demonstrated that for a chirp expressed as

$$x(t) = A\,\exp\!\big(\gamma t + j(\varphi + \omega t + \beta t^2)\big), \tag{5}$$

weighted by a Gaussian window (as well as other windows), a non-zero frequency modulation term $\beta$ results in a quadratic shape of the log-amplitude and phase spectra. They proposed a quadratically interpolated FFT method for estimating $\omega$ and $\beta$,

$$\hat{\omega} = \frac{2\pi}{N}\Big(k - \frac{b}{2a}\Big), \qquad \hat{\beta} = p\,\frac{d}{a}, \tag{6}$$

from the parameters of a parabola fitted to the log-magnitude and phase spectrum surrounding the peaks, where

$$a = \tfrac{1}{2}\big(\log|X_{k+1}| - 2\log|X_k| + \log|X_{k-1}|\big), \quad b = \tfrac{1}{2}\big(\log|X_{k+1}| - \log|X_{k-1}|\big), \quad d = \tfrac{1}{2}\big(\angle X_{k+1} - 2\angle X_k + \angle X_{k-1}\big), \tag{7}$$

$$p = \frac{\pi^2}{N^2}\,\frac{d}{a^2 + b^2}, \tag{8}$$

and k is the index of the FFT bin corresponding to the local maximum of magnitude.

2.3 Chirp rate estimation for groups of partials

We propose a two-stage analysis procedure for sinusoidal modeling of polyphonic music. The main idea is to perform a standard analysis first, using the DFT for the detection of reliable low-frequency partials and estimation of their parameters $\omega_k$ and $\beta_k$. Subsequently, the non-stationary high-frequency partials are detected and their parameters are estimated by the use of FChT analysis, taking into account several of the most interesting values of the chirp rate $\alpha$, i.e. those values that most probably correspond to the local time-frequency skewness related to the underlying pitch modulation.

Let us assume that sounds coming from different instruments with different pitch variation are present simultaneously in the current analysis frame. Its spectrum shows a mixture of harmonic and inharmonic series of partials. We observe that the estimates of the individual chirp rates $\alpha_k = \beta_k/\omega_k$ of individual partials follow a multi-modal distribution that may be approximated by a Gaussian mixture model (GMM),

$$p_\alpha(\alpha) = \sum_m w_m\, p(\alpha \mid \phi_m), \tag{9a}$$

where

$$p(\alpha \mid \phi_m) = \frac{1}{\sigma_m \sqrt{2\pi}}\,\exp\!\Big(-\frac{(\alpha - \mu_m)^2}{2\sigma_m^2}\Big), \tag{9b}$$

and $\phi_m$ denotes a certain state of the model representing a group of partials sharing a common chirp rate. The weights $w_m$ are not explicitly known, but may be regarded as representing the bulk of partials exhibiting similar temporal evolution, thus they may be estimated from the heights of the empirical distribution modes. Our aim is to find the values of $\mu_m$, which are the interesting chirp rates that may reveal additional high-frequency partials due to the time-frequency warping inherent in the FChT. We estimate the values of $\mu_m$ by employing an iterative algorithm based on the Expectation Maximization method.
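For orientation, a textbook EM fit of the one-dimensional mixture (9a)-(9b) is sketched below. This is the standard Gaussian EM update, swapped in for illustration only; it is not the inverse-distance weighting iteration described next. The function name `em_gmm_1d`, the initial spread `sigma_init`, the variance floor and the toy chirp-rate values are all assumptions.

```python
import numpy as np


def em_gmm_1d(alpha, mu_init, sigma_init=0.05, n_iter=50, sigma_floor=1e-4):
    """Fit weights w_m, means mu_m and spreads sigma_m of a 1-D Gaussian mixture."""
    alpha = np.asarray(alpha, dtype=float)
    mu = np.array(mu_init, dtype=float)
    sigma = np.full(mu.shape, sigma_init)
    w = np.full(mu.shape, 1.0 / mu.size)
    for _ in range(n_iter):
        # E-step: responsibility of each state m for each observed chirp rate.
        z = (alpha[None, :] - mu[:, None]) / sigma[:, None]
        pdf = np.exp(-0.5 * z ** 2) / (sigma[:, None] * np.sqrt(2.0 * np.pi))
        resp = w[:, None] * pdf
        resp /= resp.sum(axis=0, keepdims=True) + 1e-300
        # M-step: re-estimate the parameters from the soft assignments.
        nm = resp.sum(axis=1) + 1e-300
        mu = (resp * alpha).sum(axis=1) / nm
        var = (resp * (alpha[None, :] - mu[:, None]) ** 2).sum(axis=1) / nm
        sigma = np.maximum(np.sqrt(var), sigma_floor)
        w = nm / alpha.size
    return w, mu, sigma


# Toy data (hypothetical): two groups of partials with chirp rates
# clustered around -0.2 and +0.3.
alphas = np.concatenate([np.linspace(-0.21, -0.19, 20),
                         np.linspace(0.29, 0.31, 30)])
w, mu, sigma = em_gmm_1d(alphas, mu_init=[-0.15, 0.25])
print("weights:", w)
print("means:  ", mu)
```

The initial means play the role of the histogram peak locations; in this toy case the fitted means settle on the two cluster centres and the weights on the relative cluster sizes.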
The proposed procedure starts with a classical sinusoidal analysis of a given audio frame, with an optional peak verification in order to reject peaks induced by noise [6,7]. Initially, for each frame we gather the observed values of $\alpha_k = \beta_k/\omega_k$ and form a pdf estimate (fig. 2) by the use of a histogram smoothing method [15]. The locations of the peaks of this pdf estimate are the initial estimates of $\mu_m$, which may be iteratively improved as follows:

- For each sample $\alpha_k$, calculate its distance to every $\mu_m$.
- Calculate new estimates of $\mu_m$ by weighted averaging of the values of $\alpha_k$, with weights inversely proportional to the distances.
- Iterate until there is no significant change in $\mu_m$.

The results of such iterations (fig. 2) are the values of the chirp rates that may be applied within the second stage, which employs FChT analysis for enhanced estimation of non-stationary high-frequency partials. We have observed experimentally that for real-world music the values of $\alpha$ are usually constrained in the range of <- >, and most often do not exceed .5.

Figure 2: Above: distribution of the estimated values of $\alpha$ for a single frame of the test signal (fig. 1), shown as a pdf estimate together with the Gaussian mixture model. Below: estimated values of $\alpha$ in consecutive frames. (Horizontal axes: $\alpha$ above, Frame No. below.)

2.4 FChT-based music analysis

Music spectrum analysis with the FChT offers the possibility of revealing otherwise hidden spectral peaks related to non-stationary high-frequency partials. It also offers an enhanced estimation of the parameters of lower partials, because the frequency deviations are compensated by the time-frequency skew inherent in the fan-chirp basis functions. Since the chirp rates $\alpha$ are estimated in the first stage of the proposed technique (sec. 2.3), it is necessary to calculate the FChT only for those few values of $\alpha$, which is a computationally feasible operation.
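The first-stage estimates of $\omega$ and $\beta$ (section 2.2), from which the candidate chirp rates are derived, can be illustrated by a short NumPy sketch of quadratic interpolation around a spectral peak. The assumptions are: a Gaussian window $\exp(-p t^2)$ with known $p$, zero-phase windowing (the window centre is moved to index 0 before the FFT), a single complex chirp without amplitude modulation, and $\hat{\beta} = p\,d/a$ as the chirp estimate. The names `qifft_peak` and `sigma_t` and all test values are hypothetical.

```python
import numpy as np


def qifft_peak(x, p):
    """Estimate frequency [rad/sample] and chirp term beta of the strongest peak."""
    N = len(x)
    t = np.arange(N) - N // 2                      # time axis centred on the frame
    xw = x * np.exp(-p * t ** 2)                   # Gaussian window with parameter p
    X = np.fft.fft(np.fft.ifftshift(xw))           # zero-phase spectrum
    k = int(np.argmax(np.abs(X[1:N // 2]))) + 1    # peak bin, DC excluded
    logmag = np.log(np.abs(X[k - 1:k + 2]))
    phase = np.unwrap(np.angle(X[k - 1:k + 2]))
    a = (logmag[2] - 2.0 * logmag[1] + logmag[0]) / 2.0   # log-magnitude curvature
    b = (logmag[2] - logmag[0]) / 2.0                      # log-magnitude slope
    d = (phase[2] - 2.0 * phase[1] + phase[0]) / 2.0       # phase curvature
    omega_hat = 2.0 * np.pi / N * (k - b / (2.0 * a))
    beta_hat = p * d / a
    return omega_hat, beta_hat


# Synthetic complex chirp with phase 0.7 + omega*t + beta*t^2 (values are arbitrary).
N = 1024
sigma_t = N / 8.0                                  # window spread in samples
p = 1.0 / (2.0 * sigma_t ** 2)
omega = 2.0 * np.pi * 100.3 / N
beta = 2e-5
t = np.arange(N) - N // 2
x = np.exp(1j * (0.7 + omega * t + beta * t ** 2))

omega_hat, beta_hat = qifft_peak(x, p)
print("omega:", omega, "estimate:", omega_hat)
print("beta: ", beta, "estimate:", beta_hat)
```

For a Gaussian-windowed linear chirp both the log-magnitude and the phase are exactly parabolic around the peak, so three bins suffice in this idealised case. With other windows or with overlapping partials the estimates become biased, which is the situation addressed by the second, FChT-based stage.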
Our peak detection and estimation algorithm relies on the observation that, for each sinusoidal partial with varying frequency, the highest value of the corresponding peak in the magnitude spectrum is offered by the output of the FChT with the value of the $\alpha$ parameter that is closest to the first-order approximation of the real frequency variation function. In other words, the closer the chirp rate used in the fan-chirp transform is to the real frequency change rate, the more the spectrum resembles the spectrum of a sinusoid. The algorithm for the analysis is very straightforward:

1. For a data segment x of N samples, initialize a vector of peaks P[] with N/2 zeros. Also, insert the peak values estimated from the DFT analysis in the first stage into the locations corresponding to the DFT bin numbers.
2. For the first candidate value of $\alpha$ estimated as described in 2.3, calculate the result of FChT(x,$\alpha$).
3. Find all sinusoidal peaks in the FChT output, according to the chosen peak detection criteria.
4. For each of those peaks, compare its magnitude to the magnitude of the corresponding peak already gathered in the vector P. If the magnitude is higher, it means that a better approximation of the corresponding partial has been found. In such a case, replace the existing peak with the new peak from the FChT result. Also, collect the neighboring spectral data and write it to the entries of P. Label the peak with the current chirp rate, $\alpha$.
5. Iterate steps 2..4 with subsequent values of $\alpha$ from the set.
6. For each of the peaks gathered in the vector P, calculate the corrected $\omega$ and $\beta$, according to (6-8). Correct the value of $\beta$ by taking into account the chirp rate $\alpha$ used for the particular peak.

Note that the above procedure does not guarantee that all hidden partials are detected. Unfortunately, some groups of highly non-stationary sinusoids may be missed if none of them has been detected in the first stage so that it could contribute to the estimation of the optimal skew parameter $\alpha$.

3. EXPERIMENTAL RESULTS

3.1 Synthetic signal

In order to verify the procedure proposed in section 2.3, a simple test has been set up. An artificial signal has been constructed by summing two inharmonic spectra of two bell sounds with deeply modulated pitch, synthesized using the FM synthesis technique (fig. 3). As can easily be observed, the deep frequency deviation causes most of the high-frequency partials to be blurred to a significant degree. Clearly, the spectrum of this signal contains at least two groups of partials, and the distribution of $\alpha_k$ should reveal in each frame at least two modes of the pdf, corresponding to the different frequency modulation patterns.

Figure 3: Spectrogram of the synthetic signal. The black frame shows a data segment of N=4096 samples, further analyzed in fig. 4. (Axes: Time, Frequency.)

Experiments show that for this synthetic signal about lowest harmonics are detected reliably in the first stage of sinusoidal analysis. In fact, due to overlapping partials, the estimation of $\omega_k$ and $\beta_k$ is not free of errors, therefore the actual values of $\alpha_k$ are slightly biased. The resulting chirp spectra are shown in fig. 4.

Figure 4: Comparison of standard DFT (above, black) and FChT analysis (below) with two values of automatically estimated $\alpha$ (shown in blue and red). In both cases, the analysis window is Hamming, 4096 samples. (Vertical axes: Magnitude [dB].)

As can clearly be seen, most of the high-frequency partials that are entirely indiscernible in the DFT output become quite visible in the result of the FChT. It is important to note that fan-chirp analysis made it possible to discriminate partials that are very close in frequency but differ mostly in the chirp rate, $\alpha$.

3.2 Analysis of real music

A series of experiments with various excerpts of popular and classical music from the EBU SQAM reference CD has been performed in order to verify the effectiveness of the new sinusoidal analysis technique in real-life applications. In each experiment, a benchmark was created from the results of standard sinusoidal analysis with an additional peak selection procedure based on spectral descriptors (NBD+FCD). The results of FChT-based analysis compared favorably with the benchmark, since many existing partials have been detected in the high frequency range. Moreover, the more robust peak detection due to the chirp analysis allowed the detection thresholds to be changed to stricter settings. Thanks to this, far fewer false partials were detected due to the spectral energy induced by noise.

A sample comparison from these experiments is shown in fig. 5. In the upper plot we show the sinusoidal partials detected with the standard DFT method followed by peak classification based on spectral descriptors. This typical result reveals serious deficiencies of the analysis technique.
Most of the high-frequency partials have not been properly detected due to the vibrato modulation, while there are many false multiple partials in the range of -5Hz induced by the irregular spectral peaks which are the side lobes of deeply modulated harmonics. It is worth mentioning that a simpler analysis without the application of spectral descriptors (not shown here) gives even worse results. In the lower plot, the results of FChT-based analysis show many of the partials in the high frequency range being properly detected, and the number of false partials is also significantly reduced.

Figure 5: Comparison of sinusoidal partial detection based on the standard DFT technique (above) and the proposed technique exploiting fan-chirp transform analysis (below). These plots should be compared with figure 1. (Horizontal axes: Time [s].)

One significant disadvantage of the proposed new technique for sinusoidal analysis is the additional computational burden related to the necessary calculation of several fan-chirp transforms. However, detection and estimation are usually not very demanding in terms of computational complexity compared to tracking, whose complexity is often data-dependent. Since our analysis results in much cleaner data being input to the tracking algorithm, the total operation time of a modeling system may not increase significantly.

4. CONCLUSIONS

A computationally feasible application of the fan-chirp transform to hybrid sinusoidal+noise modeling of polyphonic music has been presented in the paper. A very simple technique has been proposed for estimation of the frequency warping parameter $\alpha$ that does not require pitch estimation. Experimental results confirm that a substantial improvement in the detection of highly non-stationary partials is achieved, which enables good-quality modeling of wideband audio without restrictions regarding harmonicity.

5. ACKNOWLEDGMENTS

This work was supported by the research grant 3 TD 7 3 of the Polish Ministry of Science and Higher Education.

REFERENCES

[1] J. Beauchamp (ed.), Analysis, Synthesis, and Perception of Musical Sounds: The Sound of Music, Springer, 2006.
[2] X. Serra, J.O. Smith, "Spectral modelling synthesis: A sound analysis/synthesis system based on deterministic plus stochastic decomposition", Computer Music Journal, 14(4), 1990, pp. 12-24.
[3] X. Rodet, "Musical sound signal analysis/synthesis: Sinusoidal + residual and elementary waveform models", IEEE Time-Frequency and Time-Scale Workshop, TFTS'97, Coventry, UK, August 1997.
[4] M. Lagrange, S. Marchand, J-B. Rault, "Sinusoidal parameter extraction and component selection in a non-stationary model", Proc. DAFx'02, Hamburg, 2002, pp. 59-64.
[5] F. Auger, P. Flandrin, "Improving the readability of time-frequency and time-scale representations by the reassignment method", Proc. ICASSP'95, May 1995, vol. 4, pp. 6889.
[6] A. Röbel, M. Zivanovic, X. Rodet, "Signal decomposition by means of classification of spectral peaks", Proc. ICMC'04, Miami, 2004.
[7] M. Zivanovic, A. Röbel, X. Rodet, "Adaptive threshold determination for spectral peak classification", Proc. DAFx'07, Bordeaux, 2007.
[8] S. Mann, S. Haykin, "Adaptive chirplet transform: an adaptive generalization of the wavelet transform", Optical Engineering, vol. 31, no. 6, pp. 1243-1256, June 1992.
[9] X-G. Xia, "Discrete chirp-Fourier transform and its applications to chirp rate estimation", IEEE Trans. Sig. Proc., vol. 48, no. 11, pp. 3122-3133, November 2000.
[10] M. Kepesi, L. Weruaga, "Adaptive chirp-based time-frequency analysis of speech signals", Speech Comm., vol. 48, pp. 474-492, 2006.
[11] L. Weruaga, M. Kepesi, "The fan-chirp transform for non-stationary harmonic sounds", Signal Proc., vol. 87, pp. 1504-1522, 2007.
[12] P.J. Walmsley, S.J. Godsill, P.J.W. Rayner, "Polyphonic pitch tracking using joint Bayesian estimation of multiple frame parameters", Proc. IEEE Workshop on Audio and Acoustics, Mohonk, NY State, 1999.
[13] Y. Chunghsin, A. Röbel, X. Rodet, "Multiple fundamental frequency estimation of polyphonic music signals", Proc. ICASSP'05, March 2005, vol. 3, pp. 225-228.
[14] M. Abe, J.O. Smith, "Design criteria for the quadratically interpolated FFT method (III): Bias due to amplitude and frequency modulation", CCRMA Rep. STAN-M-116, October 2004.
[15] W. Härdle, Smoothing Techniques: With Implementation in S, Springer-Verlag, Berlin, 1991.