ELEN E4896 MUSIC SIGNAL PROCESSING
Lecture 5: Sinusoidal Modeling

1. Sinusoidal Modeling
2. Sinusoidal Analysis
3. Sinusoidal Synthesis & Modification
4. Noise Residual

Dan Ellis
Dept. Electrical Engineering, Columbia University
dpwe@ee.columbia.edu
http://www.ee.columbia.edu/~dpwe/e4896/

E4896 Music Signal Processing (Dan Ellis) 2013-02-18 - 1/16

1. Sinusoidal Modeling

Periodic sounds appear as ridges in the spectrogram
  each ridge is a sinusoidal harmonic..
  .. with smoothly-varying parameters
[Figure: spectrogram of Violin.arco.ff.A4]
.. an efficient & flexible description?

Sinusoid Modeling

Analogous to Fourier series: model harmonics explicitly?
  e.g. $x[n] = \sum_k a_k[n] \cos(\omega_k[n]\, n)$
  .. for pitched signal with fundamental $\omega_0[n]$
Additional constraints
  harmonicity: $\omega_k[n] = k\,\omega_0[n]$
  smoothness of $a_k[n]$
Arbitrarily accurate given enough sinusoids

Examples

Using Michael Klingbeil's SPEAR
http://www.klingbeil.com/spear/

Envelope Limitations

Extracted envelope reflects analysis window
[Figure: two examples, each with an amplitude envelope panel and a spectrogram panel (Frequency vs. Time)]
Sharp window violates assumptions

2. Sinusoidal Analysis

Sinusoids = peaks in spectrogram slices = DFT frames
  $X[k, m] = \sum_{n=0}^{N-1} x[n + mL]\, w[n]\, e^{-j 2\pi k n / N}$
DFT length N
  window determines frequency resolution: $X(e^{j\omega}) * W(e^{j\omega})$ (convolution with the window transform)
  long enough to see harmonics, e.g. 2-3x longest pitch cycle
  typically 50-100 ms
  but: too long blurs amplitude envelope $a_k[n]$
Hop advance L
  choose N/2 or N/4..
  .. denser for simpler interpolation along time

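The DFT-frame definition on this slide can be sketched directly in NumPy. This is only an illustration under assumed settings: the Hann window, N = 1024, L = N/4, the 16 kHz sample rate, and the helper name `stft_frames` are choices made for the example, not values from the lecture.

```python
import numpy as np

def stft_frames(x, N=1024, L=256):
    """Return X[k, m] = sum_n x[n + m*L] * w[n] * exp(-j*2*pi*k*n/N).

    Only the non-negative frequency bins k = 0..N/2 are kept (real input).
    The Hann window is an assumption; the slide does not name a window.
    """
    w = np.hanning(N)
    n_frames = 1 + (len(x) - N) // L
    X = np.empty((N // 2 + 1, n_frames), dtype=complex)
    for m in range(n_frames):
        X[:, m] = np.fft.rfft(x[m * L : m * L + N] * w)
    return X

# Example: a steady 440 Hz cosine, analysed with a 64 ms window.
sr = 16000
t = np.arange(sr) / sr
x = np.cos(2 * np.pi * 440.0 * t)
X = stft_frames(x, N=1024, L=256)

# The largest-magnitude bin of the first frame sits next to 440 Hz.
peak_bin = int(np.argmax(np.abs(X[:, 0])))
peak_hz = peak_bin * sr / 1024
```

The bin spacing here is sr/N, about 15.6 Hz, so the raw peak bin only locates the sinusoid to within half a bin.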
Sinusoidal Peak Picking

Local maxima in DFT frames
[Figure: spectrogram (freq / Hz vs. time / s) and one DFT frame (level / dB vs. freq / Hz) with local maxima marked]
Quadratic fit for sub-bin resolution
  $y = a\,x\,(x - b)$, with its extremum $y = -ab^2/4$ at $x = b/2$
[Figure: parabola fitted around a spectral peak; level / dB and phase / rad vs. freq / Hz]

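A minimal sketch of the quadratic fit, assuming the common three-point form: a parabola is passed through the peak bin and its two neighbours on the dB spectrum, and its vertex gives the sub-bin frequency and level. The function name `parabolic_peak`, the Hann window, and the test signal are assumptions made for illustration.

```python
import numpy as np

def parabolic_peak(y, k):
    """Vertex of the parabola through (k-1, y[k-1]), (k, y[k]), (k+1, y[k+1]).

    Returns (fractional bin position, interpolated level).
    """
    a, b, c = y[k - 1], y[k], y[k + 1]
    p = 0.5 * (a - c) / (a - 2.0 * b + c)   # vertex offset in bins, within +/- 0.5
    return k + p, b - 0.25 * (a - c) * p

# One Hann-windowed frame of a 440 Hz cosine (assumed test signal).
sr, N = 16000, 1024
n = np.arange(N)
frame = np.cos(2 * np.pi * 440.0 * n / sr) * np.hanning(N)
level_db = 20 * np.log10(np.abs(np.fft.rfft(frame)) + 1e-12)

k = int(np.argmax(level_db))            # coarse local maximum
k_interp, peak_db = parabolic_peak(level_db, k)
f_coarse = k * sr / N                   # bin-centre frequency
f_interp = k_interp * sr / N            # refined frequency
```

Fitting on the log-magnitude spectrum is a design choice here: the main lobe of a smooth window is closer to a parabola in dB than in linear magnitude.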
Peak Selection

Don't want every peak, just "true" sinusoids
  threshold?
  local shape: fits $W(e^{j\omega})$ (the window transform)?
[Figure: one DFT frame, level / dB vs. freq / Hz]
Look for stability of frequency & amplitude in successive time frames
  phase derivative in time/freq

Track Formation

Connect peaks in adjacent frames to form sinusoids
  can be ambiguous if large frequency changes
[Figure: freq vs. time, showing existing tracks, new peaks, a track birth and a track death]
Unclaimed peak: create new track
No continuation of track: termination
  hysteresis

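The birth/continuation/death logic can be sketched as a greedy nearest-frequency matcher. This is a simplified illustration, not the lecture's algorithm: the function name `form_tracks`, the 50 Hz jump limit, and the input layout are assumptions, and hysteresis (letting a track survive a missing frame) is deliberately left out.

```python
def form_tracks(frames, max_jump_hz=50.0):
    """Link peaks across frames into tracks.

    frames: list with one entry per frame, each a list of peak frequencies (Hz).
    Returns a list of tracks, each a list of (frame_index, freq_hz) pairs.
    """
    finished, active = [], []
    for m, peaks in enumerate(frames):
        # Every track/peak pairing within the jump limit, closest first.
        pairs = sorted(
            (abs(track[-1][1] - f), ti, pi)
            for ti, track in enumerate(active)
            for pi, f in enumerate(peaks)
            if abs(track[-1][1] - f) <= max_jump_hz
        )
        used_tracks, used_peaks = set(), set()
        for _, ti, pi in pairs:
            if ti not in used_tracks and pi not in used_peaks:
                active[ti].append((m, peaks[pi]))   # continuation
                used_tracks.add(ti)
                used_peaks.add(pi)
        # Death: tracks that found no continuation in this frame.
        finished += [t for ti, t in enumerate(active) if ti not in used_tracks]
        active = [t for ti, t in enumerate(active) if ti in used_tracks]
        # Birth: every unclaimed peak starts a new track.
        active += [[(m, f)] for pi, f in enumerate(peaks) if pi not in used_peaks]
    return finished + active

# Made-up peak lists: one slowly rising partial, one that dies, one late birth.
frames = [[440.0, 880.0], [445.0, 870.0], [450.0], [455.0, 1320.0]]
tracks = form_tracks(frames)
```

Sorting all candidate pairings by distance resolves the simplest ambiguities, but two partials crossing in frequency would still be linked by proximity alone.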
Pitch Tracking

Extracted sinusoids could be anywhere..
  .. but often expect them to be in harmonic series
[Figure: sinusoid tracks, freq / Hz vs. time / s, with a zoomed view of one track]
Find pitch by searching for common factor
  can then regularize: $\omega_k[n] = k\,\omega_0[n]$

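One way to search for a common factor is a brute-force scan over candidate fundamentals, sketched below. Everything specific here is an assumption for illustration: the name `estimate_f0`, the 50-1000 Hz search range, the 3% mistuning tolerance, and the rule of keeping the highest acceptable candidate (since f0/2, f0/3, ... fit the same peaks equally well).

```python
import numpy as np

def estimate_f0(peaks_hz, f0_min=50.0, f0_max=1000.0, step=0.5, tol=0.03):
    """Estimate a fundamental (Hz) such that every peak lies near some k * f0."""
    peaks = np.asarray(peaks_hz, dtype=float)
    candidates = np.arange(f0_min, f0_max + step, step)
    ratios = peaks[None, :] / candidates[:, None]
    harm = np.maximum(np.round(ratios), 1.0)          # nearest harmonic number
    # Worst relative mistuning of any peak from its nearest harmonic.
    err = np.max(np.abs(ratios - harm) / harm, axis=1)
    ok = np.flatnonzero(err < tol)
    if ok.size == 0:
        return None
    f0 = candidates[ok[-1]]                           # highest acceptable candidate
    # Refine: least-squares fit of peaks ~ k * f0 with the harmonic numbers fixed.
    k = np.maximum(np.round(peaks / f0), 1.0)
    return float(np.sum(k * peaks) / np.sum(k * k))

# Slightly mistuned harmonics of roughly 440 Hz (made-up values).
f0 = estimate_f0([441.0, 879.0, 1322.0, 1758.0])
```

Once f0 is known, each track can be regularized by replacing its frequency with the nearest k * f0.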
3. Sinusoidal Synthesis

Each sinusoid track $\{a_k[n], \omega_k[n]\}$ drives an oscillator
  $a_k[n] \cos(\omega_k[n] \cdot t)$
[Figure: amplitude track $a_k[n]$ (level vs. time / s), frequency track $\omega_k[n]$ (freq / Hz vs. time / s), and the resulting oscillator output]
  can interpolate amplitude, frequency samples
Faster method synthesizes DFT frames then overlap-add
  trickier to achieve frequency modulation

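A minimal oscillator sketch for one track, assuming linear interpolation of the frame-rate amplitude and frequency samples up to the audio rate. One deliberate difference from the slide's shorthand is labeled here: the phase is obtained by accumulating the interpolated frequency rather than multiplying frequency by time, which keeps the phase continuous when the frequency changes. The name `synth_track` and the example values are assumptions.

```python
import numpy as np

def synth_track(frame_times, amps, freqs_hz, sr):
    """Render one sinusoid track sampled at frame_times (seconds)."""
    n = np.arange(int(round(frame_times[-1] * sr)) + 1)
    t = n / sr
    a = np.interp(t, frame_times, amps)        # per-sample amplitude
    f = np.interp(t, frame_times, freqs_hz)    # per-sample frequency in Hz
    phase = 2 * np.pi * np.cumsum(f) / sr      # running sum approximates the integral of frequency
    return a * np.cos(phase)

# One track: fades in and out while gliding 440 -> 450 -> 440 Hz.
sr = 8000
frame_times = np.array([0.0, 0.1, 0.2])
amps = np.array([0.0, 1.0, 0.0])
freqs = np.array([440.0, 450.0, 440.0])
y = synth_track(frame_times, amps, freqs, sr)
```

A full resynthesis would sum `synth_track` outputs over all tracks; the original phases are ignored in this sketch.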
Sinusoidal Modification

Sinusoidal description very easy to modify
  e.g. changing time base of sample points
[Figure: sinusoid tracks, freq / Hz vs. time / s, on a modified time base]
Frequency stretch
  preserve formant envelope?
[Figure: harmonic levels, level / dB vs. freq / Hz, before and after frequency stretch]

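Both modifications operate only on the track parameters, as the sketch below illustrates. The helper names `stretch_time` and `shift_harmonics` are hypothetical, and the formant envelope is approximated, as an assumption, by straight lines between the original harmonic levels in dB.

```python
import numpy as np

def stretch_time(frame_times, factor):
    """Change the time base: the same parameter samples, placed at new times."""
    return np.asarray(frame_times, dtype=float) * factor

def shift_harmonics(freqs_hz, amps_db, ratio, keep_envelope=True):
    """Scale the harmonic frequencies of one frame by `ratio`.

    With keep_envelope=True the new levels are read off a piecewise-linear
    envelope through the original harmonics (end values are held outside
    the original frequency range), so the formant shape stays in place.
    """
    freqs = np.asarray(freqs_hz, dtype=float)
    amps = np.asarray(amps_db, dtype=float)
    new_freqs = freqs * ratio
    if keep_envelope:
        new_amps = np.interp(new_freqs, freqs, amps)
    else:
        new_amps = amps.copy()     # levels travel with the harmonics
    return new_freqs, new_amps

# One frame of a 200 Hz harmonic sound (made-up levels), shifted up a fifth.
freqs = np.array([200.0, 400.0, 600.0, 800.0])
amps_db = np.array([0.0, -6.0, -20.0, -12.0])
f2, a2 = shift_harmonics(freqs, amps_db, 1.5)
t2 = stretch_time([0.0, 0.1, 0.2], 2.0)
```

With `keep_envelope=False` the spectral shape scales along with the pitch, which is what a plain resampling of the waveform would do.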
4. Noise Residual

Some energy is not well fit with sinusoids
  e.g. noisy energy
Can just keep it as residual
  or model it some other way
Leads to "sinusoidal + noise" model
  $x[n] = \sum_k a_k[n] \cos(\omega_k[n]\, n) + e[n]$
[Figure: mag / dB vs. freq / Hz, showing original, sinusoids, residual, and LPC fit]

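The split into sinusoids plus residual e[n] can be illustrated on a short frame. The sketch below swaps in a simple stand-in for full track-based resynthesis: it assumes the sinusoid frequencies are already known and constant over the frame, and finds amplitudes and phases by least squares. The name `fit_and_subtract` and the synthetic test signal are assumptions.

```python
import numpy as np

def fit_and_subtract(x, freqs_hz, sr):
    """Least-squares fit of fixed-frequency sinusoids to x; returns (sines, residual)."""
    n = np.arange(len(x))
    cols = []
    for f in freqs_hz:
        # A cosine/sine pair covers any amplitude and phase at frequency f.
        cols.append(np.cos(2 * np.pi * f * n / sr))
        cols.append(np.sin(2 * np.pi * f * n / sr))
    basis = np.stack(cols, axis=1)
    coef, *_ = np.linalg.lstsq(basis, x, rcond=None)
    sines = basis @ coef
    return sines, x - sines

# Synthetic frame: two partials plus white noise (made-up values).
rng = np.random.default_rng(0)
sr = 8000
n = np.arange(2048)
clean = (0.8 * np.cos(2 * np.pi * 440 * n / sr + 0.3)
         + 0.4 * np.cos(2 * np.pi * 880 * n / sr - 1.0))
noise = 0.05 * rng.standard_normal(len(n))
x = clean + noise
sines, residual = fit_and_subtract(x, [440.0, 880.0], sr)
```

The residual could then be kept as-is or summarized by a compact noise model such as the LPC envelope named on the slide.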
Sinusoids + Noise Decomposition

Removing sines reveals noise & transients
[Figure: three spectrograms (Frequency vs. Time): Guitar - original; Guitar - sinusoid reconstruction; Guitar - residual (original - sines)]
Different representation approaches...

5. Limitations

The spectrogram (mag STFT) is not linear
  superpositions suffer from phase effects
[Figure: abs(stft(s1)) + abs(stft(s2)) compared with abs(stft(s1+s2)); freq / Hz vs. time / sec]
Separating sources is generally hard...
  parameters
  tracking

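The non-linearity is easy to reproduce numerically. In this sketch two cosines 4 Hz apart (an assumed test case) fall in the same DFT bin: the sum of their magnitude spectrograms is nearly constant in time, while the magnitude spectrogram of their sum beats as the relative phase rotates. The helper `mag_frames` and all parameter values are assumptions.

```python
import numpy as np

def mag_frames(x, N=512, L=128):
    """Magnitude STFT with a Hann window: rows are bins, columns are frames."""
    w = np.hanning(N)
    starts = range(0, len(x) - N + 1, L)
    return np.array([np.abs(np.fft.rfft(x[s:s + N] * w)) for s in starts]).T

sr = 8000
t = np.arange(sr) / sr
s1 = np.cos(2 * np.pi * 1000.0 * t)
s2 = np.cos(2 * np.pi * 1004.0 * t)     # 4 Hz away: unresolved, so they beat

sum_of_mags = mag_frames(s1) + mag_frames(s2)   # abs(stft(s1)) + abs(stft(s2))
mag_of_sum = mag_frames(s1 + s2)                # abs(stft(s1 + s2))

k = round(1000.0 * 512 / sr)            # bin containing both components
beat_depth = mag_of_sum[k].min() / mag_of_sum[k].max()
```

The magnitude of the sum can never exceed the sum of magnitudes (triangle inequality), but it can fall far below it when the components are out of phase.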
Summary

Spectrogram shows sinusoid harmonics in many sounds
Peak picking in spectrogram can effectively extract them
Sinusoidal domain extremely flexible for modification
Noise residual can add even more realism