Rhythmic Similarity -- a quick paper review
Presented by: Shi Yong
March 15, 2007
Music Technology, McGill University

Contents
- Introduction
- Three examples
  - J. Foote 2001, 2002
  - J. Paulus 2002
  - S. Dixon 2004
- Conclusion

Introduction
Music can be viewed from different aspects:
- Melody
- Harmony
- Rhythm
- Instrumentation
- Form
- Etc.
Judging whether two rhythms are similar or dissimilar is:
- Very easy for humans (perceptual)
- Not so easy for computers (requires quantitative measurement)

Introduction (continued)
If rhythmic similarity can be measured quantitatively by computer, what is it useful for?
- Automatic ranking in huge music collections
- Musical database searching
- Music context analysis
- Musical genre classification
- Etc.

Example I: Foote's work (2001, 2002)
Key points:
- A novel approach to characterizing the rhythm and tempo of music: the Beat Spectrum and the Beat Spectrogram
- Rhythmic similarity is measured by the distance between two beat spectra

Calculating the Beat Spectrum [Foote 2001, 2002]
- Extract feature vectors from the audio stream: frames 256 samples wide, 50% overlap, FFT and power spectrum
- Compute the cosine distance for all pairwise combinations of feature vectors

Similarity Matrix [Foote 2001, 2002]
- A matrix S is constructed from all pairwise distance values in a signal
- Visualization: whiter regions = higher similarity
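
The framing, power spectrum and pairwise comparison steps can be sketched as follows. This is only an illustration under stated assumptions, not Foote's implementation: the function names are hypothetical, a naive DFT stands in for the FFT, no window function is applied (the slides do not mention one), and the cosine measure is used as a similarity (larger = more similar) so that it matches the "whiter = higher similarity" visualization.

```python
import cmath
import math

def frames(signal, size=256, hop=128):
    """Split a signal into frames of `size` samples; hop = size / 2 gives 50% overlap."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def power_spectrum(frame):
    """Power spectrum via a naive DFT (stand-in for an FFT), bins 0..N/2."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def similarity_matrix(signal, size=256, hop=128):
    """S[i][j] = cosine similarity between the power spectra of frames i and j."""
    feats = [power_spectrum(f) for f in frames(signal, size, hop)]
    return [[cosine_similarity(u, v) for v in feats] for u in feats]

# Toy usage: a short sine tone, with small frames to keep the naive DFT fast.
tone = [math.sin(2 * math.pi * 5 * t / 64) for t in range(256)]
S_demo = similarity_matrix(tone, size=32, hop=16)
```

The matrix is symmetric with ones on the main diagonal, since every frame is identical to itself.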

Deriving the Beat Spectrum [Foote 2001, 2002]
- The beat spectrum B(l) is a measure of self-similarity as a function of the time lag l
- A simple estimate: sum S along the diagonals
- A more robust estimate comes from the autocorrelation of S
- Beat spectrogram = the beat spectrum computed over successive windows
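
A minimal sketch of the simple estimate, assuming it takes the form B(l) = sum over k of S(k, k + l), i.e. the sum of the l-th diagonal of S. The function name and the toy matrix are hypothetical, and the autocorrelation-based estimate is not shown.

```python
def beat_spectrum(S):
    """Simple estimate: B(l) = sum_k S[k][k + l], the sum of the l-th superdiagonal."""
    n = len(S)
    return [sum(S[k][k + l] for k in range(n - l)) for l in range(n)]

# Toy similarity matrix in which the frames repeat every 4 steps.
n = 16
S = [[1.0 if (i - j) % 4 == 0 else 0.0 for j in range(n)] for i in range(n)]
B = beat_spectrum(S)
print(B[:9])  # peaks at lags 0, 4 and 8
```

Because the sum is not normalized, longer lags have fewer terms and the peaks shrink with lag; dividing by the number of terms (n - l) is one possible remedy.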

Measuring Rhythmic Similarity [Foote 2001, 2002]
For two pieces, we have two beat spectra B1(l) and B2(l), where l is the lag time (discrete and finite).
Rhythmic similarity can be measured by the distance between the two L-dimensional vectors:
- Squared Euclidean distance
- Cosine distance
- Cosine distance of Fourier beat spectral coefficients
- Others
Experiments were designed to evaluate the performance of the different distance functions.
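
The three named distance functions might look like this. It is a sketch under assumptions: cosine distance is taken to be 1 minus the cosine similarity, and the "Fourier beat spectral coefficients" are taken to be plain DFT magnitudes of the beat spectrum; the papers may apply further processing (for example truncation or scaling) that the slides do not describe. All function names are hypothetical.

```python
import cmath
import math

def squared_euclidean(b1, b2):
    """Sum of squared differences between two beat spectra of equal length."""
    return sum((x - y) ** 2 for x, y in zip(b1, b2))

def cosine_distance(b1, b2):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros (a convention)."""
    dot = sum(x * y for x, y in zip(b1, b2))
    norm = math.sqrt(sum(x * x for x in b1)) * math.sqrt(sum(y * y for y in b2))
    return 1.0 - dot / norm if norm > 0 else 1.0

def fourier_magnitudes(b):
    """Magnitudes of the DFT coefficients of a beat spectrum, bins 0..L/2."""
    n = len(b)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(b)))
            for k in range(n // 2 + 1)]

def fourier_cosine_distance(b1, b2):
    """Cosine distance computed on DFT magnitudes instead of the raw beat spectra."""
    return cosine_distance(fourier_magnitudes(b1), fourier_magnitudes(b2))

# Toy usage: b2 is b1 scaled, b3 is b1 shifted by one lag.
b1 = [1.0, 0.0, 1.0, 0.0]
b2 = [2.0, 0.0, 2.0, 0.0]
b3 = [0.0, 1.0, 0.0, 1.0]
print(squared_euclidean(b1, b2), cosine_distance(b1, b2))
print(cosine_distance(b1, b3), fourier_cosine_distance(b1, b3))
```

In this toy case the cosine distance ignores overall scale (b1 vs. b2), and the Fourier-magnitude variant additionally ignores a circular shift (b1 vs. b3).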

Experiments [Foote 2001, 2002]
- One experiment shows that the Euclidean distance is also a measure of tempo difference.
- Another experiment shows that the cosine distance outperforms the squared Euclidean distance.

Example II: Paulus's work (2002)
A system that measures the similarity of two arbitrary rhythmic patterns:
- Preprocessing (optional)
- Rhythmic pattern segmentation
- Feature extraction
- Similarity measurement

Pattern Segmentation [Paulus 2002]
- The amplitude envelope is obtained from the audio stream by a chain of processing steps: normalization, filter bank, half-wave rectification, squaring, decimation, low-pass filtering, dynamic compression
- A periodicity analysis algorithm is then applied to the envelope signals to calculate the intermediary signal, which is used for musical meter estimation
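
A simplified, single-band sketch of the envelope chain. It is not the system described in the paper: the filter bank is omitted, the low-pass filter is a one-pole smoother applied before decimation (to limit aliasing), and the compression is a logarithmic curve. The function name and the parameters `decimation`, `alpha` and `mu` are assumptions made for illustration.

```python
import math

def amplitude_envelope(signal, decimation=4, alpha=0.1, mu=100.0):
    """Normalize, half-wave rectify, square, low-pass, decimate, then compress."""
    if not signal:
        return []
    peak = max(abs(x) for x in signal) or 1.0   # avoid dividing by zero on silence
    smoothed = []
    state = 0.0
    for x in signal:
        x = x / peak                     # normalization to [-1, 1]
        x = max(x, 0.0)                  # half-wave rectification
        x = x * x                        # squaring
        state += alpha * (x - state)     # one-pole low-pass filter
        smoothed.append(state)
    decimated = smoothed[::decimation]   # keep every `decimation`-th sample
    # Logarithmic dynamic compression, mapping [0, 1] onto [0, 1].
    return [math.log(1.0 + mu * e) / math.log(1.0 + mu) for e in decimated]

# Toy usage: a sine burst followed by silence.
burst = [math.sin(2 * math.pi * 0.1 * t) for t in range(200)] + [0.0] * 200
env = amplitude_envelope(burst)
```

The envelope is high during the burst and decays towards zero in the silent part.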

Pattern Segmentation (continued) [Paulus 2002]
Musical meter is estimated at three levels:
- Tatum: the shortest duration
- Tactus: the beat
- Musical measure
Tatum period: S(f) is calculated as the DFT of the intermediary signal; the tatum period is the inverse of the frequency corresponding to the maximum value of S(f).
The tactus period and the musical measure period are estimated from the intermediary signal, based on three probability distributions.
A list of pattern boundaries is then produced, and a single pattern can be isolated for feature extraction.
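
The tatum rule (period = inverse of the frequency at which the spectrum peaks) can be sketched as below. The sketch assumes that S(f) is the magnitude of the DFT of the intermediary signal and that this signal is sampled uniformly at `sample_rate` Hz; the function name, the mean removal and the exclusion of the DC bin are illustrative choices, not details taken from the paper.

```python
import cmath
import math

def tatum_period(intermediary, sample_rate):
    """Return 1 / f_max, where f_max maximizes |DFT| of the signal (DC excluded)."""
    n = len(intermediary)
    mean = sum(intermediary) / n
    centered = [x - mean for x in intermediary]   # remove the DC offset
    best_bin, best_mag = 1, -1.0
    for k in range(1, n // 2 + 1):
        mag = abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                      for t, x in enumerate(centered)))
        if mag > best_mag:
            best_bin, best_mag = k, mag
    peak_frequency = best_bin * sample_rate / n   # bin index -> Hz
    return 1.0 / peak_frequency

# Toy usage: a strong 4 Hz component plus a weaker 2 Hz component, sampled at 64 Hz.
toy = [math.cos(2 * math.pi * 4 * t / 64) + 0.3 * math.cos(2 * math.pi * 2 * t / 64)
       for t in range(64)]
period = tatum_period(toy, 64.0)  # seconds
```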

Feature Extraction [Paulus 2002]
Three features are extracted from each pattern, which is a series of overlapping frames:
- Loudness: mean square energy
- Brightness: spectral centroid (using a logarithmic frequency scale)
- MFCCs: 15 coefficients
To remove the dependence on absolute tone color, all features are normalized so that only the up/down deviations remain, giving a normalized feature matrix.
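
A rough per-frame sketch of two of the three features and of the normalization. The assumptions are loud ones: loudness and brightness are computed per frame, the centroid is taken on a log2 frequency axis and converted back to Hz, and the normalization is assumed to be zero-mean, unit-variance scaling of each feature over the pattern. MFCCs are left out for brevity, and all names are hypothetical.

```python
import cmath
import math

def loudness(frame):
    """Mean square energy of one frame."""
    return sum(x * x for x in frame) / len(frame)

def brightness(frame, sample_rate):
    """Spectral centroid on a logarithmic (log2) frequency axis, returned in Hz."""
    n = len(frame)
    weighted = total = 0.0
    for k in range(1, n // 2 + 1):          # skip DC: log2(0) is undefined
        power = abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(frame))) ** 2
        weighted += power * math.log2(k * sample_rate / n)
        total += power
    return 2.0 ** (weighted / total) if total > 0 else 0.0

def normalize(values):
    """Zero-mean, unit-variance scaling so that only relative deviations remain."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std if std > 0 else 0.0 for v in values]

# Toy usage: a frame holding exactly four cycles of a sine, sampled at 32 Hz.
frame = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
print(loudness(frame), brightness(frame, 32.0))
print(normalize([1.0, 2.0, 3.0]))
```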

Similarity Measurement [Paulus 2002]
- The feature vector sets of two rhythmic patterns, F1(i,n) and F2(i,n), are matched with the dynamic time warping (DTW) algorithm
- "Dynamic time warping is an algorithm for measuring similarity between two sequences which may vary in time or speed." (Wikipedia)
- The similarity measure is then derived from the DTW alignment
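
A textbook DTW sketch, for orientation only. It uses the Euclidean distance between feature vectors as the local cost and the basic step pattern (match, insertion, deletion); the local constraints and the normalization used in the paper are not given on the slides, so the returned accumulated cost should not be read as the paper's similarity measure. The function names are hypothetical.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def dtw_cost(seq_a, seq_b):
    """Accumulated cost of the cheapest warping path between two vector sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # D[i][j] = cheapest cost of aligning the first i items of A with the first j of B.
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = euclidean(seq_a[i - 1], seq_b[j - 1])
            D[i][j] = local + min(D[i - 1][j],      # A advances alone
                                  D[i][j - 1],      # B advances alone
                                  D[i - 1][j - 1])  # both advance
    return D[n][m]

# Toy usage: the same contour played at half speed aligns at zero cost.
fast = [[0.0], [1.0], [2.0], [1.0], [0.0]]
slow = [[0.0], [0.0], [1.0], [1.0], [2.0], [2.0], [1.0], [1.0], [0.0], [0.0]]
other = [[5.0]] * 5
print(dtw_cost(fast, slow), dtw_cost(fast, other))
```

This is why DTW suits patterns of different lengths: a pattern and its time-stretched version can still match exactly.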

Results [Paulus 2002]
Pattern segmentation:
- Tactus period: 67% correct
- Musical measure length: 77% correct
Similarity measurement:
- Test set: 14 rhythmic patterns performed with three different sound sets
- High similarity is assigned to the same rhythm performed with different drum sets

Example III: Dixon's work (2004)
Key points:
- A new way to characterize music by typical bar-length rhythmic patterns
- Applied to music genre classification (ballroom dance music in this paper)

Temporal Sequence [Dixon 2004]
[Figure: temporal sequences of a Cha Cha (above) and a Rumba (below)]
- The genres of ballroom dance music are distinguished by their temporal sequences
- For genre classification, the task is to extract the rhythmic patterns automatically from the audio signal and compare their similarity

Main Steps [Dixon 2004]
- First, amplitude envelopes are extracted for a number of bar-length patterns
- Then k-means clustering (k = 4) is applied, and the most prominent rhythmic pattern is taken from the largest cluster
- For similarity measurement, the distance between two patterns is the Euclidean distance
- For genre classification, the rhythmic pattern of each piece is used as a feature vector
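
The clustering step can be sketched with a small k-means routine. This is an illustration only: the deterministic initialization (evenly spaced bars), the fixed iteration count and all function names are assumptions, and the bar-length envelopes are taken to be already extracted and resampled to a common length.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two bar-length patterns."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(patterns, k=4, iterations=20):
    """Plain k-means; returns (centroids, labels). Assumes len(patterns) >= k."""
    step = max(1, len(patterns) // k)
    # Assumed initialization: evenly spaced bars serve as the starting centroids.
    centroids = [list(patterns[(i * step) % len(patterns)]) for i in range(k)]
    labels = [0] * len(patterns)
    for _ in range(iterations):
        # Assignment step: each bar joins its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in patterns]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(patterns, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

def typical_pattern(patterns, k=4):
    """Centroid of the largest cluster, taken as the piece's typical bar pattern."""
    centroids, labels = kmeans(patterns, k)
    largest = max(range(k), key=labels.count)
    return centroids[largest]

# Toy usage: fifteen 4-point "bars", nine of which share the same pattern.
main, a, b, c = [1, 0, 1, 0], [0, 1, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1]
bars = [main, main, main, a, a, main, b, b, main, c, c, main, main, main, main]
print(typical_pattern(bars))
```

Two typical patterns obtained this way can then be compared with the Euclidean distance, as the slide states.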

Pattern Examples [Dixon 2004]
[Figure: the amplitude envelopes of fifteen bars of a Cha Cha excerpt]
- Colored curves: the individual bars, colored by the cluster each belongs to
- Thick black curve: the largest cluster, defined as the typical pattern

Genre Classification [Dixon 2004]
The rhythmic pattern is used, alone or in conjunction with other feature sets, for genre classification (dance music).
Feature sets:
- Rhythmic pattern
- Features derived from the rhythmic pattern: mean amplitude, maximum amplitude, standard deviation, etc.
- Other automatically calculated features: features derived from the periodicity histogram, features derived from the inter-onset interval histograms, etc.
- Measured tempo
Classification rate:
- 50% with the rhythmic pattern alone (baseline is 16%)
- 84% when the other automatically calculated features are included
- 96% when the measured tempo is included

Conclusion
- Standard distance functions are used in Examples 1 and 3, while in Example 2 Paulus uses DTW to handle patterns of different lengths.
- Features extracted from both the frequency domain (Examples 1 and 2) and the time domain (Example 3) have been tested successfully.
- Pattern segmentation is not easy (not addressed in Example 1, but addressed in Examples 2 and 3).
- Tempo can be important for genre classification.

References
[Foote 2001] Foote, J., and S. Uchihashi. 2001. The Beat Spectrum: A New Approach to Rhythm Analysis. Proceedings of the International Conference on Multimedia and Expo.
[Foote 2002] Foote, J., M. Cooper, and U. Nam. 2002. Audio Retrieval by Rhythmic Similarity. Proceedings of the 3rd International Symposium on Musical Information Retrieval.
[Paulus 2002] Paulus, J., and A. Klapuri. 2002. Measuring the Similarity of Rhythmic Patterns. Proceedings of the 3rd International Symposium on Musical Information Retrieval.
[Dixon 2004] Dixon, S., F. Gouyon, and G. Widmer. 2004. Towards Characterisation of Music via Rhythmic Patterns. Proceedings of ISMIR, Barcelona, Spain.