Lecture 3: Audio Applications

Jose Perea, Michigan State University. Chris Tralie, Duke University 7/20/2016

Table of Contents
Audio Data / Biphonation
Music Data

Digital Audio Basics: Representation/Sampling
1D time series x[n], sampled at 44100 Hz
Shannon-Nyquist: to avoid aliasing, sample at a rate at least twice the highest frequency of a bandlimited signal
Very high sampling rate! A 1-second chunk lives in $\mathbb{R}^{44100}$; a 3-second chunk lives in $\mathbb{R}^{132300}$!
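The dimension counts on this slide are plain arithmetic, and a minimal sketch makes them easy to check. The function names `chunk_dimension` and `nyquist_frequency` are illustrative and not part of the lecture materials.

```python
def chunk_dimension(seconds, sample_rate=44100):
    """Number of samples in a chunk, i.e. the dimension of the space it lives in."""
    return int(round(seconds * sample_rate))


def nyquist_frequency(sample_rate=44100):
    """Highest frequency (Hz) representable without aliasing at this rate."""
    return sample_rate / 2


print(chunk_dimension(1))   # 44100
print(chunk_dimension(3))   # 132300
print(nyquist_frequency())  # 22050.0
```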

Biphonation
Two noncommensurate frequencies present at the same time in biological phenomena, e.g. $\cos(t) + \cos(\pi t)$
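To see what noncommensurate frequencies do to a signal, the sketch below compares the slide's example with a commensurate one (frequency ratio 2, chosen here only for contrast). Shifting by one period of the first component reproduces the commensurate signal but not the biphonic one. This checks a single shift numerically; it is an illustration, not a proof.

```python
import numpy as np


def two_tone(t, ratio=np.pi):
    """cos(t) + cos(ratio * t); ratio = pi gives the slide's example."""
    return np.cos(t) + np.cos(ratio * t)


t = np.linspace(0.0, 20.0, 2001)
T = 2.0 * np.pi  # period of the cos(t) component

# Commensurate case (ratio 2): the shifted signal matches the original.
periodic_err = np.max(np.abs(two_tone(t + T, ratio=2.0) - two_tone(t, ratio=2.0)))

# Noncommensurate case (ratio pi): the same shift leaves a clear mismatch.
quasi_err = np.max(np.abs(two_tone(t + T) - two_tone(t)))

print(periodic_err, quasi_err)
```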

Horse Whinnies
High Valence Negative
Briefer, Elodie F., et al. "Segregation of information about emotional arousal and valence in horse whinnies." Scientific Reports 4 (2015).

Horse Whinnies
High Valence Positive
We'll be focusing on the positive clip today...
Briefer, Elodie F., et al. "Segregation of information about emotional arousal and valence in horse whinnies." Scientific Reports 4 (2015).

Horse Whinny Audio
Interactively show audio file
Base frequencies on the order of 1000 Hz (window size?)
By default, only the 512 samples after the starting time are used (≈ 23 milliseconds of audio)
Have students find the steady-state region
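The window-size question comes down to how many periods of the base frequency fit in the window. The sketch below works that out for 512 samples. The sampling rate of the whinny clip is not stated on the slide, so both 22050 Hz and 44100 Hz are tried as assumptions; 512 samples last about 23 ms at 22050 Hz.

```python
def window_duration_ms(n_samples, sample_rate):
    """Duration of a window of n_samples, in milliseconds."""
    return 1000.0 * n_samples / sample_rate


def periods_in_window(n_samples, sample_rate, frequency_hz):
    """How many full cycles of a tone at frequency_hz fit in the window."""
    return frequency_hz * n_samples / sample_rate


# Assumed sampling rates; the clip's actual rate is not given on the slide.
for fs in (22050, 44100):
    print(fs,
          round(window_duration_ms(512, fs), 1),
          round(periods_in_window(512, fs, 1000.0), 1))
```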

Biphonation Finding Competition
Pan through the audio file to find the best region of biphonation, as measured by the persistence of the second most persistent class
May be corrupted due to noise
Will keep a running tab of the best score on the board!
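The competition score can be sketched as follows, assuming the classes come from a single persistence diagram given as (birth, death) pairs and computed elsewhere. The function name and the example diagram are hypothetical; the lecture's own module may compute the score differently.

```python
def second_largest_persistence(diagram):
    """Score a region by the lifetime (death - birth) of its second most
    persistent class. Returns 0.0 if fewer than two classes are present."""
    lifetimes = sorted((death - birth for birth, death in diagram), reverse=True)
    return lifetimes[1] if len(lifetimes) >= 2 else 0.0


# Hypothetical diagram: two long-lived classes plus one short-lived (noisy) class.
example_diagram = [(0.0, 1.0), (0.2, 0.9), (0.1, 0.3)]
score = second_largest_persistence(example_diagram)
print(score)
```

Scoring on the second class rather than the first rewards regions where two independent periodicities are both clearly present.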

Table of Contents
Audio Data / Biphonation
Music Data

Tempo / Repetition
Music is full of repetition
Tempo is determined by a train of musical pulses / beats in a periodic pattern
Foot tapping
Tempo is usually 50-200 beats per minute

Tempo / Repetition
Don't Stop Believin' (120 beats per minute)
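Converting a tempo to a beat period is a one-line calculation. The sketch below does it in seconds and in samples, assuming the 44100 Hz sampling rate mentioned earlier; the function names are illustrative.

```python
def beat_period_seconds(bpm):
    """Time between consecutive beats at the given tempo."""
    return 60.0 / bpm


def beat_period_samples(bpm, sample_rate=44100):
    """Number of audio samples between consecutive beats."""
    return int(round(sample_rate * 60.0 / bpm))


print(beat_period_seconds(120))   # 0.5
print(beat_period_samples(120))   # 22050
print(beat_period_seconds(50), beat_period_seconds(200))  # 1.2 0.3
```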

Raw Audio Delay Embedding
$\tau \cdot \mathrm{dim} = 22050$ (why?)
dt = 441
Taking first 3 seconds of audio
Run it! What happens?
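A sliding window (delay) embedding with these three parameters can be sketched as below. This is a minimal version restricted to integer `tau` and `dt`; the lecture's own module is not shown in the slides and may differ, for example by interpolating for non-integer steps.

```python
import numpy as np


def sliding_window(x, dim, tau, dt):
    """Return an array whose rows are the windows
    [x[s], x[s + tau], ..., x[s + (dim - 1) * tau]] for s = 0, dt, 2*dt, ...
    Assumes integer tau and dt (a simplification made for this sketch)."""
    x = np.asarray(x, dtype=float)
    extent = (dim - 1) * tau                  # offset of the last sample in a window
    n_windows = (len(x) - extent - 1) // dt + 1
    if n_windows < 1:
        raise ValueError("signal is shorter than one window")
    starts = np.arange(n_windows) * dt        # first sample of each window
    offsets = np.arange(dim) * tau            # sample offsets within a window
    return x[starts[:, None] + offsets[None, :]]


X = sliding_window(np.arange(10), dim=3, tau=2, dt=1)
print(X.shape)  # (6, 3)
print(X[0])     # [0. 2. 4.]
```

Reading the window length as 22050 samples with tau = 1 (an assumption), a 3-second clip of 132300 samples with dt = 441 gives 251 windows, each a point in a 22050-dimensional space.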

Audio Spectrograms: Definition
Aka the squared-magnitude Short-Time Fourier Transform. Given
A discrete signal x
A window size W (implicitly τ = 1)
A hop size H (like dt)
$$S[k, n] = \left| \mathrm{FFT}\big( x[nH],\ x[nH+1],\ \ldots,\ x[nH+W-1] \big)[k] \right|^2$$

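Audio Spectrograms: Definition
$$S[k, n] = \left| \mathrm{FFT}\big( x[nH],\ x[nH+1],\ \ldots,\ x[nH+W-1] \big)[k] \right|^2$$
[Figure: three consecutive windows (Window 1, Window 2, Window 3) over the signal, each shifted from the previous one by the hop]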
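The definition translates almost directly into NumPy. The sketch below uses a rectangular window, as in the formula, and `np.fft.rfft`, which keeps only the non-negative frequency bins k = 0, ..., W/2 (for a real signal the remaining bins are mirror images). The test tone is an assumption chosen so that it falls exactly on bin 8.

```python
import numpy as np


def spectrogram(x, W, H):
    """Squared-magnitude STFT: S[k, n] = |FFT(x[nH : nH + W])[k]|^2.
    Rows index frequency bin k, columns index frame n."""
    x = np.asarray(x, dtype=float)
    n_frames = (len(x) - W) // H + 1
    if n_frames < 1:
        raise ValueError("signal is shorter than one window")
    # Column n holds the samples x[nH], ..., x[nH + W - 1].
    frames = np.stack([x[n * H: n * H + W] for n in range(n_frames)], axis=1)
    return np.abs(np.fft.rfft(frames, axis=0)) ** 2


# Test tone with exactly 8 cycles per 64-sample window.
n = np.arange(256)
x = np.cos(2 * np.pi * 8 * n / 64)
S = spectrogram(x, W=64, H=32)
print(S.shape)              # (33, 7)
print(S.argmax(axis=0))     # bin 8 in every frame
```

In practice a tapered window (e.g. Hann) is often applied to each frame to reduce spectral leakage; it is left out here to match the formula as written.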

Audio Spectrograms
Look at Journey example, show percussion

Audio Novelty Functions
$$f[n] = \sum_{k=0}^{W-1} s\big( \log S[k, n+1] - \log S[k, n] \big)$$
where
$$s(x) = \begin{cases} x & x > 0 \\ 0 & \text{otherwise} \end{cases}$$
Indicator function for audio onsets
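A novelty function of this kind can be sketched from a spectrogram array with rows as frequency bins and columns as frames. The sketch differences consecutive frames along the time index n, half-wave rectifies, and sums over frequency, which is the usual spectral-flux form. The small constant `eps` is an assumption added here to avoid taking the log of zero.

```python
import numpy as np


def novelty(S, eps=1e-12):
    """Half-wave rectified log-spectral difference, summed over frequency.
    S has shape (n_bins, n_frames); the result has n_frames - 1 entries."""
    log_S = np.log(np.asarray(S, dtype=float) + eps)
    diff = log_S[:, 1:] - log_S[:, :-1]         # change from frame n to n + 1
    return np.maximum(diff, 0.0).sum(axis=0)    # keep only energy increases


# Toy spectrogram: flat energy, then a jump in every bin at frame 2.
S = np.ones((3, 4))
S[:, 2:] = np.exp(2.0)
f = novelty(S)
print(f)  # single peak at index 1, where the jump occurs
```

Only increases in energy contribute, so the function peaks at onsets and stays at zero when a note decays.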

Audio Novelty Functions
Show module, show Journey example
By what factor have we reduced the sampling rate?
Show synchronized audio
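Since a frame-based novelty function has one value per hop, its sampling rate is the audio rate divided by the hop size. The sketch below uses a hop of 441 samples purely as a hypothetical value; the hop used in the lecture module is not given on the slide.

```python
def novelty_rate(sample_rate, hop):
    """Samples per second of a function computed once per hop."""
    return sample_rate / hop


audio_rate = 44100
hop = 441  # hypothetical hop size, for illustration only
rate = novelty_rate(audio_rate, hop)
print(rate)               # 100.0
print(audio_rate / rate)  # 441.0, i.e. the reduction factor equals the hop
```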

Audio Novelty Functions
Lots of variants, e.g. in [1]
[1] Ellis, Daniel P. W. "Beat tracking by dynamic programming." Journal of New Music Research 36.1 (2007): 51-60.
[2] Gouyon, Fabien, Simon Dixon, and Gerhard Widmer. "Evaluating low-level features for beat classification and tracking." 2007 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '07). Vol. 4. IEEE, 2007.
[3] Boeck, Sebastian, and Gerhard Widmer. "Maximum filter vibrato suppression for onset detection." Proceedings of the 16th International Conference on Digital Audio Effects (DAFx-13), Maynooth, Ireland. 2013.

Music vs. Speech
Show module
A sliding window of sliding windows!
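One way to read "a sliding window of sliding windows" is sketched below: an inner sliding window embedding of a 1D feature sequence (such as a novelty function), then an outer window that groups consecutive embedded points into blocks, so each block is a point cloud that can be analyzed on its own. The function name and all parameter values are hypothetical; the lecture module may organize this differently.

```python
import numpy as np


def nested_windows(f, dim, block_len, block_hop):
    """Inner step: embed f with windows of length dim (step 1).
    Outer step: slide a block of block_len embedded points, advancing
    by block_hop. Returns shape (n_blocks, block_len, dim)."""
    f = np.asarray(f, dtype=float)
    inner = np.stack([f[i:i + dim] for i in range(len(f) - dim + 1)])
    starts = range(0, inner.shape[0] - block_len + 1, block_hop)
    return np.stack([inner[s:s + block_len] for s in starts])


blocks = nested_windows(np.arange(20), dim=4, block_len=5, block_hop=3)
print(blocks.shape)   # (5, 5, 4)
print(blocks[1][0])   # [3. 4. 5. 6.]
```

Each block could then be scored separately, for example by how periodic its point cloud looks, to compare rhythmic music against speech over time.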

Conclusions
Quasiperiodicity (biphonation) is present in nature
Due to noise/artifacts, it is sometimes necessary to search around
Summary features are often better than raw data
After proper preprocessing, TDA on sliding window embeddings can pick up on rhythmic periodicities in music