Audio Content Analysis. Juan Pablo Bello EL9173 Selected Topics in Signal Processing: Audio Content Analysis NYU Poly

Juan Pablo Bello
Office: Room 626, 6th floor, 35 W 4th Street (ext. 85736)
Office Hours: Tuesdays 2-5pm
Email: jpbello@nyu.edu
Personal webpage: https://wp.nyu.edu/jpbello/
This course: http://www.nyu.edu/classes/bello/aca.html

Audio Content Analysis
Research, development and application of systems and techniques for the automatic analysis and understanding of sounds; in other words, the development of listening machines.
Grounded in the combined use of theories, concepts and methods from signal processing, computer science, acoustics (psychoacoustics, bioacoustics, acoustic ecology), cognition, speech science, and music.
Sounds: speech, music, environmental sound.
Other names for the field: Audio Signal Processing? Computational Auditory Scene Analysis? Computer Audition? Machine Listening?

For example...
[Figure: an audio signal and representations computed from it (spectrogram, novelty function, periodogram, histogram), with example content labels: "nature, bird, woodpecker"; "Orca whale, mating call"; "voice, male, stressed"; "speech, female, newscast"; "music, breakbeat, fast"; "Brit-pop, drum"]
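Two of the representations named above, the spectrogram and the novelty function, can be sketched in a few lines. This is a minimal illustration only, not the course's implementation: the synthetic signal, the frame and hop sizes, the choice of spectral flux as the novelty measure, and the function names `magnitude_spectrogram` and `spectral_flux` are all assumptions made for this example. It uses a naive DFT from the Python standard library to stay self-contained; real code would use an FFT routine.

```python
import cmath
import math

# All constants below are illustrative assumptions, not values from the course.
SAMPLE_RATE = 8000   # samples per second
FRAME_SIZE = 64      # samples per analysis frame
HOP = 32             # samples between successive frame starts
ONSET_SAMPLE = 512   # where the synthetic tone begins


def magnitude_spectrogram(signal, frame_size=FRAME_SIZE, hop=HOP):
    """Return a list of frames, each a list of DFT magnitudes for bins 0..frame_size//2."""
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_size)]
        # Naive O(N^2) DFT: fine for a toy example, too slow for real audio.
        mags = []
        for k in range(frame_size // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def spectral_flux(spectrogram):
    """Novelty function: summed positive magnitude change between consecutive frames."""
    novelty = [0.0]  # the first frame has no predecessor
    for prev, curr in zip(spectrogram, spectrogram[1:]):
        novelty.append(sum(max(c - p, 0.0) for p, c in zip(prev, curr)))
    return novelty


# Synthetic test signal: silence followed by a 1 kHz sine tone.
signal = [0.0] * ONSET_SAMPLE + [
    math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(512)
]

spec = magnitude_spectrogram(signal)
novelty = spectral_flux(spec)

# The frame with the largest novelty value is the onset candidate.
peak_frame = max(range(len(novelty)), key=novelty.__getitem__)
peak_time = peak_frame * HOP / SAMPLE_RATE  # start time of that frame, in seconds
```

In this toy signal the novelty function stays at zero during the silence and should peak in a frame that overlaps the start of the tone. Picking peaks of such a function is the basic idea behind onset detection.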

Applications (a few examples)

Resources
IEEE: http://www.icassp2014.org/home.html, http://www.waspaa.com/, http://www.asru2013.org/, http://www.signalprocessingsociety.org/technicalcommittees/list/audio-tc/, http://www.signalprocessingsociety.org/publications/periodicals/
ISCA: http://www.isca-speech.org/, http://www.interspeech2013.org/, http://www.journals.elsevier.com/speech-communication
AES: http://www.aes.org/events/conventions/, http://www.aes.org/events/conferences/, http://www.aes.org/journal/
ASA: http://acousticalsociety.org/meetings, http://asadl.org/jasa/
EURASIP: http://www.eurasip.org/index.php, http://www.eusipco2013.org/
ISMIR: http://www.ismir.net/, http://www.ismir.net/all-papers.html
Others: http://www.smc-conference.org/, http://www.dafx.de/

Calendar: Lectures
Week 1-2: Fundamentals and time-frequency representations
Week 3-4: Novelty: onset detection
Week 5-6: Periodicity: pitch detection and beat tracking
Week 7-8: Timbre: low-level features and spectral envelope
Week 9-10: Pitch distribution: chroma, chord and key recognition
Week 11-12: Sound classification

Assessment
Assignments: 40% (4 x 10% each). Announced in class and on the website, due a week after posting; penalties will apply to delays of up to 20 hours.
Mid-term exam: 30% (best 3 out of 4 questions), on 03.29
Projects: 30% (groups of 2)
  Proposal (04.12): 5%
  Final project + presentation (05.10): 25%
Class participation: extra points (attendance, questions, discussions, interest)

Calendar: Important dates (Spring 2017)
03.15 - Spring break
03.29 - Mid-term exam
04.12 - Project proposals
05.10 - Final project submission and presentation

Tutoring/Resources
TA: TBD
USE THE OFFICE HOURS (Tuesdays 2-5pm)
All relevant information is (or will be) published on the class website. Please read it carefully and keep checking for updates: http://www.nyu.edu/classes/bello/aca.html

Recommended Reading
Wang, D. and Brown, G. Computational Auditory Scene Analysis. John Wiley & Sons (2006)
Müller, M. Fundamentals of Music Processing: Audio, Analysis, Algorithms and Applications. Springer (2015)
Lerch, A. An Introduction to Audio Content Analysis. John Wiley & Sons (2012)
Gold, B., Morgan, N., and Ellis, D. Speech and Audio Signal Processing. 2nd edition, Wiley (2011)
Klapuri, A. and Davy, M. (Eds.) Signal Processing Methods for Music Transcription. Springer (2006)
Smith, J.O. Mathematics of the Discrete Fourier Transform (DFT). 2nd edition, W3K Publishing (2007)
Witten, I. and Frank, E. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann (2005)
Further reading will be recommended as the course progresses.

To do
INSTALL MATLAB ASAP!
Matlab documentation, tutorials, examples: www.mathworks.com/access/helpdesk/help/techdoc/matlab.html
Signal Processing Toolbox documentation, tutorials, examples: www.mathworks.com/access/helpdesk/help/toolbox/signal/
Matlab file exchange: www.mathworks.com/matlabcentral/fileexchange/loadcategory.do
START LOOKING FOR PROJECT TOPIC: visit the resource links, talk to current members of the MARL-MIR group (meets Tuesdays 10am, 6th floor conference room, 35 W 4th Street), and attend relevant seminars (most Thursdays @ 1pm).