Signal Processing Algorithms for Music, Marine Mammals and Speech
1 University of Crete, Computer Science Dept., Multimedia Informatics Lab. AUTH, June 23rd, 2008
3 Rhythm similarity based on Dynamic Periodicity Warping
In collaboration with Andre Holzapfel. Presented at ICASSP 2008, Las Vegas [1].
4 What is rhythm similarity used for?
Organizing a huge collection of songs according to their rhythm.
Helping ethnomusicologists categorize field recordings from a given country and reveal their musical structure.
5 Approaches to the problem
Beat spectra, cosine measure (J. Foote et al., 2002) [2]
Tempo-based spectra (G. Peeters, 2005) [3]
Tactus-based patterns (J. Paulus et al., 2002) [4]
We suggest the use of continuous periodicity spectra and a warping strategy to cope with large variations in tempo.
7 Periodicity Spectra
Computation of the onset strength signal, p(t) (D. Ellis, MIREX 2006 beat tracking contest; see the MIREX 2006 Beat Tracking Results).
Modeling of p(t):
\[ p(t) = \sum_{i=1}^{N} e_i(t) \ast \sum_{k \in K_i} \delta(t - kT) \]
Periodicity spectra:
\[ P(f) = \frac{1}{T} \sum_{i=1}^{N} E_i(f) \sum_{k \in K_i} \delta\left(f - \frac{k}{T}\right) \]
where f < 1000 bpm (16.7 Hz).
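As a rough illustration of a periodicity spectrum, the Python sketch below takes the magnitude spectrum of a toy onset strength signal and keeps only frequencies below 1000 bpm. The toy signal, its 100 Hz sampling rate and the function name `periodicity_spectrum` are assumptions made for this example; the slides use the onset strength signal of D. Ellis's beat tracker, which is not reproduced here.

```python
import numpy as np

def periodicity_spectrum(p, fs, max_bpm=1000.0):
    """Magnitude spectrum of an onset strength signal p, restricted to f < max_bpm."""
    p = np.asarray(p, dtype=float)
    p = p - p.mean()                                  # drop the DC component
    spectrum = np.abs(np.fft.rfft(p))
    freqs_bpm = np.fft.rfftfreq(len(p), d=1.0 / fs) * 60.0   # Hz -> bpm
    keep = freqs_bpm < max_bpm
    return freqs_bpm[keep], spectrum[keep]

# Toy onset strength signal (assumption): a smooth pulse every 0.5 s, 8 s long.
fs = 100.0                                            # assumed sampling rate in Hz
t = np.arange(int(8 * fs)) / fs
p = np.maximum(0.0, np.cos(2 * np.pi * 2.0 * t)) ** 4

bpm, P = periodicity_spectrum(p, fs)
strongest = bpm[np.argmax(P)]
print(f"strongest periodicity: {strongest:.1f} bpm")
```

A pulse every 0.5 s corresponds to 120 bpm, so the spectrum peaks there, with weaker peaks at its multiples.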
10 Example of periodicity spectra
[Figure: two periodicity spectra of Siganos dances, plotted against bpm. The upper panel is a faster example of the dance shown in the lower panel. Window length is 8 s.]
11 Rhythm similarity based on Dynamic Periodicity Warping (DPW)
[Block diagram. Inputs: P_1(f), P_2(f). Blocks and signal labels: NORM, SIM, S, ρ, REFLINE, DP, w_DPW, PROJ, Σ. Output: d_DPW.]
12 Example of DPW computation
13 Databases and baseline distances
Databases:
D1: 698 songs from eight classes of ballroom dances
D2: 90 songs from six classes of Cretan dances
Baseline distances:
Cosine distance (inner product)
Euclidean distance
Cost of warping, d_Cost (J. Paulus et al., 2002) [4]
Cosine distance after warping, d_CosPost
Our measure: d_DPW
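The two simplest baselines can be sketched in a few lines of Python. The example computes cosine and Euclidean distances between toy "periodicity spectra" and classifies a query by a k-nearest-neighbour vote. The data, the class names and the helper names are made up for illustration, and defining cosine distance as one minus the normalized inner product is an assumption (the slide only says "inner product").

```python
import numpy as np

def cosine_distance(a, b):
    """1 - normalized inner product (one common definition; assumed here)."""
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def euclidean_distance(a, b):
    return np.linalg.norm(a - b)

def knn_predict(query, spectra, labels, k, dist):
    """Majority vote among the k training spectra closest to the query."""
    d = np.array([dist(query, s) for s in spectra])
    nearest = np.argsort(d)[:k]
    values, counts = np.unique(labels[nearest], return_counts=True)
    return values[np.argmax(counts)]

# Toy "periodicity spectra" for two made-up classes
spectra = np.array([[1.0, 0.1, 0.0],
                    [0.9, 0.2, 0.1],
                    [0.1, 1.0, 0.2],
                    [0.0, 0.9, 0.3]])
labels = np.array(["A", "A", "B", "B"])
query = np.array([0.8, 0.3, 0.0])

pred_cos = knn_predict(query, spectra, labels, k=3, dist=cosine_distance)
pred_euc = knn_predict(query, spectra, labels, k=3, dist=euclidean_distance)
print(pred_cos, pred_euc)
```

Neither distance compensates for tempo differences between two spectra, which is the gap the warping-based measures are meant to address.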
22 More on the Cretan dances database (D2)
Table: Tempi of D2 and listeners' accuracy
Columns: Dance, Tempo range, Listeners' acc. (%)
Dances: Kalamatianos, Siganos, Maleviziotis, Pentozalis, Sousta, Chaniotis
Mean listeners' acc.: 75.6%
23 Results on D1: Ballroom dances
Table: Classification accuracies (%) on D1
Distance     wkNN           kNN
Cosine       85.5 (k=7)     84.5 (k=3)
Euclidean    83.8 (k=6)     82.7 (k=3)
d_Cost       72.4 (k=14)    70.7 (k=7)
d_CosPost    70.7 (k=32)    69.2 (k=17)
d_DPW        82.1 (k=11)    80.9 (k=20)
10 repetitions of 10-fold stratified cross-validation
24 Results on D2: Cretan dances
Table: Classification accuracies (%) on D2
Distance     wkNN           kNN
Cosine       53.8 (k=1)     53.8 (k=1)
Euclidean    48.9 (k=1)     48.8 (k=1)
d_Cost       51.8 (k=18)    48.5 (k=8)
d_CosPost    51.1 (k=19)    48.7 (k=12)
d_DPW        69.0 (k=4)     64.4 (k=5)
10 repetitions of 10-fold stratified cross-validation
25 Click detection using the Teager-Kaiser operator and phase spectra
In collaboration with Varvara Kandia.
Presented at: ECS 2008 (The Netherlands); 3rd Workshop on Detection and Classification of Mammals, Boston; 2nd Workshop on Detection and Classification of Mammals, Monaco 2006 [5][6][7]
26 Why do it?
Localization and tracking with passive acoustics
Study of animal behavior
Abundance estimation
Correlations with physiology (size of animals, sound production mechanism)
27 Examples of clicks from sperm whales
[Figure: regular clicks and creak clicks; amplitude versus time in ms.]
28 Examples of clicks from beaked whales
29 Approaches and software for click detection
Rainbow Click (D. Gillespie, 1997) [8]
Moby Click (O. Jäke, 1996) [9]
Ishmael (D. Mellinger, 2001) [10]
30 Teager-Kaiser energy operator [5][6]
Definition for a discrete-time signal:
\[ \Psi[s(n)] = s^2(n) - s(n+1)\,s(n-1) \]
For a signal with three components, interference x(n), transient y(n) and noise u(n), so that s(n) = x(n) + y(n) + u(n):
\[ \Psi[s(n)] = \Psi[x(n)] + \Psi[y(n)] + \Psi[u(n)] + T(n) \]
where T(n) collects the cross terms. We may show that:
\[ \Psi[s(n)] \approx \Psi[y(n)] + w(n) \]
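The operator is short enough to try directly. The Python sketch below applies it to a toy signal made of slow sinusoidal interference plus one short transient; the signal parameters and the function name `teager_kaiser` are assumptions for illustration, not the recordings or code behind the slides. For a pure sinusoid A cos(Ωn + φ) the operator returns the constant A² sin²(Ω), so low-frequency interference contributes very little while a fast transient stands out.

```python
import numpy as np

def teager_kaiser(s):
    """Psi[s(n)] = s(n)^2 - s(n+1) s(n-1), for n = 1 .. len(s) - 2."""
    s = np.asarray(s, dtype=float)
    return s[1:-1] ** 2 - s[2:] * s[:-2]

# Toy signal (assumption): slow interference plus one 8-sample "click"
n = np.arange(2000)
interference = np.sin(2 * np.pi * 0.002 * n)
click = np.zeros_like(interference)
click[1000:1008] = 0.5 * np.sin(2 * np.pi * 0.25 * np.arange(8))
s = interference + click

psi = teager_kaiser(s)
peak = int(np.argmax(psi)) + 1      # +1 because psi[0] corresponds to n = 1
print("largest TK output at sample", peak)
```

Although the click has half the amplitude of the interference, the largest operator output falls inside the click.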
32 Synthetic example
[Figure: panels (a) and (b); amplitude versus time (ms).]
33 Applied on clicks
From sperm whales, regular clicks: (a) raw file, (b) after TK.
[Figure: amplitude versus time (ms).]
34 Applied on clicks
From sperm whales, creak clicks: (a) raw file, (b) after TK.
[Figure: amplitude versus time (ms).]
35 Comparison with Rainbow Click
\[ \text{Det. score} = \frac{\text{correctly detected hand-labeled clicks}}{\text{total hand-labeled clicks}} \times 100 \]
Table: Percentage (%) of correctly identified clicks per file. Tolerance of 2 ms.
Column headers: File name; TK: clicks, score (%); RB: clicks, score (%); clicks
F1: 266 (0)
F2: 944 (549)
F3: 689 (414)
F4: 529 (242)
F5: 435 (155)
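A minimal Python sketch of the detection score with a time tolerance follows. The click times are made up, the function name `detection_score` is hypothetical, and the matching rule (a hand label counts as detected if any detection lies within the tolerance, without enforcing one-to-one matching) is a simplifying assumption rather than the procedure used for the table.

```python
import numpy as np

def detection_score(detected_ms, labeled_ms, tol_ms=2.0):
    """Percentage of hand-labeled clicks that have a detection within tol_ms."""
    detected = np.sort(np.asarray(detected_ms, dtype=float))
    hits = 0
    for t in labeled_ms:
        i = np.searchsorted(detected, t)
        neighbors = detected[max(i - 1, 0):i + 1]   # detections on either side of t
        if neighbors.size and np.min(np.abs(neighbors - t)) <= tol_ms:
            hits += 1
    return 100.0 * hits / len(labeled_ms)

labeled = [10.0, 250.0, 500.0, 760.0]      # made-up hand labels (ms)
detected = [11.5, 249.0, 503.0, 900.0]     # made-up detector output (ms)
score = detection_score(detected, labeled)
print(f"Det. score: {score:.1f}%")
```

Sweeping `tol_ms` over a range of values gives a detection-rate-versus-tolerance curve.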
36 In terms of ROC curves
[Figure: approximate ROC; detection rate (%) versus tolerance (ms), for TK and RB.]
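37 Phase spectrum [7]
Group delay:
\[ \tau(\omega) = -\frac{d\phi(\omega)}{d\omega} \]
or
\[ \tau(\omega) = \frac{X_R(\omega) Y_R(\omega) + X_I(\omega) Y_I(\omega)}{|X(\omega)|^2} \]
where:
\[ X(\omega) = \mathcal{F}(x[n]) = X_R(\omega) + jX_I(\omega) \]
\[ Y(\omega) = \mathcal{F}(n\,x[n]) = Y_R(\omega) + jY_I(\omega) \]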
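The second form of the group delay avoids phase unwrapping and maps directly onto two FFTs. The Python sketch below is one way to compute it; the function name `group_delay`, the small floor on |X(ω)|² and the test signal are assumptions for illustration.

```python
import numpy as np

def group_delay(x, n_fft=None):
    """tau(w) = (X_R Y_R + X_I Y_I) / |X|^2, with Y the DFT of n * x[n]."""
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x))
    X = np.fft.rfft(x, n_fft)
    Y = np.fft.rfft(n * x, n_fft)
    power = np.maximum(np.abs(X) ** 2, 1e-12)   # assumed guard against division by zero
    return (X.real * Y.real + X.imag * Y.imag) / power

# A unit impulse delayed by 5 samples has a group delay of 5 at every frequency
x = np.zeros(64)
x[5] = 1.0
tau = group_delay(x)
print(tau[:4])
```

For an impulse-like event in an analysis frame, the group delay therefore encodes where in the frame the event sits.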
38 Motivation
40 Application on the beaked whales example
41 Zoom in on an area of clicks
After applying an appropriate modulation and low-pass filtering to the original recordings. Note: triangles denote hand labels.
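42 Results on beaked and sperm whales
Table columns: Species, clicks, Det (%), Corr (%), MAE (ms); each entry is given as raw data / with TK.
Beaked whales: MAE with TK 0.9 ms
Sperm whales: MAE with TK 0.97 ms
\[ \text{Det} = \frac{\text{number of clicks correctly detected}}{\text{Total}} \times 100 \]
\[ \text{Corr} = \frac{\text{Total} - \text{Deleted} - \text{Inserted}}{\text{Total}} \times 100 \]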
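The two percentages can be computed from simple counts, as in the Python sketch below. All numbers are made up, the function names are hypothetical, and reading MAE as the mean absolute difference between matched detected and hand-labeled click times is an assumption, since the slide does not define it.

```python
import numpy as np

def det_and_corr(total, correct, deleted, inserted):
    """Det and Corr percentages following the two formulas on the slide."""
    det = 100.0 * correct / total
    corr = 100.0 * (total - deleted - inserted) / total
    return det, corr

def mean_absolute_error(detected_ms, labeled_ms):
    """Assumed reading of MAE: mean |detected - labeled| over matched clicks."""
    d = np.asarray(detected_ms, dtype=float)
    l = np.asarray(labeled_ms, dtype=float)
    return float(np.mean(np.abs(d - l)))

# Made-up counts: 200 labeled clicks, 180 found, 20 missed, 10 false alarms
det, corr = det_and_corr(total=200, correct=180, deleted=20, inserted=10)
mae = mean_absolute_error([10.4, 250.0, 499.1], [10.0, 250.5, 500.0])
print(det, corr, round(mae, 2))
```

Corr is lower than Det whenever the detector inserts clicks, because insertions are penalized in Corr but not in Det.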
43 A Mathematical Model for Accurate Measurement of Jitter
In collaboration with Miltiadis Vasilakis. Presented at MAVEBA 2007, Florence [11].
44 Jitter
Definition: jitter is the perturbation of the glottal source signal that occurs during vowel phonation and affects the glottal pitch period.
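45 Definitions
Let u(n), n = 1, ..., N, be the pitch period sequence.
Local:
\[ \frac{\frac{1}{N-1} \sum_{n=1}^{N-1} |u(n+1) - u(n)|}{\frac{1}{N} \sum_{n=1}^{N} u(n)} \]
Absolute:
\[ \frac{1}{N-1} \sum_{n=1}^{N-1} |u(n+1) - u(n)| \]
Relative Average Perturbation:
\[ \frac{\frac{1}{N-2} \sum_{n=1}^{N-2} \left| \frac{2u(n+1) - u(n) - u(n+2)}{3} \right|}{\frac{1}{N} \sum_{n=1}^{N} u(n)} \]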
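The three measures translate almost line by line into NumPy. The sketch below is illustrative only: the pitch period values are made up and the function name `jitter_measures` is hypothetical.

```python
import numpy as np

def jitter_measures(u):
    """Local, absolute and RAP jitter of a pitch period sequence u(1), ..., u(N)."""
    u = np.asarray(u, dtype=float)
    mean_period = u.mean()
    absolute = np.mean(np.abs(np.diff(u)))          # same units as u
    local = absolute / mean_period                  # dimensionless ratio
    # u[1:-1] is u(n+1), u[:-2] is u(n), u[2:] is u(n+2), for n = 1 .. N-2
    rap = np.mean(np.abs((2 * u[1:-1] - u[:-2] - u[2:]) / 3)) / mean_period
    return local, absolute, rap

# Made-up pitch periods in ms, alternating around 8 ms
u = np.array([7.9, 8.1, 7.9, 8.1, 7.9, 8.1])
local, absolute, rap = jitter_measures(u)
print(local, absolute, rap)
```

A perfectly periodic sequence gives zero for all three measures, while the alternating sequence above gives an absolute jitter of about 0.2 ms.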
48 Our approach
[Figure: impulse train of amplitude 1 whose successive impulses are spaced alternately by P − ε and P + ε; amplitude versus time (samples).]
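49 In mathematical terms
We model the glottal impulse train as:
\[ p[n] = \sum_{k=-\infty}^{+\infty} \delta[n - (2k)P] + \sum_{k=-\infty}^{+\infty} \delta[n + \epsilon - (2k+1)P] \]
We may show that its power spectrum is then:
\[ |P(\omega)|^2 = H(\epsilon, \omega) + S(\epsilon, \omega) \]
with H the harmonic part and S the subharmonic part.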
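The effect of ε on the spectrum can be checked numerically with a finite version of this impulse train. In the Python sketch below, the period P = 50 samples, the 8 double periods, the integer values of ε and the function name are all assumptions chosen so that the frequencies of interest fall exactly on DFT bins; it is a sanity check of the model, not the estimation method of [11].

```python
import numpy as np

def jittered_impulse_train(P, eps, cycles):
    """Unit impulses at 2kP and (2k+1)P - eps, for k = 0 .. cycles - 1."""
    p = np.zeros(2 * P * cycles)
    k = np.arange(cycles)
    p[2 * k * P] = 1.0
    p[(2 * k + 1) * P - eps] = 1.0
    return p

P, cycles = 50, 8                  # assumed period (samples) and number of double periods
harmonic_bin = 2 * cycles          # DFT bin at w = 2*pi/P
subharmonic_bin = cycles           # DFT bin at w = pi/P

power = {}
for eps in (0, 1, 2):
    spectrum = np.abs(np.fft.rfft(jittered_impulse_train(P, eps, cycles))) ** 2
    power[eps] = (spectrum[harmonic_bin], spectrum[subharmonic_bin])
    print(eps, power[eps])
```

With ε = 0 the train has period P and the bin at ω = π/P carries essentially no power; as ε grows, power appears at that subharmonic frequency and the harmonic at ω = 2π/P weakens slightly.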
51 Examples of power spectrum
On a synthetic glottal signal.
[Figure: power (dB) versus radian frequency (ω); curves H(0, ω), S(0, ω), H(1, ω), S(1, ω), H(2, ω), S(2, ω).]
52 Examples of power spectrum
Synthetic signal (fs = 48 kHz, ε = 5).
[Figure, three panels of power (dB) versus frequency (kHz): the power spectrum |P(ω)|² of a single frame; the harmonic and subharmonic parts H(ε, ω) and S(ε, ω) of the power spectrum, with circles indicating the crossings between them; a closer look at the first crossing, marking the accepted crossing and the rejected crossings.]
53 Experiments
Goal: discriminate pathological from normal voices, based on jitter.
Database: Massachusetts Eye and Ear Infirmary (MEEI) [12]. Sustained vowels; 53 subjects with normal voice, 657 subjects with a wide variety of pathological conditions.
Jitter estimation methods:
Praat, 2007 (P. Boersma and D. Weenink) [13]
Multi-Dimensional Voice Program (MDVP) (Kay-Pentax Elemetrics, 2007) [14]
Our approach [11]
55 Results in ROC curves
[Figure: true positive rate versus false positive rate for MDVP Jita; the proposed method (fixed frame, sequence average); the proposed method (variable frame, sequence average); and Praat jitter (local, absolute).]
56 References
[1] A. Holzapfel and Y. Stylianou. Rhythmic similarity of music based on dynamic periodicity warping. In IEEE ICASSP, 2008.
[2] Jonathan Foote, Matthew D. Cooper, and Unjung Nam. Audio retrieval by rhythmic similarity. In Proc. of ISMIR, 3rd International Conference on Music Information Retrieval, 2002.
[3] Geoffroy Peeters. Rhythm classification using spectral rhythm patterns. In Proc. of ISMIR, 6th International Conference on Music Information Retrieval, 2005.
[4] Jouni Paulus and A. P. Klapuri. Measuring the similarity of rhythmic patterns. In Proc. of ISMIR, 3rd International Conference on Music Information Retrieval, 2002.
[5] V. Kandia and Y. Stylianou. Detection of creak clicks of sperm whales in low SNR conditions. In CD Proc. IEEE Oceans, Brest, France.
[6] V. Kandia and Y. Stylianou. Detection of sperm whale clicks based on the Teager-Kaiser energy operator. Applied Acoustics, 67(11-12).
[7] V. Kandia and Y. Stylianou. Detection of clicks based on group delay. Accepted in Canadian Acoustics.
[8] D. Gillespie. An acoustic survey for sperm whales in the Southern Ocean sanctuary conducted from the R/V Aurora Australis. Rep. Int. Whal. Comm., 47, 1997.
[9] O. Jäke. Acoustic censusing of sperm whales at Kaikoura, New Zealand: An inexpensive method to count clicks and whales automatically. Master's thesis, University of Otago, Dunedin, New Zealand, 1996.
[10] D. K. Mellinger. Ishmael 1.0 User's Guide. NOAA Technical Memorandum OAR PMEL-120, NOAA/PMEL/OERD, 2115 SE OSU Drive, Newport, OR, 2001.
[11] M. Vasilakis and Y. Stylianou. A mathematical model for accurate measurement of jitter. In MAVEBA 2007, Florence, Italy, 2007.
[12] Kay Elemetrics. Disordered Voice Database (Version 1.03).
[13] Paul Boersma and David Weenink. Praat: doing phonetics by computer [Computer program].
[14] Kay Elemetrics. Multi-Dimensional Voice Program (MDVP) [Computer program], 2007.
More informationMUSC 316 Sound & Digital Audio Basics Worksheet
MUSC 316 Sound & Digital Audio Basics Worksheet updated September 2, 2011 Name: An Aggie does not lie, cheat, or steal, or tolerate those who do. By submitting responses for this test you verify, on your
More informationSignal Processing for Speech Applications - Part 2-1. Signal Processing For Speech Applications - Part 2
Signal Processing for Speech Applications - Part 2-1 Signal Processing For Speech Applications - Part 2 May 14, 2013 Signal Processing for Speech Applications - Part 2-2 References Huang et al., Chapter
More informationSOUND SOURCE RECOGNITION AND MODELING
SOUND SOURCE RECOGNITION AND MODELING CASA seminar, summer 2000 Antti Eronen antti.eronen@tut.fi Contents: Basics of human sound source recognition Timbre Voice recognition Recognition of environmental
More informationSub-band Envelope Approach to Obtain Instants of Significant Excitation in Speech
Sub-band Envelope Approach to Obtain Instants of Significant Excitation in Speech Vikram Ramesh Lakkavalli, K V Vijay Girish, A G Ramakrishnan Medical Intelligence and Language Engineering (MILE) Laboratory
More informationAUTOMATIC CHORD TRANSCRIPTION WITH CONCURRENT RECOGNITION OF CHORD SYMBOLS AND BOUNDARIES
AUTOMATIC CHORD TRANSCRIPTION WITH CONCURRENT RECOGNITION OF CHORD SYMBOLS AND BOUNDARIES Takuya Yoshioka, Tetsuro Kitahara, Kazunori Komatani, Tetsuya Ogata, and Hiroshi G. Okuno Graduate School of Informatics,
More informationSPEECH AND SPECTRAL ANALYSIS
SPEECH AND SPECTRAL ANALYSIS 1 Sound waves: production in general: acoustic interference vibration (carried by some propagation medium) variations in air pressure speech: actions of the articulatory organs
More informationDimension Reduction of the Modulation Spectrogram for Speaker Verification
Dimension Reduction of the Modulation Spectrogram for Speaker Verification Tomi Kinnunen Speech and Image Processing Unit Department of Computer Science University of Joensuu, Finland Kong Aik Lee and
More informationDeep learning architectures for music audio classification: a personal (re)view
Deep learning architectures for music audio classification: a personal (re)view Jordi Pons jordipons.me @jordiponsdotme Music Technology Group Universitat Pompeu Fabra, Barcelona Acronyms MLP: multi layer
More informationAudio Similarity. Mark Zadel MUMT 611 March 8, Audio Similarity p.1/23
Audio Similarity Mark Zadel MUMT 611 March 8, 2004 Audio Similarity p.1/23 Overview MFCCs Foote Content-Based Retrieval of Music and Audio (1997) Logan, Salomon A Music Similarity Function Based On Signal
More informationSGN Audio and Speech Processing
SGN 14006 Audio and Speech Processing Introduction 1 Course goals Introduction 2! Learn basics of audio signal processing Basic operations and their underlying ideas and principles Give basic skills although
More informationUniversity of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005
University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005 Lecture 5 Slides Jan 26 th, 2005 Outline of Today s Lecture Announcements Filter-bank analysis
More informationTranscription of Piano Music
Transcription of Piano Music Rudolf BRISUDA Slovak University of Technology in Bratislava Faculty of Informatics and Information Technologies Ilkovičova 2, 842 16 Bratislava, Slovakia xbrisuda@is.stuba.sk
More informationPerturbation analysis using a moving window for disordered voices JiYeoun Lee, Seong Hee Choi
Perturbation analysis using a moving window for disordered voices JiYeoun Lee, Seong Hee Choi Abstract Voices from patients with voice disordered tend to be less periodic and contain larger perturbations.
More informationEE 422G - Signals and Systems Laboratory
EE 422G - Signals and Systems Laboratory Lab 3 FIR Filters Written by Kevin D. Donohue Department of Electrical and Computer Engineering University of Kentucky Lexington, KY 40506 September 19, 2015 Objectives:
More informationPitch Detection Algorithms
OpenStax-CNX module: m11714 1 Pitch Detection Algorithms Gareth Middleton This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 1.0 Abstract Two algorithms to
More informationSpeech Enhancement in Presence of Noise using Spectral Subtraction and Wiener Filter
Speech Enhancement in Presence of Noise using Spectral Subtraction and Wiener Filter 1 Gupteswar Sahu, 2 D. Arun Kumar, 3 M. Bala Krishna and 4 Jami Venkata Suman Assistant Professor, Department of ECE,
More informationApplication of Affine Projection Algorithm in Adaptive Noise Cancellation
ISSN: 78-8 Vol. 3 Issue, January - Application of Affine Projection Algorithm in Adaptive Noise Cancellation Rajul Goyal Dr. Girish Parmar Pankaj Shukla EC Deptt.,DTE Jodhpur EC Deptt., RTU Kota EC Deptt.,
More informationRobust Voice Activity Detection Based on Discrete Wavelet. Transform
Robust Voice Activity Detection Based on Discrete Wavelet Transform Kun-Ching Wang Department of Information Technology & Communication Shin Chien University kunching@mail.kh.usc.edu.tw Abstract This paper
More informationSignal Processing. Introduction
Signal Processing 0 Introduction One of the premiere uses of MATLAB is in the analysis of signal processing and control systems. In this chapter we consider signal processing. The final chapter of the
More informationContinuous vs. Discrete signals. Sampling. Analog to Digital Conversion. CMPT 368: Lecture 4 Fundamentals of Digital Audio, Discrete-Time Signals
Continuous vs. Discrete signals CMPT 368: Lecture 4 Fundamentals of Digital Audio, Discrete-Time Signals Tamara Smyth, tamaras@cs.sfu.ca School of Computing Science, Simon Fraser University January 22,
More informationIndoor Location Detection
Indoor Location Detection Arezou Pourmir Abstract: This project is a classification problem and tries to distinguish some specific places from each other. We use the acoustic waves sent from the speaker
More informationMonophony/Polyphony Classification System using Fourier of Fourier Transform
International Journal of Electronics Engineering, 2 (2), 2010, pp. 299 303 Monophony/Polyphony Classification System using Fourier of Fourier Transform Kalyani Akant 1, Rajesh Pande 2, and S.S. Limaye
More informationKalman Tracking and Bayesian Detection for Radar RFI Blanking
Kalman Tracking and Bayesian Detection for Radar RFI Blanking Weizhen Dong, Brian D. Jeffs Department of Electrical and Computer Engineering Brigham Young University J. Richard Fisher National Radio Astronomy
More informationCHARACTERIZATION OF PATHOLOGICAL VOICE SIGNALS BASED ON CLASSICAL ACOUSTIC ANALYSIS
CHARACTERIZATION OF PATHOLOGICAL VOICE SIGNALS BASED ON CLASSICAL ACOUSTIC ANALYSIS Robert Rice Brandt 1, Benedito Guimarães Aguiar Neto 2, Raimundo Carlos Silvério Freire 3, Joseana Macedo Fechine 4,
More informationSurvey Paper on Music Beat Tracking
Survey Paper on Music Beat Tracking Vedshree Panchwadkar, Shravani Pande, Prof.Mr.Makarand Velankar Cummins College of Engg, Pune, India vedshreepd@gmail.com, shravni.pande@gmail.com, makarand_v@rediffmail.com
More informationSound Quality Evaluation for Audio Watermarking Based on Phase Shift Keying Using BCH Code
IEICE TRANS. INF. & SYST., VOL.E98 D, NO.1 JANUARY 2015 89 LETTER Special Section on Enriched Multimedia Sound Quality Evaluation for Audio Watermarking Based on Phase Shift Keying Using BCH Code Harumi
More informationMPEG-4 Structured Audio Systems
MPEG-4 Structured Audio Systems Mihir Anandpara The University of Texas at Austin anandpar@ece.utexas.edu 1 Abstract The MPEG-4 standard has been proposed to provide high quality audio and video content
More informationSpeech/Music Change Point Detection using Sonogram and AANN
International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 6, Number 1 (2016), pp. 45-49 International Research Publications House http://www. irphouse.com Speech/Music Change
More informationCOMPUTATIONAL RHYTHM AND BEAT ANALYSIS Nicholas Berkner. University of Rochester
COMPUTATIONAL RHYTHM AND BEAT ANALYSIS Nicholas Berkner University of Rochester ABSTRACT One of the most important applications in the field of music information processing is beat finding. Humans have
More informationEnhanced Waveform Interpolative Coding at 4 kbps
Enhanced Waveform Interpolative Coding at 4 kbps Oded Gottesman, and Allen Gersho Signal Compression Lab. University of California, Santa Barbara E-mail: [oded, gersho]@scl.ece.ucsb.edu Signal Compression
More information