Preeti Rao 2 nd CompMusicWorkshop, Istanbul 2012
|
|
- Samson Ellis
- 5 years ago
- Views:
Transcription
1 Preeti Rao 2 nd CompMusicWorkshop, Istanbul 2012
2 o Music signal characteristics o Perceptual attributes and acoustic properties o Signal representations for pitch detection o STFT o Sinusoidal model o Pitch detection algorithms o Polyphonic context and predominant pitch tracking o Applications in MIR 2
3 Digital audio format: PCM Sampling rate: 44.1 khz, khz Amplitude resolution: 16 bits/sample *The Physics Classroom: phys/class/sound/u11l2a.html WiSSAP 2007
4 Interesting sounds are typically coded in the form of a temporal sequence of atomic sound events. E.g. speech -> a sequence of phones music -> an evolving pattern of notes An atomic sound event, or a single gestalt, can be a complex acoustical signal described by a set of temporal and spectral properties => an evoked sensation. Department of Electrical Engineering, IIT Bombay
5 A sound of given frequency components and sound pressure levels leads to perceived sensations that can be distinguished in terms of: o loudness <-- intensity o pitch <-- fundamental frequency o timbre ( quality or colour ) <--ther spectro-temporal properties Department of Electrical Engineering, IIT Bombay
6 low pitch tone Air pressure variation T0 = 10 msec high pitch tone Frequency = 100 Hz 1 Hertz = 1 vibration/sec Frequency = 300 Hz T0 = 3.3 msec Department of Electrical Engineering, IIT Bombay
7 Musical pitch scale low pitch high pitch semitone = 2 1/12 Department of Electrical Engineering, IIT Bombay
8 o The construction of a musical scale is based on two assumptions about the human hearing process: o The ear is sensitive to ratios of fundamental frequencies (pitches), not so much to absolute pitch. o The preferred musical intervals, i.e. those perceived to be most consonant, are the ratios of small whole numbers. o A musical sound is typically comprised of severalfrequencies. The frequencies are evident if we observe the spectrum of the sound Department of Electrical Engineering, IIT Bombay
9 300 Hz 600 Hz 900 Hz 300 Hz + 600Hz 300 Hz + 600Hz + 900Hz Department of Electrical Engineering, IIT Bombay
10 Sound atoms : Single tone signal x 1 ( t) X 1 ( f ) t(ms) 50 f (Hz)
11 Non-tonal Signal x 2 ( t) X 2 ( f ) t(ms) 50 f (Hz)
12 Complex tone signal x 3 ( t) 0.5 X 3 ( f ) t(ms) 50 f (Hz)
13 Bandpass noise signal x 4 ( t) X 4 ( f ) t(ms) f (Hz)
14 A flute note x 1 ( t) X 1 ( f )db t(ms) f (khz)
15 o We see that the distinctive signal characteristics are more evident in the frequency domain. o The ear is a frequency analyzer. It represents a unique combination of analysis and synthesis => we do not perceive spectral components but rather the composite sounds. o We observe that a single note is perceived as one entity of well-defined subjective sensations. This is due to the spatial pattern recognition process achieved by the central auditory system. 15
16 Major dimensions of music for retrieval are melody, rhythm, harmony and timbre. o Melody, harmony -> based on pitch content o Rhythm -> based on timinginformation o Timbre -> relates to instrumentation, texture A representationof these high-level attributes can be obtained from pitch, timing and spectro-temporal information extracted by audio signal analysis. Representations are then compared via a similarity measure to achieve retrieval. 16
17 o The temporal pattern of frame-level features can offer important cues to signal identity Texture windows Audio signal <= duration: s Analysis windows <= duration: ms Frame-level features Feature vector Feature Extraction Feature summary M. F. Martin and J. Breebaart, "Features for Audio and Music Classification," in Proc.ISMIR,
18 Melody: pitch related feature Melody is the temporal sequence of notes usually played by a single instrument (fixed timbre). The discrete notes (pitches) are typically selected from a musical scale. frequency/note time
19 o Typical implementation: o Pitch detection is carried out on the audio signal at uniformly spaced intervals o The pitch sequence is segmented into notes (regions of relatively steady pitch) o Notes are labeled o Note patterns are matched to determine melodic similarity o Challenges: o Note segmentation can be a difficult task o Pitch detection in polyphonic music is tough 19
20 Monophonic Signal: cues to perceived pitch Spectrum Waveform A. de Cheveigne. Multiple F0 estimation. In D.-L. Wang and G.J. Brown, editors, Computational Auditory Scene Analysis : Principles, Algorithms and Applications, IEEE Press / Wiley, Schroeder histogram PDA Department of Electrical Engineering, IIT Bombay
21 o Time (Lag) domain: maximise autocorrelation value o Frequency domain: minimise error between estimated and predicted harmonic structures o Other 21
22 22
23 Music and speech signals are typically time-varying in nature => a time-frequency representation is required to visualize signal characteristics. The short-time Fourier transform (STFT) affords such a representation based on an assumption of signal quasistationarity. The window shape dictates the time and frequency resolution trade-off. X S m= ( ω, n) = x( m) w( n m) e jωm Department of Electrical Engineering, IIT Bombay
24 w(n-m) x(m) x(m)w(n-m) X( n, ω) DFT 0 ω π
25
26 ai[ t] t Φ[ ] i I[ t] I [ t] ˆ[ x t]= a[ t]cos Φ [ t] + et [ ] i = 1 i -amplitude variation of i th sinusoidal component ( partial ) - total phase (represents both frequency and phase variation) -Number of partials, can vary with time i Φ [ t] = ω[ t] t + ϕ[ t] i i i Model parameters to be estimated: { a i, ω i, ϕ i } l
27 Audio signal x DFT Peak detection Peak tracking Sinusoid parameters { a i, ω i, ϕ i } l Window Additive synthesis _ Tonal component + Σ Residual For the smooth evolution of the signal, sine components are detected in each frameand linked to tracks from the previous frame based on frequency proximity.
28 Spectral magnitude Fixed threshold (MaxPeak - 40 db) Final peaks picked 20 Magnitude (db) Frequency (Hz) Spectral magnitude Envelope - 20 db Envelope - 25 db Envelope - 30 db Magnitude (db) Frequency (Hz)
29 Match spectrum around peak with that of ideal sinusoid. Apply threshold to the error. Department of Electrical Engineering, IIT Bombay
30 Peak tracking sine peak D Frequency track born track dies C B A Time
31 Singer (main melody) Tanpura(drone) Frequency (Hz) Tabla(percussion) Ghe Na Tun Harmonium (secondary melody) Time (sec)
32 o Input : magnitudes+ locations of sinusoids o For a range of trial fundamentals, generate predicted harmonics o MinimiseTWM errorw.r.t. trial fundamentals Predicted Components a Measured Components b Err total Err Err = + ρ N K p m m p Nearest Neighbour Matching Department of Electrical Engineering, IIT Bombay
33 Department of Electrical Engineering, IIT Bombay
34 p E(p,j) W(p,p') E(p',j+1) p Pitch candidates, j Frame (time instant) E Measurement cost (local), W Smoothness cost j Minimize the Global transition cost over the singing spurt Department of Electrical Engineering, IIT Bombay
35 Department of Electrical Engineering, IIT Bombay
36 Polyphonic audio signal Signal representation Multi-F0 analysis Voice F0 contour Singing voice detection Predominant-F0 trajectory extraction
37 37
38 Pitch class profile opitch histogram osimilarity measure involves match between histograms 38
39
40 Positive phrases Negative phrase
41
42 Detects phrases melodically similar to Guru Bina pitch contour Swaras: S SN R Positive phrases Emphatic beat sam Negative phrase
43 43
44 Polyphonic audio signal Signal representation Multi-F0 analysis Voice F0 contour Singing voice detection Predominant-F0 trajectory extraction
45 o Input : magnitudes+ locations of sinusoids o For a range of trial fundamentals, generate predicted harmonics o MinimiseTWM errorw.r.t. trial fundamentals Predicted Components a Measured Components b Err total Err Err = + ρ N K p m m p Nearest Neighbour Matching Department of Electrical Engineering, IIT Bombay
46 Predicted to measured error N p n p Err p m = f n (f n) + ( ) [q f n (f n) r] n= 1 Amax Significant term : Δf / (f) p o Δf = frequency mismatch error of = partial frequency a Measured to predicted error K p k p Err m p = f k (f k) + ( ) [q f k (f k) r] n= 1 Amax a Department of Electrical Engineering, IIT Bombay
47 Melody detection system [1]
48 o F0 search range (male/female) o p, q, r o ρ (male/female) o Window length (pitch range and rate of variation) o Smoothness cost parameter (rate of pitch variation) o Voicing threshold Department of Electrical Engineering, IIT Bombay
49 o Window lengthis an analysis parameter that influences the accuracy of sinusoidal modeling of the signal o Closely-spaced components in the polyphony => need for higher frequency resolution = longer windows o Pitch variation with time can be rapid in ornamented regions => need for better time resolution = shorter windows Department of Electrical Engineering, IIT Bombay
50 o Easily computable measures for adapting window length o Signal sparsity: a sparse spectrum is more concentrated => better represented sinusoidal components o Window length selection (20, 30, 40 ms) based on maximizing signal sparsity
51 1. V. Raoand P. Rao, Vocal melody extraction in the presence of pitched accompaniment in polyphonic music, IEEE Transactions on Audio, Speech and Language Processing, vol. 18, no. 8, pp , Nov V. Rao, P. Gaddipatiand P. Rao, Signal-driven window adaptation for sinusoid identification in polyphonic music, IEEE Transactions on Audio, Speech, and Language Processing, Jan
Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 5: 12 Feb A. Faulkner.
Perception of pitch BSc Audiology/MSc SHS Psychoacoustics wk 5: 12 Feb 2009. A. Faulkner. See Moore, BCJ Introduction to the Psychology of Hearing, Chapter 5. Or Plack CJ The Sense of Hearing Lawrence
More informationPerception of pitch. Importance of pitch: 2. mother hemp horse. scold. Definitions. Why is pitch important? AUDL4007: 11 Feb A. Faulkner.
Perception of pitch AUDL4007: 11 Feb 2010. A. Faulkner. See Moore, BCJ Introduction to the Psychology of Hearing, Chapter 5. Or Plack CJ The Sense of Hearing Lawrence Erlbaum, 2005 Chapter 7 1 Definitions
More informationPerception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 4: 7 Feb A. Faulkner.
Perception of pitch BSc Audiology/MSc SHS Psychoacoustics wk 4: 7 Feb 2008. A. Faulkner. See Moore, BCJ Introduction to the Psychology of Hearing, Chapter 5. Or Plack CJ The Sense of Hearing Lawrence Erlbaum,
More informationVIBRATO DETECTING ALGORITHM IN REAL TIME. Minhao Zhang, Xinzhao Liu. University of Rochester Department of Electrical and Computer Engineering
VIBRATO DETECTING ALGORITHM IN REAL TIME Minhao Zhang, Xinzhao Liu University of Rochester Department of Electrical and Computer Engineering ABSTRACT Vibrato is a fundamental expressive attribute in music,
More informationEnhanced Waveform Interpolative Coding at 4 kbps
Enhanced Waveform Interpolative Coding at 4 kbps Oded Gottesman, and Allen Gersho Signal Compression Lab. University of California, Santa Barbara E-mail: [oded, gersho]@scl.ece.ucsb.edu Signal Compression
More informationApplications of Music Processing
Lecture Music Processing Applications of Music Processing Christian Dittmar International Audio Laboratories Erlangen christian.dittmar@audiolabs-erlangen.de Singing Voice Detection Important pre-requisite
More informationTempo and Beat Tracking
Lecture Music Processing Tempo and Beat Tracking Meinard Müller International Audio Laboratories Erlangen meinard.mueller@audiolabs-erlangen.de Introduction Basic beat tracking task: Given an audio recording
More informationStructure of Speech. Physical acoustics Time-domain representation Frequency domain representation Sound shaping
Structure of Speech Physical acoustics Time-domain representation Frequency domain representation Sound shaping Speech acoustics Source-Filter Theory Speech Source characteristics Speech Filter characteristics
More informationMusic Signal Processing
Tutorial Music Signal Processing Meinard Müller Saarland University and MPI Informatik meinard@mpi-inf.mpg.de Anssi Klapuri Queen Mary University of London anssi.klapuri@elec.qmul.ac.uk Overview Part I:
More informationAutomatic Evaluation of Hindustani Learner s SARGAM Practice
Automatic Evaluation of Hindustani Learner s SARGAM Practice Gurunath Reddy M and K. Sreenivasa Rao Indian Institute of Technology, Kharagpur, India {mgurunathreddy, ksrao}@sit.iitkgp.ernet.in Abstract
More informationSinging Voice Detection. Applications of Music Processing. Singing Voice Detection. Singing Voice Detection. Singing Voice Detection
Detection Lecture usic Processing Applications of usic Processing Christian Dittmar International Audio Laboratories Erlangen christian.dittmar@audiolabs-erlangen.de Important pre-requisite for: usic segmentation
More informationSOUND SOURCE RECOGNITION AND MODELING
SOUND SOURCE RECOGNITION AND MODELING CASA seminar, summer 2000 Antti Eronen antti.eronen@tut.fi Contents: Basics of human sound source recognition Timbre Voice recognition Recognition of environmental
More informationTempo and Beat Tracking
Lecture Music Processing Tempo and Beat Tracking Meinard Müller International Audio Laboratories Erlangen meinard.mueller@audiolabs-erlangen.de Book: Fundamentals of Music Processing Meinard Müller Fundamentals
More information8.3 Basic Parameters for Audio
8.3 Basic Parameters for Audio Analysis Physical audio signal: simple one-dimensional amplitude = loudness frequency = pitch Psycho-acoustic features: complex A real-life tone arises from a complex superposition
More informationSound Synthesis Methods
Sound Synthesis Methods Matti Vihola, mvihola@cs.tut.fi 23rd August 2001 1 Objectives The objective of sound synthesis is to create sounds that are Musically interesting Preferably realistic (sounds like
More informationProject 0: Part 2 A second hands-on lab on Speech Processing Frequency-domain processing
Project : Part 2 A second hands-on lab on Speech Processing Frequency-domain processing February 24, 217 During this lab, you will have a first contact on frequency domain analysis of speech signals. You
More informationRhythmic Similarity -- a quick paper review. Presented by: Shi Yong March 15, 2007 Music Technology, McGill University
Rhythmic Similarity -- a quick paper review Presented by: Shi Yong March 15, 2007 Music Technology, McGill University Contents Introduction Three examples J. Foote 2001, 2002 J. Paulus 2002 S. Dixon 2004
More informationSound is the human ear s perceived effect of pressure changes in the ambient air. Sound can be modeled as a function of time.
2. Physical sound 2.1 What is sound? Sound is the human ear s perceived effect of pressure changes in the ambient air. Sound can be modeled as a function of time. Figure 2.1: A 0.56-second audio clip of
More informationEffects of Reverberation on Pitch, Onset/Offset, and Binaural Cues
Effects of Reverberation on Pitch, Onset/Offset, and Binaural Cues DeLiang Wang Perception & Neurodynamics Lab The Ohio State University Outline of presentation Introduction Human performance Reverberation
More informationToward Automatic Transcription -- Pitch Tracking In Polyphonic Environment
Toward Automatic Transcription -- Pitch Tracking In Polyphonic Environment Term Project Presentation By: Keerthi C Nagaraj Dated: 30th April 2003 Outline Introduction Background problems in polyphonic
More informationAudio Engineering Society Convention Paper Presented at the 110th Convention 2001 May Amsterdam, The Netherlands
Audio Engineering Society Convention Paper Presented at the th Convention May 5 Amsterdam, The Netherlands This convention paper has been reproduced from the author's advance manuscript, without editing,
More informationTranscription of Piano Music
Transcription of Piano Music Rudolf BRISUDA Slovak University of Technology in Bratislava Faculty of Informatics and Information Technologies Ilkovičova 2, 842 16 Bratislava, Slovakia xbrisuda@is.stuba.sk
More informationMel Spectrum Analysis of Speech Recognition using Single Microphone
International Journal of Engineering Research in Electronics and Communication Mel Spectrum Analysis of Speech Recognition using Single Microphone [1] Lakshmi S.A, [2] Cholavendan M [1] PG Scholar, Sree
More informationPerformance Analysis of MFCC and LPCC Techniques in Automatic Speech Recognition
www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume - 3 Issue - 8 August, 2014 Page No. 7727-7732 Performance Analysis of MFCC and LPCC Techniques in Automatic
More informationSound Recognition. ~ CSE 352 Team 3 ~ Jason Park Evan Glover. Kevin Lui Aman Rawat. Prof. Anita Wasilewska
Sound Recognition ~ CSE 352 Team 3 ~ Jason Park Evan Glover Kevin Lui Aman Rawat Prof. Anita Wasilewska What is Sound? Sound is a vibration that propagates as a typically audible mechanical wave of pressure
More informationSpeech Synthesis using Mel-Cepstral Coefficient Feature
Speech Synthesis using Mel-Cepstral Coefficient Feature By Lu Wang Senior Thesis in Electrical Engineering University of Illinois at Urbana-Champaign Advisor: Professor Mark Hasegawa-Johnson May 2018 Abstract
More informationComplex Sounds. Reading: Yost Ch. 4
Complex Sounds Reading: Yost Ch. 4 Natural Sounds Most sounds in our everyday lives are not simple sinusoidal sounds, but are complex sounds, consisting of a sum of many sinusoids. The amplitude and frequency
More informationAdvanced audio analysis. Martin Gasser
Advanced audio analysis Martin Gasser Motivation Which methods are common in MIR research? How can we parameterize audio signals? Interesting dimensions of audio: Spectral/ time/melody structure, high
More informationspeech signal S(n). This involves a transformation of S(n) into another signal or a set of signals
16 3. SPEECH ANALYSIS 3.1 INTRODUCTION TO SPEECH ANALYSIS Many speech processing [22] applications exploits speech production and perception to accomplish speech analysis. By speech analysis we extract
More informationSpeech Signal Analysis
Speech Signal Analysis Hiroshi Shimodaira and Steve Renals Automatic Speech Recognition ASR Lectures 2&3 14,18 January 216 ASR Lectures 2&3 Speech Signal Analysis 1 Overview Speech Signal Analysis for
More informationSGN Audio and Speech Processing
Introduction 1 Course goals Introduction 2 SGN 14006 Audio and Speech Processing Lectures, Fall 2014 Anssi Klapuri Tampere University of Technology! Learn basics of audio signal processing Basic operations
More informationLecture 6: Nonspeech and Music
EE E682: Speech & Audio Processing & Recognition Lecture 6: Nonspeech and Music 1 2 3 4 5 Music and nonspeech Environmental sounds Music synthesis techniques Sinewave synthesis Music analysis Dan Ellis
More informationSpeech/Music Change Point Detection using Sonogram and AANN
International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 6, Number 1 (2016), pp. 45-49 International Research Publications House http://www. irphouse.com Speech/Music Change
More informationInternational Journal of Modern Trends in Engineering and Research e-issn No.: , Date: 2-4 July, 2015
International Journal of Modern Trends in Engineering and Research www.ijmter.com e-issn No.:2349-9745, Date: 2-4 July, 2015 Analysis of Speech Signal Using Graphic User Interface Solly Joy 1, Savitha
More informationSingle Channel Speaker Segregation using Sinusoidal Residual Modeling
NCC 2009, January 16-18, IIT Guwahati 294 Single Channel Speaker Segregation using Sinusoidal Residual Modeling Rajesh M Hegde and A. Srinivas Dept. of Electrical Engineering Indian Institute of Technology
More informationSound waves. septembre 2014 Audio signals and systems 1
Sound waves Sound is created by elastic vibrations or oscillations of particles in a particular medium. The vibrations are transmitted from particles to (neighbouring) particles: sound wave. Sound waves
More informationLecture 6: Nonspeech and Music. Music & nonspeech
EE E682: Speech & Audio Processing & Recognition Lecture 6: Nonspeech and Music 2 3 4 5 Music and nonspeech Environmental sounds Music synthesis techniques Sinewave synthesis Music analysis Dan Ellis
More informationREpeating Pattern Extraction Technique (REPET)
REpeating Pattern Extraction Technique (REPET) EECS 32: Machine Perception of Music & Audio Zafar RAFII, Spring 22 Repetition Repetition is a fundamental element in generating and perceiving structure
More informationAutomatic Transcription of Monophonic Audio to MIDI
Automatic Transcription of Monophonic Audio to MIDI Jiří Vass 1 and Hadas Ofir 2 1 Czech Technical University in Prague, Faculty of Electrical Engineering Department of Measurement vassj@fel.cvut.cz 2
More informationL19: Prosodic modification of speech
L19: Prosodic modification of speech Time-domain pitch synchronous overlap add (TD-PSOLA) Linear-prediction PSOLA Frequency-domain PSOLA Sinusoidal models Harmonic + noise models STRAIGHT This lecture
More informationJOURNAL OF OBJECT TECHNOLOGY
JOURNAL OF OBJECT TECHNOLOGY Online at http://www.jot.fm. Published by ETH Zurich, Chair of Software Engineering JOT, 2009 Vol. 9, No. 1, January-February 2010 The Discrete Fourier Transform, Part 5: Spectrogram
More informationPOLYPHONIC PITCH DETECTION BY MATCHING SPECTRAL AND AUTOCORRELATION PEAKS. Sebastian Kraft, Udo Zölzer
POLYPHONIC PITCH DETECTION BY MATCHING SPECTRAL AND AUTOCORRELATION PEAKS Sebastian Kraft, Udo Zölzer Department of Signal Processing and Communications Helmut-Schmidt-University, Hamburg, Germany sebastian.kraft@hsu-hh.de
More informationMonophony/Polyphony Classification System using Fourier of Fourier Transform
International Journal of Electronics Engineering, 2 (2), 2010, pp. 299 303 Monophony/Polyphony Classification System using Fourier of Fourier Transform Kalyani Akant 1, Rajesh Pande 2, and S.S. Limaye
More informationSpectro-Temporal Methods in Primary Auditory Cortex David Klein Didier Depireux Jonathan Simon Shihab Shamma
Spectro-Temporal Methods in Primary Auditory Cortex David Klein Didier Depireux Jonathan Simon Shihab Shamma & Department of Electrical Engineering Supported in part by a MURI grant from the Office of
More informationBasic Characteristics of Speech Signal Analysis
www.ijird.com March, 2016 Vol 5 Issue 4 ISSN 2278 0211 (Online) Basic Characteristics of Speech Signal Analysis S. Poornima Assistant Professor, VlbJanakiammal College of Arts and Science, Coimbatore,
More informationDrum Transcription Based on Independent Subspace Analysis
Report for EE 391 Special Studies and Reports for Electrical Engineering Drum Transcription Based on Independent Subspace Analysis Yinyi Guo Center for Computer Research in Music and Acoustics, Stanford,
More informationBetween physics and perception signal models for high level audio processing. Axel Röbel. Analysis / synthesis team, IRCAM. DAFx 2010 iem Graz
Between physics and perception signal models for high level audio processing Axel Röbel Analysis / synthesis team, IRCAM DAFx 2010 iem Graz Overview Introduction High level control of signal transformation
More informationMUSC 316 Sound & Digital Audio Basics Worksheet
MUSC 316 Sound & Digital Audio Basics Worksheet updated September 2, 2011 Name: An Aggie does not lie, cheat, or steal, or tolerate those who do. By submitting responses for this test you verify, on your
More informationThe psychoacoustics of reverberation
The psychoacoustics of reverberation Steven van de Par Steven.van.de.Par@uni-oldenburg.de July 19, 2016 Thanks to Julian Grosse and Andreas Häußler 2016 AES International Conference on Sound Field Control
More informationPsycho-acoustics (Sound characteristics, Masking, and Loudness)
Psycho-acoustics (Sound characteristics, Masking, and Loudness) Tai-Shih Chi ( 冀泰石 ) Department of Communication Engineering National Chiao Tung University Mar. 20, 2008 Pure tones Mathematics of the pure
More informationConverting Speaking Voice into Singing Voice
Converting Speaking Voice into Singing Voice 1 st place of the Synthesis of Singing Challenge 2007: Vocal Conversion from Speaking to Singing Voice using STRAIGHT by Takeshi Saitou et al. 1 STRAIGHT Speech
More informationADDITIVE SYNTHESIS BASED ON THE CONTINUOUS WAVELET TRANSFORM: A SINUSOIDAL PLUS TRANSIENT MODEL
ADDITIVE SYNTHESIS BASED ON THE CONTINUOUS WAVELET TRANSFORM: A SINUSOIDAL PLUS TRANSIENT MODEL José R. Beltrán and Fernando Beltrán Department of Electronic Engineering and Communications University of
More informationCOM325 Computer Speech and Hearing
COM325 Computer Speech and Hearing Part III : Theories and Models of Pitch Perception Dr. Guy Brown Room 145 Regent Court Department of Computer Science University of Sheffield Email: g.brown@dcs.shef.ac.uk
More informationIntroduction of Audio and Music
1 Introduction of Audio and Music Wei-Ta Chu 2009/12/3 Outline 2 Introduction of Audio Signals Introduction of Music 3 Introduction of Audio Signals Wei-Ta Chu 2009/12/3 Li and Drew, Fundamentals of Multimedia,
More informationMusical Acoustics, C. Bertulani. Musical Acoustics. Lecture 14 Timbre / Tone quality II
1 Musical Acoustics Lecture 14 Timbre / Tone quality II Odd vs Even Harmonics and Symmetry Sines are Anti-symmetric about mid-point If you mirror around the middle you get the same shape but upside down
More informationSynthesis Techniques. Juan P Bello
Synthesis Techniques Juan P Bello Synthesis It implies the artificial construction of a complex body by combining its elements. Complex body: acoustic signal (sound) Elements: parameters and/or basic signals
More informationTemporal resolution AUDL Domain of temporal resolution. Fine structure and envelope. Modulating a sinusoid. Fine structure and envelope
Modulating a sinusoid can also work this backwards! Temporal resolution AUDL 4007 carrier (fine structure) x modulator (envelope) = amplitudemodulated wave 1 2 Domain of temporal resolution Fine structure
More informationAn Efficient Extraction of Vocal Portion from Music Accompaniment Using Trend Estimation
An Efficient Extraction of Vocal Portion from Music Accompaniment Using Trend Estimation Aisvarya V 1, Suganthy M 2 PG Student [Comm. Systems], Dept. of ECE, Sree Sastha Institute of Engg. & Tech., Chennai,
More informationAUDL GS08/GAV1 Auditory Perception. Envelope and temporal fine structure (TFS)
AUDL GS08/GAV1 Auditory Perception Envelope and temporal fine structure (TFS) Envelope and TFS arise from a method of decomposing waveforms The classic decomposition of waveforms Spectral analysis... Decomposes
More informationCommunications Theory and Engineering
Communications Theory and Engineering Master's Degree in Electronic Engineering Sapienza University of Rome A.A. 2018-2019 Speech and telephone speech Based on a voice production model Parametric representation
More informationMUSICAL GENRE CLASSIFICATION OF AUDIO DATA USING SOURCE SEPARATION TECHNIQUES. P.S. Lampropoulou, A.S. Lampropoulos and G.A.
MUSICAL GENRE CLASSIFICATION OF AUDIO DATA USING SOURCE SEPARATION TECHNIQUES P.S. Lampropoulou, A.S. Lampropoulos and G.A. Tsihrintzis Department of Informatics, University of Piraeus 80 Karaoli & Dimitriou
More informationBEAT DETECTION BY DYNAMIC PROGRAMMING. Racquel Ivy Awuor
BEAT DETECTION BY DYNAMIC PROGRAMMING Racquel Ivy Awuor University of Rochester Department of Electrical and Computer Engineering Rochester, NY 14627 rawuor@ur.rochester.edu ABSTRACT A beat is a salient
More informationECE 556 BASICS OF DIGITAL SPEECH PROCESSING. Assıst.Prof.Dr. Selma ÖZAYDIN Spring Term-2017 Lecture 2
ECE 556 BASICS OF DIGITAL SPEECH PROCESSING Assıst.Prof.Dr. Selma ÖZAYDIN Spring Term-2017 Lecture 2 Analog Sound to Digital Sound Characteristics of Sound Amplitude Wavelength (w) Frequency ( ) Timbre
More informationLinguistic Phonetics. Spectral Analysis
24.963 Linguistic Phonetics Spectral Analysis 4 4 Frequency (Hz) 1 Reading for next week: Liljencrants & Lindblom 1972. Assignment: Lip-rounding assignment, due 1/15. 2 Spectral analysis techniques There
More informationEE482: Digital Signal Processing Applications
Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu EE482: Digital Signal Processing Applications Spring 2014 TTh 14:30-15:45 CBC C222 Lecture 12 Speech Signal Processing 14/03/25 http://www.ee.unlv.edu/~b1morris/ee482/
More informationINFLUENCE OF FREQUENCY DISTRIBUTION ON INTENSITY FLUCTUATIONS OF NOISE
INFLUENCE OF FREQUENCY DISTRIBUTION ON INTENSITY FLUCTUATIONS OF NOISE Pierre HANNA SCRIME - LaBRI Université de Bordeaux 1 F-33405 Talence Cedex, France hanna@labriu-bordeauxfr Myriam DESAINTE-CATHERINE
More informationA Parametric Model for Spectral Sound Synthesis of Musical Sounds
A Parametric Model for Spectral Sound Synthesis of Musical Sounds Cornelia Kreutzer University of Limerick ECE Department Limerick, Ireland cornelia.kreutzer@ul.ie Jacqueline Walker University of Limerick
More informationMusic 171: Sinusoids. Tamara Smyth, Department of Music, University of California, San Diego (UCSD) January 10, 2019
Music 7: Sinusoids Tamara Smyth, trsmyth@ucsd.edu Department of Music, University of California, San Diego (UCSD) January 0, 209 What is Sound? The word sound is used to describe both:. an auditory sensation
More informationReading: Johnson Ch , Ch.5.5 (today); Liljencrants & Lindblom; Stevens (Tues) reminder: no class on Thursday.
L105/205 Phonetics Scarborough Handout 7 10/18/05 Reading: Johnson Ch.2.3.3-2.3.6, Ch.5.5 (today); Liljencrants & Lindblom; Stevens (Tues) reminder: no class on Thursday Spectral Analysis 1. There are
More informationComputer Audio. An Overview. (Material freely adapted from sources far too numerous to mention )
Computer Audio An Overview (Material freely adapted from sources far too numerous to mention ) Computer Audio An interdisciplinary field including Music Computer Science Electrical Engineering (signal
More informationAspiration Noise during Phonation: Synthesis, Analysis, and Pitch-Scale Modification. Daryush Mehta
Aspiration Noise during Phonation: Synthesis, Analysis, and Pitch-Scale Modification Daryush Mehta SHBT 03 Research Advisor: Thomas F. Quatieri Speech and Hearing Biosciences and Technology 1 Summary Studied
More informationTRANSFORMS / WAVELETS
RANSFORMS / WAVELES ransform Analysis Signal processing using a transform analysis for calculations is a technique used to simplify or accelerate problem solution. For example, instead of dividing two
More informationResearch on Extracting BPM Feature Values in Music Beat Tracking Algorithm
Research on Extracting BPM Feature Values in Music Beat Tracking Algorithm Yan Zhao * Hainan Tropical Ocean University, Sanya, China *Corresponding author(e-mail: yanzhao16@163.com) Abstract With the rapid
More informationLecture 6. Rhythm Analysis. (some slides are adapted from Zafar Rafii and some figures are from Meinard Mueller)
Lecture 6 Rhythm Analysis (some slides are adapted from Zafar Rafii and some figures are from Meinard Mueller) Definitions for Rhythm Analysis Rhythm: movement marked by the regulated succession of strong
More informationSpeech Processing. Undergraduate course code: LASC10061 Postgraduate course code: LASC11065
Speech Processing Undergraduate course code: LASC10061 Postgraduate course code: LASC11065 All course materials and handouts are the same for both versions. Differences: credits (20 for UG, 10 for PG);
More informationVOICE QUALITY SYNTHESIS WITH THE BANDWIDTH ENHANCED SINUSOIDAL MODEL
VOICE QUALITY SYNTHESIS WITH THE BANDWIDTH ENHANCED SINUSOIDAL MODEL Narsimh Kamath Vishweshwara Rao Preeti Rao NIT Karnataka EE Dept, IIT-Bombay EE Dept, IIT-Bombay narsimh@gmail.com vishu@ee.iitb.ac.in
More informationSPEECH AND SPECTRAL ANALYSIS
SPEECH AND SPECTRAL ANALYSIS 1 Sound waves: production in general: acoustic interference vibration (carried by some propagation medium) variations in air pressure speech: actions of the articulatory organs
More informationROBUST MULTIPITCH ESTIMATION FOR THE ANALYSIS AND MANIPULATION OF POLYPHONIC MUSICAL SIGNALS
ROBUST MULTIPITCH ESTIMATION FOR THE ANALYSIS AND MANIPULATION OF POLYPHONIC MUSICAL SIGNALS Anssi Klapuri 1, Tuomas Virtanen 1, Jan-Markus Holm 2 1 Tampere University of Technology, Signal Processing
More informationSPEECH TO SINGING SYNTHESIS SYSTEM. Mingqing Yun, Yoon mo Yang, Yufei Zhang. Department of Electrical and Computer Engineering University of Rochester
SPEECH TO SINGING SYNTHESIS SYSTEM Mingqing Yun, Yoon mo Yang, Yufei Zhang Department of Electrical and Computer Engineering University of Rochester ABSTRACT This paper describes a speech-to-singing synthesis
More informationMULTIPLE F0 ESTIMATION IN THE TRANSFORM DOMAIN
10th International Society for Music Information Retrieval Conference (ISMIR 2009 MULTIPLE F0 ESTIMATION IN THE TRANSFORM DOMAIN Christopher A. Santoro +* Corey I. Cheng *# + LSB Audio Tampa, FL 33610
More informationMusical Acoustics, C. Bertulani. Musical Acoustics. Lecture 13 Timbre / Tone quality I
1 Musical Acoustics Lecture 13 Timbre / Tone quality I Waves: review 2 distance x (m) At a given time t: y = A sin(2πx/λ) A -A time t (s) At a given position x: y = A sin(2πt/t) Perfect Tuning Fork: Pure
More informationSAMPLING THEORY. Representing continuous signals with discrete numbers
SAMPLING THEORY Representing continuous signals with discrete numbers Roger B. Dannenberg Professor of Computer Science, Art, and Music Carnegie Mellon University ICM Week 3 Copyright 2002-2013 by Roger
More informationSGN Audio and Speech Processing
SGN 14006 Audio and Speech Processing Introduction 1 Course goals Introduction 2! Learn basics of audio signal processing Basic operations and their underlying ideas and principles Give basic skills although
More informationAcoustics, signals & systems for audiology. Week 9. Basic Psychoacoustic Phenomena: Temporal resolution
Acoustics, signals & systems for audiology Week 9 Basic Psychoacoustic Phenomena: Temporal resolution Modulating a sinusoid carrier at 1 khz (fine structure) x modulator at 100 Hz (envelope) = amplitudemodulated
More informationSpeech and Music Discrimination based on Signal Modulation Spectrum.
Speech and Music Discrimination based on Signal Modulation Spectrum. Pavel Balabko June 24, 1999 1 Introduction. This work is devoted to the problem of automatic speech and music discrimination. As we
More informationFFT analysis in practice
FFT analysis in practice Perception & Multimedia Computing Lecture 13 Rebecca Fiebrink Lecturer, Department of Computing Goldsmiths, University of London 1 Last Week Review of complex numbers: rectangular
More informationAUDL GS08/GAV1 Signals, systems, acoustics and the ear. Loudness & Temporal resolution
AUDL GS08/GAV1 Signals, systems, acoustics and the ear Loudness & Temporal resolution Absolute thresholds & Loudness Name some ways these concepts are crucial to audiologists Sivian & White (1933) JASA
More informationI D I A P R E S E A R C H R E P O R T. June published in Interspeech 2008
R E S E A R C H R E P O R T I D I A P Spectral Noise Shaping: Improvements in Speech/Audio Codec Based on Linear Prediction in Spectral Domain Sriram Ganapathy a b Petr Motlicek a Hynek Hermansky a b Harinath
More informationPitch Estimation of Singing Voice From Monaural Popular Music Recordings
Pitch Estimation of Singing Voice From Monaural Popular Music Recordings Kwan Kim, Jun Hee Lee New York University author names in alphabetical order Abstract A singing voice separation system is a hard
More informationMODIFIED DCT BASED SPEECH ENHANCEMENT IN VEHICULAR ENVIRONMENTS
MODIFIED DCT BASED SPEECH ENHANCEMENT IN VEHICULAR ENVIRONMENTS 1 S.PRASANNA VENKATESH, 2 NITIN NARAYAN, 3 K.SAILESH BHARATHWAAJ, 4 M.P.ACTLIN JEEVA, 5 P.VIJAYALAKSHMI 1,2,3,4,5 SSN College of Engineering,
More informationALTERNATING CURRENT (AC)
ALL ABOUT NOISE ALTERNATING CURRENT (AC) Any type of electrical transmission where the current repeatedly changes direction, and the voltage varies between maxima and minima. Therefore, any electrical
More informationSignal segmentation and waveform characterization. Biosignal processing, S Autumn 2012
Signal segmentation and waveform characterization Biosignal processing, 5173S Autumn 01 Short-time analysis of signals Signal statistics may vary in time: nonstationary how to compute signal characterizations?
More informationUniversity of Colorado at Boulder ECEN 4/5532. Lab 1 Lab report due on February 2, 2015
University of Colorado at Boulder ECEN 4/5532 Lab 1 Lab report due on February 2, 2015 This is a MATLAB only lab, and therefore each student needs to turn in her/his own lab report and own programs. 1
More informationScienceDirect. Unsupervised Speech Segregation Using Pitch Information and Time Frequency Masking
Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 46 (2015 ) 122 126 International Conference on Information and Communication Technologies (ICICT 2014) Unsupervised Speech
More informationWhat is Sound? Part II
What is Sound? Part II Timbre & Noise 1 Prayouandi (2010) - OneOhtrix Point Never PSYCHOACOUSTICS ACOUSTICS LOUDNESS AMPLITUDE PITCH FREQUENCY QUALITY TIMBRE 2 Timbre / Quality everything that is not frequency
More informationEnvelope Modulation Spectrum (EMS)
Envelope Modulation Spectrum (EMS) The Envelope Modulation Spectrum (EMS) is a representation of the slow amplitude modulations in a signal and the distribution of energy in the amplitude fluctuations
More informationMUS421/EE367B Applications Lecture 9C: Time Scale Modification (TSM) and Frequency Scaling/Shifting
MUS421/EE367B Applications Lecture 9C: Time Scale Modification (TSM) and Frequency Scaling/Shifting Julius O. Smith III (jos@ccrma.stanford.edu) Center for Computer Research in Music and Acoustics (CCRMA)
More informationDistortion products and the perceived pitch of harmonic complex tones
Distortion products and the perceived pitch of harmonic complex tones D. Pressnitzer and R.D. Patterson Centre for the Neural Basis of Hearing, Dept. of Physiology, Downing street, Cambridge CB2 3EG, U.K.
More informationSINOLA: A New Analysis/Synthesis Method using Spectrum Peak Shape Distortion, Phase and Reassigned Spectrum
SINOLA: A New Analysis/Synthesis Method using Spectrum Peak Shape Distortion, Phase Reassigned Spectrum Geoffroy Peeters, Xavier Rodet Ircam - Centre Georges-Pompidou Analysis/Synthesis Team, 1, pl. Igor
More informationSingle-channel Mixture Decomposition using Bayesian Harmonic Models
Single-channel Mixture Decomposition using Bayesian Harmonic Models Emmanuel Vincent and Mark D. Plumbley Electronic Engineering Department, Queen Mary, University of London Mile End Road, London E1 4NS,
More information