Lecture 14: Source Separation
1 ELEN E4896 MUSIC SIGNAL PROCESSING
Lecture 14: Source Separation
1. Sources, Mixtures, & Perception
2. Spatial Filtering
3. Time-Frequency Masking
4. Model-Based Separation
Dan Ellis, Dept. Electrical Engineering, Columbia University
E4896 Music Signal Processing (Dan Ellis)
2 1. Sources, Mixtures, & Perception
- Sound is a linear process (superposition): no opacity (unlike vision); sources combine into auditory scenes (polyphony)
- [Figure: spectrogram (freq / Hz vs. time / s, level / dB) of a mixture, with an analysis into Voice (evil), Rumble, Stab, Voice (pleasant), Strings, Choir]
- Humans perceive discrete sources... a subjective construct
3 Spatial Hearing
- People perceive sources based on cues
- Spatial (binaural) cues: interaural time difference (ITD) and interaural level difference (ILD) (Blauert 96)
- [Figure: a source and a head with left (L) and right (R) ears, showing the path length difference and the head shadow (high freq); left and right waveforms vs. time / s]
4 Auditory Scene Analysis
- Spatial cues may not be enough or available, e.g. with a single-channel signal
- Brain uses signal-intrinsic cues to form sources: onset, harmonicity (Bregman 9)
- [Figure: Reynolds-McAdams Oboe example; axes: time / sec, level / dB]
5 Auditory Scene Analysis
"Imagine two narrow channels dug up from the edge of a lake, with handkerchiefs stretched across each one. Looking only at the motion of the handkerchiefs, you are to answer questions such as: How many boats are there on the lake and where are they?" (after Bregman 9)
Quite a challenge!
6 Audio Mixing
- Studio recording combines separate tracks into, e.g., 2 channels (stereo): different levels, panning, other effects
- Stereo Intensity Panning: manipulating ILD only; constant power
- More channels: use just the nearest pair?
- [Figure: panning between the L and R channels]
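The constant-power intensity panning mentioned on this slide can be sketched as follows. This is a minimal illustration, not the slide's own implementation: the sine/cosine pan law, the function names, and the test tone are assumptions chosen for the example.

```python
import numpy as np

def constant_power_pan(mono, position):
    """Pan a mono track to stereo; position in [0, 1], 0 = hard left, 1 = hard right.

    Assumed pan law: gains cos(theta), sin(theta), so gain_l**2 + gain_r**2 == 1.
    """
    theta = position * np.pi / 2
    gain_l, gain_r = np.cos(theta), np.sin(theta)
    return gain_l * mono, gain_r * mono

def ild_db(left, right):
    """Inter-channel level difference in dB (positive means louder on the left)."""
    return 10 * np.log10(np.sum(left**2) / np.sum(right**2))

t = np.arange(8000) / 8000.0
track = np.sin(2 * np.pi * 440 * t)           # hypothetical mono track
left, right = constant_power_pan(track, 0.25)  # panned part-way to the left
total_power = np.mean(left**2) + np.mean(right**2)
pan_ild = ild_db(left, right)
```

Only the level ratio between the channels changes with `position`; the summed power stays equal to that of the mono track, and no time delay is introduced.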
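7 2. Spatial Filtering
- N sources detected by M sensors
- Degrees of freedom: need at least as many sensors as sources (M ≥ N), else need other constraints
- Consider the 2x2 case: directional mics
- Mixing matrix: $\begin{bmatrix} m_1 \\ m_2 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} s_1 \\ s_2 \end{bmatrix}$
- Source estimates: $\hat{s} = \hat{A}^{-1} m$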
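A small numerical sketch of the 2x2 case: if the mixing matrix is known (or has been estimated), the sources follow by inverting it. The matrix entries and the random sources below are hypothetical values for illustration, and the sketch assumes the estimate of A is exact.

```python
import numpy as np

rng = np.random.default_rng(0)
s = rng.standard_normal((2, 1000))   # two sources, one per row
A = np.array([[1.0, 0.6],            # hypothetical mixing gains a_ij
              [0.4, 1.0]])
m = A @ s                            # two microphone signals, instantaneous mix

# Solve A s_hat = m; equivalent to inv(A) @ m but numerically preferable.
s_hat = np.linalg.solve(A, m)
```

This only works when A is square and invertible, i.e. the two microphones pick up linearly independent combinations of the sources.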
8 Source Cancelation
- Simple 2x2 example:
  $m_1(t) = s_1(t) + 0.5\, s_2(t)$
  $m_2(t) = 0.8\, s_1(t) + s_2(t)$
  $m_1(t) - 0.5\, m_2(t) = 0.6\, s_1(t)$
- If no delay and linearly-independent sums, can cancel one source per combination
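The cancelation example can be checked numerically. The mixing gains are the ones from the slide; the two test signals are arbitrary stand-ins chosen for this sketch.

```python
import numpy as np

t = np.arange(1000) / 1000.0
s1 = np.sin(2 * np.pi * 5 * t)             # stand-in source 1
s2 = np.sign(np.sin(2 * np.pi * 13 * t))   # stand-in source 2

m1 = s1 + 0.5 * s2
m2 = 0.8 * s1 + s2

only_s1 = m1 - 0.5 * m2   # s2 terms cancel, leaving 0.6 * s1
only_s2 = m2 - 0.8 * m1   # s1 terms cancel, leaving 0.6 * s2
```

Each weighted difference removes exactly one source, which is the "one source per combination" limit stated above.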
9 Independent Component Analysis
- Can separate "blind" combinations by maximizing independence of outputs (Bell & Sejnowski 95)
- Adapt the unmixing weights $a_{ij}$ using the gradient $\partial\,\mathrm{MutInfo} / \partial a$
- Kurtosis as a measure for independence? $\mathrm{kurt}(y) = E\left[\left(\frac{y - \mu}{\sigma}\right)^4\right] - 3$
- [Figure: Mixture Scatter (s1 vs. s2) and Kurtosis vs. mix]
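To illustrate why kurtosis can serve as an independence cue, the sketch below mixes two independent heavy-tailed sources at different angles and measures the excess kurtosis of the result. The Laplacian sources, the sample count, and the angle grid are assumptions for this example; it is not the Bell & Sejnowski algorithm, only a demonstration of the contrast function.

```python
import numpy as np

def excess_kurtosis(y):
    """kurt(y) = E[((y - mu) / sigma)**4] - 3, which is 0 for a Gaussian."""
    z = (y - y.mean()) / y.std()
    return np.mean(z**4) - 3.0

rng = np.random.default_rng(0)
s1 = rng.laplace(size=100_000)   # independent super-Gaussian sources
s2 = rng.laplace(size=100_000)

# Mix with weights (cos a, sin a): a = 0 or pi/2 is a pure source, pi/4 an equal mix.
angles = np.linspace(0.0, np.pi / 2, 19)
kurt = np.array([excess_kurtosis(np.cos(a) * s1 + np.sin(a) * s2) for a in angles])
```

Mixtures of independent sources look more Gaussian than the sources themselves, so the kurtosis curve dips in the middle of the angle range and peaks where the output is a single source.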
10 Microphone Arrays
- If interference is diffuse, can simply boost energy from target direction, e.g. shotgun mic: delay-and-sum (Benesty, Chen, Huang 8)
- [Figure: delay-and-sum array with element spacing D, and its directional response for several wavelengths λ relative to D]
- Off-axis spectral coloration
- Many variants: filter & sum, sidelobe cancelation...
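A minimal delay-and-sum sketch, assuming integer sample delays, a four-microphone array, and white-noise signals (all hypothetical). `np.roll` is used as a circular delay to keep the example short; a real array would need fractional, non-circular delays.

```python
import numpy as np

def delay_and_sum(channels, delays):
    """Undo each channel's integer delay (in samples), then average the channels."""
    aligned = [np.roll(ch, -d) for ch, d in zip(channels, delays)]
    return np.mean(aligned, axis=0)

rng = np.random.default_rng(0)
n = 4096
target = rng.standard_normal(n)
interferer = rng.standard_normal(n)
target_delays = [0, 1, 2, 3]          # arrival delays of the target at each mic
interferer_delays = [0, -2, -4, -6]   # the interferer arrives from another direction

channels = [np.roll(target, dt) + np.roll(interferer, di)
            for dt, di in zip(target_delays, interferer_delays)]

output = delay_and_sum(channels, target_delays)   # steer toward the target
residual = output - target                        # what is left of the interferer
gain_db = 10 * np.log10(np.mean(interferer**2) / np.mean(residual**2))
```

The target copies add coherently while the misaligned interferer copies add incoherently, so the interferer is attenuated rather than canceled.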
11 3. Time-Frequency Masking
- What if there is only one channel? Cannot have fixed cancellation, but could have fast time-varying filtering (Brown & Cooke 9; Roweis)
- [Figure: spectrograms vs. time / s illustrating time-frequency masking]
- The trick is finding the right mask...
12 Time-Frequency Masking
- Works well for overlapping voices
- Time-frequency resolution?
- [Figure: spectrograms (time / sec, level / dB) of the original, the mix with oracle labels (Male, Female), and the oracle-based resynthesis]
- Sound examples: cooke-v3n7.wav, cooke-v3msk-ideal.wav, cooke-n7msk-ideal.wav
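An oracle ("ideal") binary mask can be sketched with a plain NumPy STFT. The two steady tones standing in for the voices, the frame length, and the hop size are assumptions for this example; the mask keeps each spectrogram cell where the first source is stronger than the second.

```python
import numpy as np

N_FFT, HOP = 256, 128

def stft(x):
    """Frames windowed with a periodic Hann; 50%-overlapped windows sum to 1."""
    win = np.hanning(N_FFT + 1)[:-1]
    frames = np.array([x[i:i + N_FFT] * win
                       for i in range(0, len(x) - N_FFT + 1, HOP)])
    return np.fft.rfft(frames, axis=1)          # shape (n_frames, n_bins)

def istft(X):
    """Overlap-add inverse; exact away from the first and last half frame."""
    frames = np.fft.irfft(X, n=N_FFT, axis=1)
    out = np.zeros(HOP * (len(frames) - 1) + N_FFT)
    for k, frame in enumerate(frames):
        out[k * HOP:k * HOP + N_FFT] += frame
    return out

fs = 8000
t = np.arange(8192) / fs
voice_a = np.sin(2 * np.pi * 300 * t)           # stand-in for one voice
voice_b = 0.8 * np.sin(2 * np.pi * 1200 * t)    # stand-in for the other
mix = voice_a + voice_b

A, B, X = stft(voice_a), stft(voice_b), stft(mix)
oracle_mask = np.abs(A) > np.abs(B)             # needs the clean sources, hence "oracle"
est_a = istft(oracle_mask * X)
est_b = istft(~oracle_mask * X)
```

Because the mask is binary and complementary, the two estimates add back up to the mixture; how well each one matches its source depends on how little the sources overlap in time-frequency.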
13 Pan-Based Filtering
- Can use time-frequency masking even for stereo, e.g. calculate a "panning index" as ILD, then mask cells matching that ILD (Avendano 3)
- [Figure: spectrograms (time / s, level / dB) of the mix, the per-cell ILD / dB, and the resulting ILD mask]
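The pan-based mask can be sketched by computing an ILD for every time-frequency cell of a stereo mix and keeping the cells near a target ILD. The pan gains, tone frequencies, target ILD, and tolerance below are hypothetical, and this simplified index is not Avendano's exact formulation.

```python
import numpy as np

N_FFT, HOP = 256, 128

def stft(x):
    """Hann-windowed frames, 50% overlap; returns shape (n_frames, n_bins)."""
    win = np.hanning(N_FFT + 1)[:-1]
    frames = np.array([x[i:i + N_FFT] * win
                       for i in range(0, len(x) - N_FFT + 1, HOP)])
    return np.fft.rfft(frames, axis=1)

fs = 8000
t = np.arange(8192) / fs
src_a = np.sin(2 * np.pi * 300 * t)    # panned toward the left
src_b = np.sin(2 * np.pi * 1200 * t)   # panned toward the right
left = 1.0 * src_a + 0.5 * src_b
right = 0.5 * src_a + 1.0 * src_b

L, R = stft(left), stft(right)
eps = 1e-12                             # avoids log of zero in silent cells
ild = 20 * np.log10((np.abs(L) + eps) / (np.abs(R) + eps))

target_ild, tol = 6.0, 2.0              # select cells about 6 dB louder on the left
mask = np.abs(ild - target_ild) < tol
masked_left, masked_right = mask * L, mask * R
```

Cells dominated by the left-panned source have an ILD near +6 dB and pass the mask; cells dominated by the right-panned source sit near -6 dB and are rejected.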
14 Harmonic-based Masking
- Time-frequency masking can be used to pick out harmonics: given pitch track, know where to expect harmonics (Denbigh & Zhao)
- [Figure: spectrograms vs. time / s]
15 Harmonic Filtering
- Given pitch track, could use time-varying comb filter to get harmonics
- Or: isolate each harmonic by heterodyning (Avery Wang 1995):
  $\hat{x}(t) = \sum_k \hat{a}_k(t) \cos\left(k\,\hat{\omega}_0(t)\,t\right)$
  $\hat{a}_k(t) = \mathrm{LPF}\left\{ x(t)\, e^{-j k \hat{\omega}_0(t) t} \right\}$
- [Figure: spectrograms vs. time / s]
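A sketch of harmonic isolation by heterodyning, under simplifying assumptions that are not in the slide: the pitch is constant (200 Hz), the low-pass filter is a moving average spanning a whole number of pitch periods, and the amplitude and phase of each harmonic are read from the complex envelope as 2|a| and angle(a).

```python
import numpy as np

fs, f0 = 8000, 200.0                     # assumed sample rate and constant pitch
t = np.arange(4000) / fs
x = (0.7 * np.cos(2 * np.pi * f0 * t + 0.3)
     + 0.3 * np.cos(2 * np.pi * 2 * f0 * t + 1.0))

def harmonic_envelope(signal, k, n_avg=400):
    """Shift harmonic k down to 0 Hz, then low-pass with a moving average.

    n_avg = 400 samples is 10 periods of f0, so other harmonics average out.
    """
    shifted = signal * np.exp(-1j * 2 * np.pi * k * f0 * t)
    lpf = np.ones(n_avg) / n_avg
    return np.convolve(shifted, lpf, mode="same")

envelopes = {k: harmonic_envelope(x, k) for k in (1, 2)}

# Resynthesis: each harmonic gets amplitude 2|a_k| and phase angle(a_k).
x_hat = sum(2 * np.abs(a) * np.cos(2 * np.pi * k * f0 * t + np.angle(a))
            for k, a in envelopes.items())
```

With a time-varying pitch track the exponent would use the accumulated phase of the track rather than `k * f0 * t`, and the low-pass bandwidth would trade off tracking speed against leakage from neighboring harmonics.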
16 Nonnegative Matrix Factorization
- Decomposition of spectrograms into templates + activation: $X = W \cdot H$ (Lee & Seung 99; Abdallah & Plumbley; Smaragdis & Brown 3; Virtanen 7)
- Fast & forgiving gradient descent algorithm
- Fits neatly with time-frequency masking
- [Figure (Smaragdis): bases from all $W_t$ and rows of H; axes: Time (DFT slices), Frequency (DFT index)]
- [Sound examples: Virtanen 3 sounds]
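The factorization X = W·H can be sketched with multiplicative updates for the Euclidean cost, in the style of Lee & Seung. The toy "spectrogram", the rank, the iteration count, and the soft-mask step at the end are assumptions for this illustration.

```python
import numpy as np

def nmf(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Multiplicative-update NMF minimizing ||X - W H||; keeps W and H nonnegative."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], rank)) + eps    # spectral templates (columns)
    H = rng.random((rank, X.shape[1])) + eps    # activations over time (rows)
    errors = []
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(X - W @ H))
    return W, H, errors

# Toy magnitude "spectrogram": 64 frequency bins, 100 frames, built from 2 templates.
rng = np.random.default_rng(1)
X = rng.random((64, 2)) @ rng.random((2, 100))

W, H, errors = nmf(X, rank=2)

# Each component's share of the model gives a soft time-frequency mask.
model = W @ H + 1e-12
masks = [np.outer(W[:, k], H[k]) / model for k in range(2)]
component_0 = masks[0] * X
```

The masks sum to one in every cell, so applying them to the mixture spectrogram splits its energy among the components.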
17 4. Model-Based Separation
- When N (sources) > M (sensors), need additional constraints to solve problem, e.g. assumption of single dominant pitch
- Can assemble into a model $\mathcal{M}$ of source $s_i$: defines set of possible waveforms... probabilistically: $\Pr(s_i \mid \mathcal{M})$
- Source separation from mixture as inference:
  $\hat{s} = \{\hat{s}_i\} = \arg\max_{s} \Pr(x \mid s, A)\, P(A) \prod_i \Pr(s_i \mid \mathcal{M})$
  where $\Pr(x \mid s, A) = \mathcal{N}(x \mid As, \Sigma)$
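A toy version of separation as inference, under strong assumptions that are not from the slide: one channel with two sources (so A is fixed at [1 1] and the P(A) term drops out), each source model is a small codebook of spectra with a uniform prior, and the observation noise is isotropic Gaussian. The arg max then reduces to finding the codeword pair whose sum is closest to the mixture.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_codes = 16, 8
codebook_1 = rng.random((n_codes, dim))   # hypothetical model of source 1
codebook_2 = rng.random((n_codes, dim))   # hypothetical model of source 2

true_i, true_j = 3, 5
x = codebook_1[true_i] + codebook_2[true_j] + 0.01 * rng.standard_normal(dim)

# log N(x | s1 + s2, sigma^2 I) up to a constant; uniform priors add nothing.
pair_sums = codebook_1[:, None, :] + codebook_2[None, :, :]   # (n_codes, n_codes, dim)
log_lik = -np.sum((x - pair_sums) ** 2, axis=-1)

best_i, best_j = np.unravel_index(np.argmax(log_lik), log_lik.shape)
s1_hat, s2_hat = codebook_1[best_i], codebook_2[best_j]
```

The exhaustive search over pairs grows quickly with codebook size and source count, which is one reason practical systems use structured or factored source models.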
18 Source Models
- Can constrain: source spectra (e.g. harmonic, noisy, smooth); temporal evolution (piecewise-continuous); spatial arrangements (point-source, diffuse)
- Factored decomposition (Ozerov, Vincent & Bimbot 1)
- [Figure: spectrograms (Frequency vs. Time) of a stereo instantaneous mix and the separated sources. Music: Shannon Hurley / Mix: Michel Desnoues & Alexey Ozerov / Separations: Alexey Ozerov]
19 Summary
- Acoustic Source Mixtures: the normal situation in real-world sounds
- Spatial filtering: canceling sources by subtracting channels
- Time-Frequency Masking: selecting spectrogram cells
- Model-Based Separation: exploiting regularities in source signals
20 References
S. Abdallah & M. Plumbley, Polyphonic transcription by non-negative sparse coding of power spectra, Proc. Int. Symp. Music Info. Retrieval,.
C. Avendano, Frequency-domain source identification and manipulation in stereo mixes for enhancement, suppression and re-panning applications, IEEE WASPAA, Mohonk, pp , 3.
A. Bell, T. Sejnowski, An information-maximization approach to blind separation and blind deconvolution, Neural Computation, vol. 7 no. 6, pp ,
J. Benesty, J. Chen, Y. Huang, Microphone Array Signal Processing, Springer, 8.
J. Blauert, Spatial Hearing, MIT Press,
A. Bregman, Auditory Scene Analysis, MIT Press, 199.
G. Brown & M. Cooke, Computational auditory scene analysis, Computer Speech and Language, vol. 8 no., pp , 199.
P. Denbigh & J. Zhao, Pitch extraction and separation of overlapping speech, Speech Communication, vol. 11 no. -3, pp , 199.
D. Lee & S. Seung, Learning the Parts of Objects by Non-negative Matrix Factorization, Nature 1, 788,
A. Ozerov, E. Vincent, & F. Bimbot, A general flexible framework for the handling of prior information in audio source separation, INRIA Tech. Rep. 753, Nov. 1.
S. Roweis, One microphone source separation, Adv. Neural Info. Proc. Sys., pp , 1.
P. Smaragdis & J. Brown, Non-negative Matrix Factorization for Polyphonic Music Transcription, Proc. IEEE WASPAA, 177-18, October, 3.
T. Virtanen, Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria, IEEE Tr. Audio, Speech, & Lang. Proc. 15(3), , 7.
Avery Wang, Instantaneous and frequency-warped signal processing techniques for auditory source separation, Ph.D. dissertation, Stanford CCRMA,
More informationDifferent Approaches of Spectral Subtraction Method for Speech Enhancement
ISSN 2249 5460 Available online at www.internationalejournals.com International ejournals International Journal of Mathematical Sciences, Technology and Humanities 95 (2013 1056 1062 Different Approaches
More informationSignal Characterization in terms of Sinusoidal and Non-Sinusoidal Components
Signal Characterization in terms of Sinusoidal and Non-Sinusoidal Components Geoffroy Peeters, avier Rodet To cite this version: Geoffroy Peeters, avier Rodet. Signal Characterization in terms of Sinusoidal
More informationFundamentals of Music Technology
Fundamentals of Music Technology Juan P. Bello Office: 409, 4th floor, 383 LaFayette Street (ext. 85736) Office Hours: Wednesdays 2-5pm Email: jpbello@nyu.edu URL: http://homepages.nyu.edu/~jb2843/ Course-info:
More informationREAL-TIME BLIND SOURCE SEPARATION FOR MOVING SPEAKERS USING BLOCKWISE ICA AND RESIDUAL CROSSTALK SUBTRACTION
REAL-TIME BLIND SOURCE SEPARATION FOR MOVING SPEAKERS USING BLOCKWISE ICA AND RESIDUAL CROSSTALK SUBTRACTION Ryo Mukai Hiroshi Sawada Shoko Araki Shoji Makino NTT Communication Science Laboratories, NTT
More informationWavelet Speech Enhancement based on the Teager Energy Operator
Wavelet Speech Enhancement based on the Teager Energy Operator Mohammed Bahoura and Jean Rouat ERMETIS, DSA, Université du Québec à Chicoutimi, Chicoutimi, Québec, G7H 2B1, Canada. Abstract We propose
More informationSpeech Enhancement Techniques using Wiener Filter and Subspace Filter
IJSTE - International Journal of Science Technology & Engineering Volume 3 Issue 05 November 2016 ISSN (online): 2349-784X Speech Enhancement Techniques using Wiener Filter and Subspace Filter Ankeeta
More informationL19: Prosodic modification of speech
L19: Prosodic modification of speech Time-domain pitch synchronous overlap add (TD-PSOLA) Linear-prediction PSOLA Frequency-domain PSOLA Sinusoidal models Harmonic + noise models STRAIGHT This lecture
More informationESTIMATION OF TIME-VARYING ROOM IMPULSE RESPONSES OF MULTIPLE SOUND SOURCES FROM OBSERVED MIXTURE AND ISOLATED SOURCE SIGNALS
ESTIMATION OF TIME-VARYING ROOM IMPULSE RESPONSES OF MULTIPLE SOUND SOURCES FROM OBSERVED MIXTURE AND ISOLATED SOURCE SIGNALS Joonas Nikunen, Tuomas Virtanen Tampere University of Technology Korkeakoulunkatu
More informationSurvey Paper on Music Beat Tracking
Survey Paper on Music Beat Tracking Vedshree Panchwadkar, Shravani Pande, Prof.Mr.Makarand Velankar Cummins College of Engg, Pune, India vedshreepd@gmail.com, shravni.pande@gmail.com, makarand_v@rediffmail.com
More information