Environmental Sound Recognition using MP-based Features
|
|
- Lucy Houston
- 6 years ago
- Views:
Transcription
1 Environmental Sound Recognition using MP-based Features Selina Chu, Shri Narayanan *, and C.-C. Jay Kuo * Speech Analysis and Interpretation Lab Signal & Image Processing Institute Department of Computer Science * Department of Electrical Engineering University of Southern California Los Angeles, CA selinach@sipi.usc.edu Apr 01, 2008
2 Outline What are environmental sound and challenges in recognizing them Feature extraction using Matching Pursuit (MP-Features) Obtaining MP-Features Experimental setup and results Conclusion and future work
3 Environmental sounds What we hear everyday, anywhere Restaurants, streets, parks, airport and train stations, hallway, etc Unlike speech and music, which are structured sounds Speech formantic structure, i.e. vowels and consonants Music harmonic structure, i.e. notes Environmental sounds are unstructured and similar to noise Variably composed Thus, difficult to build models Can we incorporate such recognition and understanding in an automatic classification system?
4 Why recognize environment sound? Use audio information to assist activities, such as robotic navigation and human-computer interactions Vision-based robot has limitations Requires much world knowledge Lighting problems (or lack of), and angle of the camera Mitigate the limitations by incorporating audio information into the recognition process Fusion of audio and visual information Capture additional, semantically richer information Disambiguation of environment and object types Audio data can be easily acquired and is computationally cheaper to process than visual data Other applications: surveillance, search and rescue, obstacle detection, wearable devices, and other contextaware applications
5 Difficulties in acoustic environment classification Environment sounds are dynamic and unpredictable Same location, different time different sounds Streets: Characterize by cars and people, but could be absent (other additional sounds) Sounding similar Different location same sounds Many people in restaurant and bars sounds the same as in a crowded mall Different sounds similar sounding Noise of running water in a bath tub is similar to a running stream in the mountains
6 Audio classification results from current literature Structured Unstructured Environmental Music Speech, music, nonspeech, silent Number of classes Overall accuracy % Recognizing general acoustic environment: Restaurants, street, schoolyard Recognizing discrete acoustic events: Door knocks, gun shots, laughter, applause Recognition rates general acoustic environment using conventional features (i.e. MFCCs): 92% for 5 classes (Chu et al.), 77% for 11 classes (Malkin et al.), and 60% for 13 or more classes (Eronen et al.) Due to randomness, high variance and other difficulties in working with various environments
7 Intuitions about environmental sounds Nature-nighttime Nature-daytime Different environments have different characteristics On Boat Harbor Decompositions are noticeably different from one another There are underlying structures within each type of signal Near Highway Subway platform Need features that can discover them and to capture these time-domain signatures
8 Audio features Conventional features for audio recognition/classification does not perform well with environmental sounds MFCCs, MFCC derivatives, sub-band energy, fundamental frequency, LPCCs, energy, zerocrossing, and spectral- centroid, bandwidth, roll-off, flux, flatness One of the most used features for any audio classification are MFCCs MFCCs describe the shape of the overall spectrum Favorable for modeling single sound sources Works well for structured sounds such as speech and music Performance degrades in the presence of noise Limitations of MFCCs and other commonly-used features, difficulties in describing Finer resolution of temporal characteristics Dynamic aspects of the signal In general, few temporal-domain features have been used to characterize audio signals in the past
9 MP-Features MP-Features are capable of capturing the time-domain signature of a signal in a concise way Use the matching pursuit (MP) algorithm to analyze environment sounds Matching pursuit algorithm (Mallet and Zhang, 1993) Builds up a sequence of sparse approximation stepwise Process includes Finding the decomposition of a signal from a dictionary of atoms Yield the best set of time-domain functions to form an approximate representation Using the information found from that set to find the MP-Features
10 Dictionaries Quality of MP depends on the chosen dictionary (or dictionaries) A dictionary contains: Set of bases, or simply parameterized waveforms Examples of dictionaries includes: Gabor, Fourier, wavelets, Cosine packet, Chirplets, Warplets, and many others. Desirable characteristics: Should be complete or overcomplete MP is guaranteed to converge (zero energy residual) Important for the atoms in the dictionary to be discriminative among themselves - Otherwise similar atoms will compete with each other in the MP process
11 Examples of various dictionaries Frequency dictionary: Fourier Time-scale dictionary: Haar Time-Frequency dictionary: Gabor Fourier, MSE=23.87 Haar, MSE=12.40 Gabor, MSE=9.91 Original signal Example of first five atoms found on the same signal, but different dictionaries. Approximation Example of reconstruction using first 10 atoms, with different dictionaries
12 Process of obtaining MP features For each sampling window, Decompose using the MP algorithm MP algorithm is stopped after obtaining n atoms n is determined experimentally Decode each atom with its original parameters obtaining frequency, scale, and shift Accumulate all the atoms parameters Find the mean and standard deviation for each parameter separately
13 Finding n atoms Classification performance levels off after 4 or 5 atoms Recognition accuracy First n atoms used as features Larger number increases the complexity and makes it more specific to each data item Smaller number represent data in more general way
14 A small example Plotting of MP-features Captures time and frequency simultaneously 105 Forming clear natural clusters automatically Shifts indices Raining Stream / River 90 Nature-nighttime Nature-daytime For visual purpose, just plotted the centroids Frequency indices School playground Scales indices
15 Experimental setup Dataset: audio clips of 14 environmental sounds Audio sources: Recording of natural (unsynthesized) ambient sounds BBC sound effects - original series [9], Freesound project [10] Each sound files were of varying lengths (1-3 minutes each) Manually labeled and separated into 4-second segments Feature Extraction: features are analyzed and extracted for every 30ms rectangular window frame, with 15 ms overlap MFCC (12), MP-Features (6) Dictionary: Gabor function Classifier: Gaussian Mixture Model (GMM) 4-fold cross-validation - Separate sound sources, trained on data from 3 different sources and tested data from 1 source - Minimum number of files for a class was 4
16 14 types of environment sounds 1. Inside moving vehicles 2. Restaurant 3. Casino 4. Nature daytime 5. Nature nighttime 6. Street with police cars 7. Street with ambulance 8. Street with traffic and pedestrians 9. Playground 10. Raining 11. River/running water 12. Thundering 13. Train passing 14. Waves Choosing the types Diverse in characteristics or how they sound distinctively different from one another Homogeneous enough so they provide typical representation of the class Similar types of sounds as other works in literature
17 Classification Results Classification Rate % MFCCs (12) MP-Features (6) MP+MFCC Inside moving vehicles Casino Naturedaytime Restaurant Naturenighttime Street with Police car School Playground Street with traffic MFCC tend to operate on the extremes, doing better than MP-features alone in 7 classes, but very poorly (0-5%) in 5 classes MP-features range between 35%-100% MFCC and MP-features provide a complimentary effect for one another MFCC only: 68%, MP-feature only: 71%, MP-features+MFCC: 83% Raining Running water Thundering Train Passing Waves Street with ambulance
18 A closer look Nature-nighttime class Contains many insect sounds of higher frequencies (noise-like, flat spectrum) Characterized by narrow spectral peaks MFCC have difficulties encoding the narrow-band structure (i.e. engine sound of train passing) MFCCs: 0%, MP: 100%, MFCC+MP: 100% School playground Contains children playing and screaming and other outdoor ambient sounds, i.e. birds, passing traffic, etc MFCCs: 75%, MP: 60%, MFCC+MP: 95% MFCC acts as better discriminator for more structured sounds, like vocal sounds (i.e. talking inside restaurant) Nature-nighttime x School playground
19 Conclusion Introduced a feature extraction framework that provides a way to capture temporal-spectral characteristics of an audio signal Uncover underlying structures within each type of signal Less sensitive to noise and are able to represent sound originating from different sources and different ranges Use matching pursuit for feature extraction and its application to unstructured audio processing MP-features can supplement MFCCs to yield a higher classification performance for environmental sounds than using conventional features
20 Thank you Speech Analysis and Interpretation Laboratory: Acknowledgement: This work was supported by the National Science Foundation, DHS and the Army. Questions?
Applications of Music Processing
Lecture Music Processing Applications of Music Processing Christian Dittmar International Audio Laboratories Erlangen christian.dittmar@audiolabs-erlangen.de Singing Voice Detection Important pre-requisite
More informationCampus Location Recognition using Audio Signals
1 Campus Location Recognition using Audio Signals James Sun,Reid Westwood SUNetID:jsun2015,rwestwoo Email: jsun2015@stanford.edu, rwestwoo@stanford.edu I. INTRODUCTION People use sound both consciously
More informationIntroduction of Audio and Music
1 Introduction of Audio and Music Wei-Ta Chu 2009/12/3 Outline 2 Introduction of Audio Signals Introduction of Music 3 Introduction of Audio Signals Wei-Ta Chu 2009/12/3 Li and Drew, Fundamentals of Multimedia,
More informationA multi-class method for detecting audio events in news broadcasts
A multi-class method for detecting audio events in news broadcasts Sergios Petridis, Theodoros Giannakopoulos, and Stavros Perantonis Computational Intelligence Laboratory, Institute of Informatics and
More informationAn Audio Fingerprint Algorithm Based on Statistical Characteristics of db4 Wavelet
Journal of Information & Computational Science 8: 14 (2011) 3027 3034 Available at http://www.joics.com An Audio Fingerprint Algorithm Based on Statistical Characteristics of db4 Wavelet Jianguo JIANG
More informationDifferent Approaches of Spectral Subtraction Method for Speech Enhancement
ISSN 2249 5460 Available online at www.internationalejournals.com International ejournals International Journal of Mathematical Sciences, Technology and Humanities 95 (2013 1056 1062 Different Approaches
More informationInternational Journal of Modern Trends in Engineering and Research e-issn No.: , Date: 2-4 July, 2015
International Journal of Modern Trends in Engineering and Research www.ijmter.com e-issn No.:2349-9745, Date: 2-4 July, 2015 Analysis of Speech Signal Using Graphic User Interface Solly Joy 1, Savitha
More informationA Correlation-Maximization Denoising Filter Used as An Enhancement Frontend for Noise Robust Bird Call Classification
A Correlation-Maximization Denoising Filter Used as An Enhancement Frontend for Noise Robust Bird Call Classification Wei Chu and Abeer Alwan Speech Processing and Auditory Perception Laboratory Department
More informationAutomatic Text-Independent. Speaker. Recognition Approaches Using Binaural Inputs
Automatic Text-Independent Speaker Recognition Approaches Using Binaural Inputs Karim Youssef, Sylvain Argentieri and Jean-Luc Zarader 1 Outline Automatic speaker recognition: introduction Designed systems
More informationSinging Voice Detection. Applications of Music Processing. Singing Voice Detection. Singing Voice Detection. Singing Voice Detection
Detection Lecture usic Processing Applications of usic Processing Christian Dittmar International Audio Laboratories Erlangen christian.dittmar@audiolabs-erlangen.de Important pre-requisite for: usic segmentation
More informationAudio Signal Compression using DCT and LPC Techniques
Audio Signal Compression using DCT and LPC Techniques P. Sandhya Rani#1, D.Nanaji#2, V.Ramesh#3,K.V.S. Kiran#4 #Student, Department of ECE, Lendi Institute Of Engineering And Technology, Vizianagaram,
More informationAdvanced Techniques for Mobile Robotics Location-Based Activity Recognition
Advanced Techniques for Mobile Robotics Location-Based Activity Recognition Wolfram Burgard, Cyrill Stachniss, Kai Arras, Maren Bennewitz Activity Recognition Based on L. Liao, D. J. Patterson, D. Fox,
More informationRobust Low-Resource Sound Localization in Correlated Noise
INTERSPEECH 2014 Robust Low-Resource Sound Localization in Correlated Noise Lorin Netsch, Jacek Stachurski Texas Instruments, Inc. netsch@ti.com, jacek@ti.com Abstract In this paper we address the problem
More informationSound Recognition. ~ CSE 352 Team 3 ~ Jason Park Evan Glover. Kevin Lui Aman Rawat. Prof. Anita Wasilewska
Sound Recognition ~ CSE 352 Team 3 ~ Jason Park Evan Glover Kevin Lui Aman Rawat Prof. Anita Wasilewska What is Sound? Sound is a vibration that propagates as a typically audible mechanical wave of pressure
More informationDETECTION AND CLASSIFICATION OF POWER QUALITY DISTURBANCES
DETECTION AND CLASSIFICATION OF POWER QUALITY DISTURBANCES Ph.D. THESIS by UTKARSH SINGH INDIAN INSTITUTE OF TECHNOLOGY ROORKEE ROORKEE-247 667 (INDIA) OCTOBER, 2017 DETECTION AND CLASSIFICATION OF POWER
More informationDimension Reduction of the Modulation Spectrogram for Speaker Verification
Dimension Reduction of the Modulation Spectrogram for Speaker Verification Tomi Kinnunen Speech and Image Processing Unit Department of Computer Science University of Joensuu, Finland Kong Aik Lee and
More informationAdvanced Data Analysis Pattern Recognition & Neural Networks Software for Acoustic Emission Applications. Topic: Waveforms in Noesis
Advanced Data Analysis Pattern Recognition & Neural Networks Software for Acoustic Emission Applications Topic: Waveforms in Noesis 1 Noesis Waveforms Capabilities Noesis main features relating to Waveforms:
More informationPerformance study of Text-independent Speaker identification system using MFCC & IMFCC for Telephone and Microphone Speeches
Performance study of Text-independent Speaker identification system using & I for Telephone and Microphone Speeches Ruchi Chaudhary, National Technical Research Organization Abstract: A state-of-the-art
More informationQuantification of glottal and voiced speech harmonicsto-noise ratios using cepstral-based estimation
Quantification of glottal and voiced speech harmonicsto-noise ratios using cepstral-based estimation Peter J. Murphy and Olatunji O. Akande, Department of Electronic and Computer Engineering University
More informationHeadScan: A Wearable System for Radio-based Sensing of Head and Mouth-related Activities
HeadScan: A Wearable System for Radio-based Sensing of Head and Mouth-related Activities Biyi Fang Department of Electrical and Computer Engineering Michigan State University Biyi Fang Nicholas D. Lane
More informationTE 302 DISCRETE SIGNALS AND SYSTEMS. Chapter 1: INTRODUCTION
TE 302 DISCRETE SIGNALS AND SYSTEMS Study on the behavior and processing of information bearing functions as they are currently used in human communication and the systems involved. Chapter 1: INTRODUCTION
More informationElectric Guitar Pickups Recognition
Electric Guitar Pickups Recognition Warren Jonhow Lee warrenjo@stanford.edu Yi-Chun Chen yichunc@stanford.edu Abstract Electric guitar pickups convert vibration of strings to eletric signals and thus direcly
More informationSOUND SOURCE RECOGNITION AND MODELING
SOUND SOURCE RECOGNITION AND MODELING CASA seminar, summer 2000 Antti Eronen antti.eronen@tut.fi Contents: Basics of human sound source recognition Timbre Voice recognition Recognition of environmental
More informationLearning to Unlearn and Relearn Speech Signal Processing using Neural Networks: current and future perspectives
Learning to Unlearn and Relearn Speech Signal Processing using Neural Networks: current and future perspectives Mathew Magimai Doss Collaborators: Vinayak Abrol, Selen Hande Kabil, Hannah Muckenhirn, Dimitri
More informationKeywords Decomposition; Reconstruction; SNR; Speech signal; Super soft Thresholding.
Volume 5, Issue 2, February 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Speech Enhancement
More informationMachine recognition of speech trained on data from New Jersey Labs
Machine recognition of speech trained on data from New Jersey Labs Frequency response (peak around 5 Hz) Impulse response (effective length around 200 ms) 41 RASTA filter 10 attenuation [db] 40 1 10 modulation
More informationAN INVESTIGATION INTO SALIENCY-BASED MARS ROI DETECTION
AN INVESTIGATION INTO SALIENCY-BASED MARS ROI DETECTION Lilan Pan and Dave Barnes Department of Computer Science, Aberystwyth University, UK ABSTRACT This paper reviews several bottom-up saliency algorithms.
More informationSignals A Preliminary Discussion EE442 Analog & Digital Communication Systems Lecture 2
Signals A Preliminary Discussion EE442 Analog & Digital Communication Systems Lecture 2 The Fourier transform of single pulse is the sinc function. EE 442 Signal Preliminaries 1 Communication Systems and
More informationSpeech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm
International OPEN ACCESS Journal Of Modern Engineering Research (IJMER) Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm A.T. Rajamanickam, N.P.Subiramaniyam, A.Balamurugan*,
More informationGammatone Cepstral Coefficient for Speaker Identification
Gammatone Cepstral Coefficient for Speaker Identification Rahana Fathima 1, Raseena P E 2 M. Tech Student, Ilahia college of Engineering and Technology, Muvattupuzha, Kerala, India 1 Asst. Professor, Ilahia
More informationMUSICAL GENRE CLASSIFICATION OF AUDIO DATA USING SOURCE SEPARATION TECHNIQUES. P.S. Lampropoulou, A.S. Lampropoulos and G.A.
MUSICAL GENRE CLASSIFICATION OF AUDIO DATA USING SOURCE SEPARATION TECHNIQUES P.S. Lampropoulou, A.S. Lampropoulos and G.A. Tsihrintzis Department of Informatics, University of Piraeus 80 Karaoli & Dimitriou
More informationALTERNATING CURRENT (AC)
ALL ABOUT NOISE ALTERNATING CURRENT (AC) Any type of electrical transmission where the current repeatedly changes direction, and the voltage varies between maxima and minima. Therefore, any electrical
More informationSignal Processing for Speech Applications - Part 2-1. Signal Processing For Speech Applications - Part 2
Signal Processing for Speech Applications - Part 2-1 Signal Processing For Speech Applications - Part 2 May 14, 2013 Signal Processing for Speech Applications - Part 2-2 References Huang et al., Chapter
More informationMATLAB DIGITAL IMAGE/SIGNAL PROCESSING TITLES
MATLAB DIGITAL IMAGE/SIGNAL PROCESSING TITLES -2018 S.NO PROJECT CODE 1 ITIMP01 2 ITIMP02 3 ITIMP03 4 ITIMP04 5 ITIMP05 6 ITIMP06 7 ITIMP07 8 ITIMP08 9 ITIMP09 `10 ITIMP10 11 ITIMP11 12 ITIMP12 13 ITIMP13
More informationAn Optimization of Audio Classification and Segmentation using GASOM Algorithm
An Optimization of Audio Classification and Segmentation using GASOM Algorithm Dabbabi Karim, Cherif Adnen Research Unity of Processing and Analysis of Electrical and Energetic Systems Faculty of Sciences
More informationRhythmic Similarity -- a quick paper review. Presented by: Shi Yong March 15, 2007 Music Technology, McGill University
Rhythmic Similarity -- a quick paper review Presented by: Shi Yong March 15, 2007 Music Technology, McGill University Contents Introduction Three examples J. Foote 2001, 2002 J. Paulus 2002 S. Dixon 2004
More informationA Method for Voiced/Unvoiced Classification of Noisy Speech by Analyzing Time-Domain Features of Spectrogram Image
Science Journal of Circuits, Systems and Signal Processing 2017; 6(2): 11-17 http://www.sciencepublishinggroup.com/j/cssp doi: 10.11648/j.cssp.20170602.12 ISSN: 2326-9065 (Print); ISSN: 2326-9073 (Online)
More informationLong Range Acoustic Classification
Approved for public release; distribution is unlimited. Long Range Acoustic Classification Authors: Ned B. Thammakhoune, Stephen W. Lang Sanders a Lockheed Martin Company P. O. Box 868 Nashua, New Hampshire
More informationIDENTIFICATION OF SIGNATURES TRANSMITTED OVER RAYLEIGH FADING CHANNEL BY USING HMM AND RLE
International Journal of Technology (2011) 1: 56 64 ISSN 2086 9614 IJTech 2011 IDENTIFICATION OF SIGNATURES TRANSMITTED OVER RAYLEIGH FADING CHANNEL BY USING HMM AND RLE Djamhari Sirat 1, Arman D. Diponegoro
More informationVoiced/nonvoiced detection based on robustness of voiced epochs
Voiced/nonvoiced detection based on robustness of voiced epochs by N. Dhananjaya, B.Yegnanarayana in IEEE Signal Processing Letters, 17, 3 : 273-276 Report No: IIIT/TR/2010/50 Centre for Language Technologies
More informationBasic Characteristics of Speech Signal Analysis
www.ijird.com March, 2016 Vol 5 Issue 4 ISSN 2278 0211 (Online) Basic Characteristics of Speech Signal Analysis S. Poornima Assistant Professor, VlbJanakiammal College of Arts and Science, Coimbatore,
More informationAdvanced Functions of Java-DSP for use in Electrical and Computer Engineering Senior Level Courses
Advanced Functions of Java-DSP for use in Electrical and Computer Engineering Senior Level Courses Andreas Spanias Robert Santucci Tushar Gupta Mohit Shah Karthikeyan Ramamurthy Topics This presentation
More informationSpeech Signal Analysis
Speech Signal Analysis Hiroshi Shimodaira and Steve Renals Automatic Speech Recognition ASR Lectures 2&3 14,18 January 216 ASR Lectures 2&3 Speech Signal Analysis 1 Overview Speech Signal Analysis for
More informationMultimedia Signal Processing: Theory and Applications in Speech, Music and Communications
Brochure More information from http://www.researchandmarkets.com/reports/569388/ Multimedia Signal Processing: Theory and Applications in Speech, Music and Communications Description: Multimedia Signal
More informationSpeech and Music Discrimination based on Signal Modulation Spectrum.
Speech and Music Discrimination based on Signal Modulation Spectrum. Pavel Balabko June 24, 1999 1 Introduction. This work is devoted to the problem of automatic speech and music discrimination. As we
More informationStructure of Speech. Physical acoustics Time-domain representation Frequency domain representation Sound shaping
Structure of Speech Physical acoustics Time-domain representation Frequency domain representation Sound shaping Speech acoustics Source-Filter Theory Speech Source characteristics Speech Filter characteristics
More informationMeasuring the complexity of sound
PRAMANA c Indian Academy of Sciences Vol. 77, No. 5 journal of November 2011 physics pp. 811 816 Measuring the complexity of sound NANDINI CHATTERJEE SINGH National Brain Research Centre, NH-8, Nainwal
More informationSpeech/Music Discrimination via Energy Density Analysis
Speech/Music Discrimination via Energy Density Analysis Stanis law Kacprzak and Mariusz Zió lko Department of Electronics, AGH University of Science and Technology al. Mickiewicza 30, Kraków, Poland {skacprza,
More informationCOMP 546, Winter 2017 lecture 20 - sound 2
Today we will examine two types of sounds that are of great interest: music and speech. We will see how a frequency domain analysis is fundamental to both. Musical sounds Let s begin by briefly considering
More informationReading: Johnson Ch , Ch.5.5 (today); Liljencrants & Lindblom; Stevens (Tues) reminder: no class on Thursday.
L105/205 Phonetics Scarborough Handout 7 10/18/05 Reading: Johnson Ch.2.3.3-2.3.6, Ch.5.5 (today); Liljencrants & Lindblom; Stevens (Tues) reminder: no class on Thursday Spectral Analysis 1. There are
More informationAudio Similarity. Mark Zadel MUMT 611 March 8, Audio Similarity p.1/23
Audio Similarity Mark Zadel MUMT 611 March 8, 2004 Audio Similarity p.1/23 Overview MFCCs Foote Content-Based Retrieval of Music and Audio (1997) Logan, Salomon A Music Similarity Function Based On Signal
More informationEC 6501 DIGITAL COMMUNICATION UNIT - II PART A
EC 6501 DIGITAL COMMUNICATION 1.What is the need of prediction filtering? UNIT - II PART A [N/D-16] Prediction filtering is used mostly in audio signal processing and speech processing for representing
More informationAN ANALYSIS OF SPEECH RECOGNITION PERFORMANCE BASED UPON NETWORK LAYERS AND TRANSFER FUNCTIONS
AN ANALYSIS OF SPEECH RECOGNITION PERFORMANCE BASED UPON NETWORK LAYERS AND TRANSFER FUNCTIONS Kuldeep Kumar 1, R. K. Aggarwal 1 and Ankita Jain 2 1 Department of Computer Engineering, National Institute
More informationAn analysis of blind signal separation for real time application
University of Wollongong Research Online University of Wollongong Thesis Collection 1954-2016 University of Wollongong Thesis Collections 2006 An analysis of blind signal separation for real time application
More informationDERIVATION OF TRAPS IN AUDITORY DOMAIN
DERIVATION OF TRAPS IN AUDITORY DOMAIN Petr Motlíček, Doctoral Degree Programme (4) Dept. of Computer Graphics and Multimedia, FIT, BUT E-mail: motlicek@fit.vutbr.cz Supervised by: Dr. Jan Černocký, Prof.
More informationChapter 5. Signal Analysis. 5.1 Denoising fiber optic sensor signal
Chapter 5 Signal Analysis 5.1 Denoising fiber optic sensor signal We first perform wavelet-based denoising on fiber optic sensor signals. Examine the fiber optic signal data (see Appendix B). Across all
More informationComplex Sounds. Reading: Yost Ch. 4
Complex Sounds Reading: Yost Ch. 4 Natural Sounds Most sounds in our everyday lives are not simple sinusoidal sounds, but are complex sounds, consisting of a sum of many sinusoids. The amplitude and frequency
More informationAn Improved Voice Activity Detection Based on Deep Belief Networks
e-issn 2455 1392 Volume 2 Issue 4, April 2016 pp. 676-683 Scientific Journal Impact Factor : 3.468 http://www.ijcter.com An Improved Voice Activity Detection Based on Deep Belief Networks Shabeeba T. K.
More informationPerformance Analysis of MFCC and LPCC Techniques in Automatic Speech Recognition
www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume - 3 Issue - 8 August, 2014 Page No. 7727-7732 Performance Analysis of MFCC and LPCC Techniques in Automatic
More informationLinguistic Phonetics. Spectral Analysis
24.963 Linguistic Phonetics Spectral Analysis 4 4 Frequency (Hz) 1 Reading for next week: Liljencrants & Lindblom 1972. Assignment: Lip-rounding assignment, due 1/15. 2 Spectral analysis techniques There
More informationRoberto Togneri (Signal Processing and Recognition Lab)
Signal Processing and Machine Learning for Power Quality Disturbance Detection and Classification Roberto Togneri (Signal Processing and Recognition Lab) Power Quality (PQ) disturbances are broadly classified
More informationHIGH QUALITY AUDIO CODING AT LOW BIT RATE USING WAVELET AND WAVELET PACKET TRANSFORM
HIGH QUALITY AUDIO CODING AT LOW BIT RATE USING WAVELET AND WAVELET PACKET TRANSFORM DR. D.C. DHUBKARYA AND SONAM DUBEY 2 Email at: sonamdubey2000@gmail.com, Electronic and communication department Bundelkhand
More informationClassification of ships using autocorrelation technique for feature extraction of the underwater acoustic noise
Classification of ships using autocorrelation technique for feature extraction of the underwater acoustic noise Noha KORANY 1 Alexandria University, Egypt ABSTRACT The paper applies spectral analysis to
More informationSpeech Synthesis; Pitch Detection and Vocoders
Speech Synthesis; Pitch Detection and Vocoders Tai-Shih Chi ( 冀泰石 ) Department of Communication Engineering National Chiao Tung University May. 29, 2008 Speech Synthesis Basic components of the text-to-speech
More information8.3 Basic Parameters for Audio
8.3 Basic Parameters for Audio Analysis Physical audio signal: simple one-dimensional amplitude = loudness frequency = pitch Psycho-acoustic features: complex A real-life tone arises from a complex superposition
More informationOriginal Research Articles
Original Research Articles Researchers A.K.M Fazlul Haque Department of Electronics and Telecommunication Engineering Daffodil International University Emailakmfhaque@daffodilvarsity.edu.bd FFT and Wavelet-Based
More informationImage Denoising Using Complex Framelets
Image Denoising Using Complex Framelets 1 N. Gayathri, 2 A. Hazarathaiah. 1 PG Student, Dept. of ECE, S V Engineering College for Women, AP, India. 2 Professor & Head, Dept. of ECE, S V Engineering College
More informationNoise Attenuation in Seismic Data Iterative Wavelet Packets vs Traditional Methods Lionel J. Woog, Igor Popovic, Anthony Vassiliou, GeoEnergy, Inc.
Noise Attenuation in Seismic Data Iterative Wavelet Packets vs Traditional Methods Lionel J. Woog, Igor Popovic, Anthony Vassiliou, GeoEnergy, Inc. Summary In this document we expose the ideas and technologies
More informationOBJECTIVE OF THE BOOK ORGANIZATION OF THE BOOK
xv Preface Advancement in technology leads to wide spread use of mounting cameras to capture video imagery. Such surveillance cameras are predominant in commercial institutions through recording the cameras
More informationPreeti Rao 2 nd CompMusicWorkshop, Istanbul 2012
Preeti Rao 2 nd CompMusicWorkshop, Istanbul 2012 o Music signal characteristics o Perceptual attributes and acoustic properties o Signal representations for pitch detection o STFT o Sinusoidal model o
More informationINTRODUCTION TO ACOUSTIC PHONETICS 2 Hilary Term, week 6 22 February 2006
1. Resonators and Filters INTRODUCTION TO ACOUSTIC PHONETICS 2 Hilary Term, week 6 22 February 2006 Different vibrating objects are tuned to specific frequencies; these frequencies at which a particular
More informationClassification of Road Images for Lane Detection
Classification of Road Images for Lane Detection Mingyu Kim minkyu89@stanford.edu Insun Jang insunj@stanford.edu Eunmo Yang eyang89@stanford.edu 1. Introduction In the research on autonomous car, it is
More informationDetection and Classification of Nonstationary Transient Signals Using Sparse Approximations and Bayesian Networks
1750 IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 22, NO. 12, DECEMBER 2014 Detection and Classification of Nonstationary Transient Signals Using Sparse Approximations and Bayesian
More informationProject 0: Part 2 A second hands-on lab on Speech Processing Frequency-domain processing
Project : Part 2 A second hands-on lab on Speech Processing Frequency-domain processing February 24, 217 During this lab, you will have a first contact on frequency domain analysis of speech signals. You
More informationBiomedical Signals. Signals and Images in Medicine Dr Nabeel Anwar
Biomedical Signals Signals and Images in Medicine Dr Nabeel Anwar Noise Removal: Time Domain Techniques 1. Synchronized Averaging (covered in lecture 1) 2. Moving Average Filters (today s topic) 3. Derivative
More informationAudio Imputation Using the Non-negative Hidden Markov Model
Audio Imputation Using the Non-negative Hidden Markov Model Jinyu Han 1,, Gautham J. Mysore 2, and Bryan Pardo 1 1 EECS Department, Northwestern University 2 Advanced Technology Labs, Adobe Systems Inc.
More informationSONG RETRIEVAL SYSTEM USING HIDDEN MARKOV MODELS
SONG RETRIEVAL SYSTEM USING HIDDEN MARKOV MODELS AKSHAY CHANDRASHEKARAN ANOOP RAMAKRISHNA akshayc@cmu.edu anoopr@andrew.cmu.edu ABHISHEK JAIN GE YANG ajain2@andrew.cmu.edu younger@cmu.edu NIDHI KOHLI R
More informationANALYSIS OF ACOUSTIC FEATURES FOR AUTOMATED MULTI-TRACK MIXING
th International Society for Music Information Retrieval Conference (ISMIR ) ANALYSIS OF ACOUSTIC FEATURES FOR AUTOMATED MULTI-TRACK MIXING Jeffrey Scott, Youngmoo E. Kim Music and Entertainment Technology
More informationEnhanced Waveform Interpolative Coding at 4 kbps
Enhanced Waveform Interpolative Coding at 4 kbps Oded Gottesman, and Allen Gersho Signal Compression Lab. University of California, Santa Barbara E-mail: [oded, gersho]@scl.ece.ucsb.edu Signal Compression
More informationCOMPARITIVE STUDY OF IMAGE DENOISING ALGORITHMS IN MEDICAL AND SATELLITE IMAGES
COMPARITIVE STUDY OF IMAGE DENOISING ALGORITHMS IN MEDICAL AND SATELLITE IMAGES Jyotsana Rastogi, Diksha Mittal, Deepanshu Singh ---------------------------------------------------------------------------------------------------------------------------------
More informationEE482: Digital Signal Processing Applications
Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu EE482: Digital Signal Processing Applications Spring 2014 TTh 14:30-15:45 CBC C222 Lecture 12 Speech Signal Processing 14/03/25 http://www.ee.unlv.edu/~b1morris/ee482/
More informationCLASSIFICATION OF MULTIPLE SIGNALS USING 2D MATCHING OF MAGNITUDE-FREQUENCY DENSITY FEATURES
Proceedings of the SDR 11 Technical Conference and Product Exposition, Copyright 2011 Wireless Innovation Forum All Rights Reserved CLASSIFICATION OF MULTIPLE SIGNALS USING 2D MATCHING OF MAGNITUDE-FREQUENCY
More informationA Study on Single Camera Based ANPR System for Improvement of Vehicle Number Plate Recognition on Multi-lane Roads
Invention Journal of Research Technology in Engineering & Management (IJRTEM) ISSN: 2455-3689 www.ijrtem.com Volume 2 Issue 1 ǁ January. 2018 ǁ PP 11-16 A Study on Single Camera Based ANPR System for Improvement
More informationComposite Fractional Power Wavelets Jason M. Kinser
Composite Fractional Power Wavelets Jason M. Kinser Inst. for Biosciences, Bioinformatics, & Biotechnology George Mason University jkinser@ib3.gmu.edu ABSTRACT Wavelets have a tremendous ability to extract
More informationNoise estimation and power spectrum analysis using different window techniques
IOSR Journal of Electrical and Electronics Engineering (IOSR-JEEE) e-issn: 78-1676,p-ISSN: 30-3331, Volume 11, Issue 3 Ver. II (May. Jun. 016), PP 33-39 www.iosrjournals.org Noise estimation and power
More informationUltra-Wideband Compressed Sensing: Channel Estimation Jose L. Paredes, Member, IEEE, Gonzalo R. Arce, Fellow, IEEE, and Zhongmin Wang
IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, VOL. 1, NO. 3, OCTOBER 2007 383 Ultra-Wideband Compressed Sensing: Channel Estimation Jose L. Paredes, Member, IEEE, Gonzalo R. Arce, Fellow, IEEE,
More informationThe Jigsaw Continuous Sensing Engine for Mobile Phone Applications!
The Jigsaw Continuous Sensing Engine for Mobile Phone Applications! Hong Lu, Jun Yang, Zhigang Liu, Nicholas D. Lane, Tanzeem Choudhury, Andrew T. Campbell" CS Department Dartmouth College Nokia Research
More informationAD-A 'L-SPv1-17
APPLIED RESEARCH LABORATORIES.,THE UNIVERSITY OF TEXAS AT AUSTIN P. 0. Box 8029 Aujn. '"X.zs,37 l.3-s029( 512),35-i2oT- FA l. 512) i 5-259 AD-A239 335'L-SPv1-17 &g. FLECTE Office of Naval Research AUG
More informationVoice Activity Detection
Voice Activity Detection Speech Processing Tom Bäckström Aalto University October 2015 Introduction Voice activity detection (VAD) (or speech activity detection, or speech detection) refers to a class
More informationAudio processing methods on marine mammal vocalizations
Audio processing methods on marine mammal vocalizations Xanadu Halkias Laboratory for the Recognition and Organization of Speech and Audio http://labrosa.ee.columbia.edu Sound to Signal sound is pressure
More informationVQ Source Models: Perceptual & Phase Issues
VQ Source Models: Perceptual & Phase Issues Dan Ellis & Ron Weiss Laboratory for Recognition and Organization of Speech and Audio Dept. Electrical Eng., Columbia Univ., NY USA {dpwe,ronw}@ee.columbia.edu
More informationImage analysis. CS/CME/BioE/Biophys/BMI 279 Oct. 31 and Nov. 2, 2017 Ron Dror
Image analysis CS/CME/BioE/Biophys/BMI 279 Oct. 31 and Nov. 2, 2017 Ron Dror 1 Outline Images in molecular and cellular biology Reducing image noise Mean and Gaussian filters Frequency domain interpretation
More informationOrthonormal bases and tilings of the time-frequency plane for music processing Juan M. Vuletich *
Orthonormal bases and tilings of the time-frequency plane for music processing Juan M. Vuletich * Dept. of Computer Science, University of Buenos Aires, Argentina ABSTRACT Conventional techniques for signal
More informationDeep Learning for Human Activity Recognition: A Resource Efficient Implementation on Low-Power Devices
Deep Learning for Human Activity Recognition: A Resource Efficient Implementation on Low-Power Devices Daniele Ravì, Charence Wong, Benny Lo and Guang-Zhong Yang To appear in the proceedings of the IEEE
More informationInternational Journal of Digital Application & Contemporary research Website: (Volume 1, Issue 7, February 2013)
Performance Analysis of OFDM under DWT, DCT based Image Processing Anshul Soni soni.anshulec14@gmail.com Ashok Chandra Tiwari Abstract In this paper, the performance of conventional discrete cosine transform
More informationChapter 4 SPEECH ENHANCEMENT
44 Chapter 4 SPEECH ENHANCEMENT 4.1 INTRODUCTION: Enhancement is defined as improvement in the value or Quality of something. Speech enhancement is defined as the improvement in intelligibility and/or
More informationMikko Myllymäki and Tuomas Virtanen
NON-STATIONARY NOISE MODEL COMPENSATION IN VOICE ACTIVITY DETECTION Mikko Myllymäki and Tuomas Virtanen Department of Signal Processing, Tampere University of Technology Korkeakoulunkatu 1, 3370, Tampere,
More informationCommunications Theory and Engineering
Communications Theory and Engineering Master's Degree in Electronic Engineering Sapienza University of Rome A.A. 2018-2019 Speech and telephone speech Based on a voice production model Parametric representation
More informationTarget detection in side-scan sonar images: expert fusion reduces false alarms
Target detection in side-scan sonar images: expert fusion reduces false alarms Nicola Neretti, Nathan Intrator and Quyen Huynh Abstract We integrate several key components of a pattern recognition system
More informationDesign and Implementation of an Audio Classification System Based on SVM
Available online at www.sciencedirect.com Procedia ngineering 15 (011) 4031 4035 Advanced in Control ngineering and Information Science Design and Implementation of an Audio Classification System Based
More information