Minimal-Impact Audio-Based Personal Archives
Transcription
1 Minimal-Impact Audio-Based Personal Archives
Dan Ellis and Keansub Lee
Laboratory for Recognition and Organization of Speech and Audio
Dept. Electrical Eng., Columbia Univ., NY USA
Outline:
1. Personal Audio Archives
2. Features
3. Segmentation
4. Clustering
5. Privacy
6. Future Work
2 1. Personal Audio
- Easy to record everything you hear: <2GB / 64 kbps
- Very hard to find anything: how to scan? how to visualize? how to index?
- Need automatic analysis
- Need minimal impact
3 Applications
- Automatic appointment-book history: fills in when & where of movements
- Life statistics: how long did I spend in meetings this week vs. last? most frequent conversations; favorite phrases??
- Retrieving details: what exactly did I promise? privacy issues...
- Nostalgia?
4 Data Set
- Starting point: collect data
- 62 hours recorded (8 days, ~7.5 hr/day)
- Hand-mark 139 segments, 16 classes
- [Table: label, total mins, total segs; rows include Library, Campus, Restaurant, Bowling, Lecture, Car/Taxi, Street; values not preserved]
- Minimal impact?
5 2. Features
- Long-duration recordings may benefit from longer basic time-frames: 60 s rather than 10 ms?
- Perceptually-motivated features: broad spectrum + some detail?
- For diary application, background more important than foreground? Smooth out uncharacteristic transients
6 Feature sets
- [Figure: six panels, freq / bark vs. time / min: Average Linear Energy, Normalized Energy Deviation, Average Log Energy (dB), Log Energy Deviation (dB), Average Spectral Entropy (bits), Spectral Entropy Deviation (bits)]
- Capture both average and variation
- Capture a little more detail in subbands...
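The "average and variation" idea can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the input is a matrix of short-time per-band linear energies (for example Bark-band energies), and the function name `long_window_stats` and the parameter `frames_per_window` are hypothetical. It summarizes each long window by the mean and deviation of both linear and log energy.

```python
import numpy as np

def long_window_stats(band_energy, frames_per_window, eps=1e-12):
    """Summarize short-time band energies over long windows.

    band_energy: array of shape (n_frames, n_bands), linear energy per band.
    frames_per_window: number of short frames per long window (assumed;
        e.g. the number of short frames that fit in 60 s).
    Returns four arrays of shape (n_windows, n_bands):
        mean linear energy, normalized deviation (std / mean),
        mean log energy in dB, and deviation of log energy in dB.
    """
    n_frames, n_bands = band_energy.shape
    n_win = n_frames // frames_per_window
    # Drop the trailing partial window, then group frames by window.
    trimmed = band_energy[: n_win * frames_per_window]
    blocks = trimmed.reshape(n_win, frames_per_window, n_bands)

    mean_lin = blocks.mean(axis=1)
    norm_dev = blocks.std(axis=1) / (mean_lin + eps)

    log_db = 10.0 * np.log10(blocks + eps)
    mean_log = log_db.mean(axis=1)
    dev_log = log_db.std(axis=1)
    return mean_lin, norm_dev, mean_log, dev_log
```

Averaging in the log domain gives less weight to brief loud events than averaging linear energy, which matches the slide's interest in background over foreground.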
7 Spectral Entropy
- Auditory spectrum:
  $$ A[n, j] = \sum_{k=0}^{N_F} w_{jk} X[n, k] $$
- Spectral entropy ('peakiness' of each band):
  $$ H[n, j] = -\sum_{k=0}^{N_F} \frac{w_{jk} X[n, k]}{A[n, j]} \log \frac{w_{jk} X[n, k]}{A[n, j]} $$
- [Figure: FFT spectral magnitude and Auditory Spectrum (energy / dB), and per-band Spectral Entropies (rel. entropy / bits), vs. freq / Hz]
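A minimal sketch of the per-band quantities on this slide, assuming `X` holds FFT magnitudes per frame and `W` holds non-negative band weights w_jk (both names, and the small `eps` guard, are illustrative assumptions). The entropy is computed in bits, matching the axis label on the slide.

```python
import numpy as np

def band_spectral_entropy(X, W, eps=1e-12):
    """Auditory spectrum A[n, j] and per-band spectral entropy H[n, j].

    X: (n_frames, n_bins) spectral magnitudes X[n, k].
    W: (n_bands, n_bins) non-negative band weights w_jk.
    Returns (A, H), each of shape (n_frames, n_bands); H is in bits.
    """
    # Weighted contribution of bin k to band j in frame n.
    contrib = W[None, :, :] * X[:, None, :]
    # A[n, j] = sum_k w_jk X[n, k]
    A = contrib.sum(axis=2)
    # Normalize contributions within each band so they sum to one.
    p = contrib / (A[:, :, None] + eps)
    # H[n, j] = -sum_k p log2 p; eps keeps log2 finite where p is zero.
    H = -(p * np.log2(p + eps)).sum(axis=2)
    return A, H
```

A band whose energy is spread evenly over 8 bins scores 3 bits, while a band dominated by one bin scores close to 0, which is the sense in which the measure captures peakiness.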
8 3. BIC segmentation
- BIC (Bayesian Information Criterion): compare more and less complex models
  $$ \log \frac{L(X_1; M_1)\, L(X_2; M_2)}{L(X; M_0)} \;\gtrless\; \frac{\lambda}{2} \log(N)\, \Delta\#(M) $$
- For segmentation:
  - Grow context window from current boundary
  - For each window, test every possible segmentation
  - When BIC is positive, mark new segment
- [Figure: timeline from 0 to N showing last segmentation point, candidate boundary, and current context limit; L(X_1; M_1) and L(X_2; M_2) cover the two sides of the candidate boundary, L(X; M_0) the whole window]
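The test for one context window can be sketched as below. The slide does not say which model family is used, so this sketch assumes a single full-covariance Gaussian per segment; the function names, the `margin` parameter, and the small `ridge` term are all illustrative assumptions. Under that assumption the constant terms of the Gaussian log-likelihoods cancel and only the log-determinants remain.

```python
import numpy as np

def _logdet_cov(X, ridge=1e-6):
    """Log-determinant of the ML covariance of the rows of X."""
    d = X.shape[1]
    cov = np.cov(X, rowvar=False, bias=True).reshape(d, d)
    _, logdet = np.linalg.slogdet(cov + ridge * np.eye(d))
    return logdet

def bic_score(X, t, lam=1.0):
    """BIC score for splitting window X (n_frames, n_dims) at frame t.

    Positive values favour two separate models over a single one.
    """
    N, d = X.shape
    X1, X2 = X[:t], X[t:]
    # Log-likelihood ratio of the two-model fit against the one-model fit.
    llr = 0.5 * (N * _logdet_cov(X)
                 - len(X1) * _logdet_cov(X1)
                 - len(X2) * _logdet_cov(X2))
    # Extra parameters of the second Gaussian: mean plus covariance.
    extra_params = d + d * (d + 1) / 2
    return llr - 0.5 * lam * np.log(N) * extra_params

def best_boundary(X, margin=10, lam=1.0):
    """Test every candidate split in the window; return (index, score).

    The index is None when no candidate has a positive BIC score.
    """
    candidates = list(range(margin, X.shape[0] - margin))
    if not candidates:
        return None, float("-inf")
    scores = [bic_score(X, t, lam) for t in candidates]
    i = int(np.argmax(scores))
    if scores[i] > 0:
        return candidates[i], scores[i]
    return None, scores[i]
```

In a full segmenter, a `None` result would trigger growing the window and retesting, and a found boundary would become the start of the next window, as in the list on the slide.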
9 BIC Segmentation Example
- [Figure: AvgLEnergy and AvgLogAudSpec features with BIC score vs. time / hr (13:30 to 16:00), annotated with: last seg point, current window limit, "no boundary found with shorter window", "boundary passes BIC"]
- No training or stored models
10 Segmentation Results
- Evaluate: 60 hr, hand-marked boundaries; different features & combinations
- Correct Accept at False Accept = 2%:

  Feature               Correct Accept
  µdB                   80.8%
  µH                    81.1%
  σH/µH                 81.6%
  µdB + σH/µH           84.0%
  µdB + σH/µH + µH      83.6%
  avg. mfcc             73.6%

- [Figure: Sensitivity vs. Specificity curves for µdB, µH, σH/µH, µdB + σH/µH, and µdB + µH + σH/µH]
11 4. Segment clustering
- Daily activity has lots of repetition: automatically cluster similar segments
- Affinity of segments as KL2 distances
- [Figure: segment affinity matrix, axes labeled by class: supermkt, meeting, karaoke, barber, lecture2, billiard, break, lecture1, car/taxi, home, bowling, street, restaurant, library, campus]
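A sketch of a KL2 (symmetrized Kullback-Leibler) distance between segments follows. The slide names the distance but not the segment model, so this assumes each segment is summarized by a single Gaussian over its feature frames; the exponential mapping from distance to affinity, the `sigma` scale, and the function names are also assumptions made for illustration.

```python
import numpy as np

def kl2_gaussian(mu_p, cov_p, mu_q, cov_q):
    """Symmetrized KL divergence KL(p||q) + KL(q||p) between two Gaussians.

    The log-determinant terms of the two directions cancel in the sum.
    """
    d = mu_p.shape[0]
    inv_p = np.linalg.inv(cov_p)
    inv_q = np.linalg.inv(cov_q)
    diff = mu_p - mu_q
    trace_term = np.trace(inv_q @ cov_p) + np.trace(inv_p @ cov_q) - 2 * d
    mean_term = diff @ (inv_p + inv_q) @ diff
    return 0.5 * (trace_term + mean_term)

def affinity_matrix(segments, sigma=1.0, ridge=1e-6):
    """Pairwise KL2 distances D and affinities exp(-D / (2 sigma^2)).

    segments: list of arrays, each (n_frames_i, n_dims).
    """
    stats = []
    for X in segments:
        d = X.shape[1]
        cov = np.cov(X, rowvar=False).reshape(d, d) + ridge * np.eye(d)
        stats.append((X.mean(axis=0), cov))
    n = len(stats)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = kl2_gaussian(*stats[i], *stats[j])
    return np.exp(-D / (2.0 * sigma ** 2)), D
```

Segments recorded in the same kind of place should end up with small distances and affinities near 1, giving the block structure visible in the affinity-matrix figure.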
12 Spectral Clustering
- Eigenanalysis of affinity matrix:
  $$ A = U S V' $$
- [Figure: Affinity Matrix and its SVD components $u_k s_{kk} v_k'$ for k = 1 to 4]
- Eigenvectors v_k give cluster memberships
- Number of clusters?
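The idea that the leading eigenvectors indicate cluster membership can be illustrated with a small sketch. The assignment rule used here (give each segment to the component where its loading is largest in magnitude) is a deliberate simplification chosen for brevity, not necessarily what the authors did; it works when the affinity matrix is close to block-diagonal with distinct leading singular values, and practical systems often run k-means on the eigenvector rows instead. The function name `spectral_cluster` is hypothetical.

```python
import numpy as np

def spectral_cluster(A, n_clusters):
    """Assign each row of a symmetric affinity matrix to a cluster.

    A: (n, n) symmetric affinity matrix.
    Returns (labels, singular_values); labels[i] is the index of the
    leading SVD component with the largest-magnitude loading for item i.
    """
    # A = U S V'; rows of Vt are the right singular vectors v_k'.
    U, s, Vt = np.linalg.svd(A)
    V = Vt[:n_clusters].T  # shape (n, n_clusters)
    labels = np.argmax(np.abs(V), axis=1)
    return labels, s
```

The decay of the returned singular values gives one informal view of the "number of clusters?" question: a sharp drop after the first few values suggests how many components carry block structure.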
13 Clustering Results
- Clustering of automatic segments gives 'anonymous classes'
- BIC criterion to choose number of clusters
- Make best correspondence to 16 GT (ground-truth) clusters
- Frame-level scoring gives ~70% correct
- Errors when same place has multiple ambiences
14 5. Privacy
- Recording conversations conflicts with expectations of privacy: critical barrier to progress
- Technical solutions to improve acceptance?
- Speaker/speech "search and destroy": scramble 100 ms segs of speech (preserving longer-term statistics)
- High-confidence speaker ID to bypass
15 Speech Scrambling
- Permute short segments of speech within 1 s blocks
  - removes intelligibility
  - preserves local structure
  - segment features almost unchanged
- [Figure: spectrograms (freq / kHz vs. time / s, level / dB): Original (dan+kean-ex.wav) and Scrambled (segments permuted over 1 s)]
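A minimal sketch of block-wise scrambling, under stated assumptions: the 100 ms segment length is taken from the Privacy slide, the 1 s block from this one, and the function name, the `seed` parameter, and the choice to leave a trailing partial block untouched are illustrative. A real system would presumably apply this only to regions detected as speech and might cross-fade segment edges, neither of which is shown here.

```python
import numpy as np

def scramble(signal, sr, seg_ms=100, block_s=1.0, seed=0):
    """Randomly permute short segments inside each consecutive block.

    signal: 1-D NumPy array of samples.
    sr: sample rate in Hz.
    seg_ms: segment length in milliseconds (assumed 100 ms).
    block_s: block length in seconds (1 s on the slide).
    Samples are only reordered, never altered, so long-term energy
    statistics of each block are preserved.
    """
    rng = np.random.default_rng(seed)
    seg = int(sr * seg_ms / 1000)
    segs_per_block = int(sr * block_s) // seg
    block = segs_per_block * seg  # whole number of segments per block
    out = signal.copy()
    # Process complete blocks only; any trailing partial block is kept.
    for start in range(0, len(signal) - block + 1, block):
        chunk = signal[start:start + block].reshape(segs_per_block, seg)
        order = rng.permutation(segs_per_block)
        out[start:start + block] = chunk[order].ravel()
    return out
```

Because each block keeps exactly the same samples, features averaged over a second or more change very little, which is consistent with the slide's claim that segment features are almost unchanged.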
16 6. Future Work
- Visualization / browsing / diary inference
- Link in other information sources (diary)
- NoteTaker interface
- What is it good for?
17 Conclusions
- Personal Audio is easy & cheap to collect, but is it any use?
- Boundaries quite easy to spot, e.g. moving to a new location
- Repeated activities can cluster together, so user's labels can propagate
- Still gaining experience with the data: speech, speaker ID, privacy, ...