Students: Avihay Barazany Royi Levy Supervisor: Kuti Avargel In Association with: Zoran, Haifa

Students: Avihay Barazany Royi Levy Supervisor: Kuti Avargel In Association with: Zoran, Haifa Spring 2008

Outline: Introduction, Problem Formulation, Possible Solutions, Proposed Algorithm, Experimental Results, Conclusions

Digital still cameras are widely used for video and audio recordings. When the zoom lens motor is activated during these recordings, the noise it generates may be picked up by the camera's microphone. This noise can be extremely annoying and can significantly degrade the perceived quality and intelligibility of the desired signal.

Introduction cont.

Let $x(n)$, $d^s(n)$, $d^t(n)$ denote the speech signal, the background stationary noise, and the zoom motor (nonstationary) noise, respectively. Let $y(n) = x(n) + d^s(n) + d^t(n)$ be the microphone signal. Main goal: to derive an estimator $\hat{x}(n)$ for the clean speech signal.

[Diagram: $x$, $d^s$ and $d^t$ reach the camera microphone as $y(n)$, from which $\hat{x}(n)$ is estimated.]

To avoid this problem, many digital-camera manufacturers disable the lens motor during audio recordings.
Adaptive solution: add a reference microphone and implement an adaptive algorithm that cancels the motor noise in real time.
Spectral enhancement: use spectral enhancement techniques to estimate the motor noise spectrum and enhance the speech signal.

Spectral Enhancement Techniques
The spectral enhancement approach operates in the time-frequency domain. Let the observed signal be $y(n) = x(n) + d(n)$. The goal is to estimate the spectral coefficients of the speech signal. Let $X_{lk}$ be the short-time Fourier transform (STFT) of $x(n)$, i.e.,
$$X_{lk} = \sum_{m} w(lL - m)\, x(m)\, e^{-j \frac{2\pi}{N} k m}$$
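As a rough illustration of the STFT used above, the following sketch frames a signal, applies a window and takes a DFT per frame. The window type (Hann), the frame length `n_win` and the step `hop` are assumptions chosen for the example, not values from the slides. The DFT is taken over the frame-local time index, so each frame differs from the formula above by a phase factor only.

```python
import cmath
import math


def hann(n_win):
    """Periodic Hann analysis window of length n_win (an assumed window choice)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n_win) for i in range(n_win)]


def stft(x, n_win=8, hop=4):
    """Return frames[l][k]: the DFT of the l-th windowed segment of x."""
    w = hann(n_win)
    frames = []
    for start in range(0, len(x) - n_win + 1, hop):
        # Window the current segment.
        seg = [w[i] * x[start + i] for i in range(n_win)]
        # Direct DFT of the segment (O(N^2); an FFT would be used in practice).
        spectrum = [
            sum(seg[i] * cmath.exp(-2j * math.pi * k * i / n_win) for i in range(n_win))
            for k in range(n_win)
        ]
        frames.append(spectrum)
    return frames


# Usage: a sinusoid centred on bin 2 of an 8-point frame.
signal = [math.cos(2 * math.pi * 2 * n / 8) for n in range(16)]
coeffs = stft(signal)
```

Plain Python is used here for self-containment; a real implementation would normally rely on an FFT routine.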

Spectral Enhancement Techniques cont.
The desired estimate of $X_{lk}$ is $\hat{X}_{lk} = G_{lk} Y_{lk}$, where the gain function $G_{lk}$ is obtained by minimizing a cost function:
$$G_{lk} = \arg\min_{G} E\{ d(X_{lk}, \hat{X}_{lk}) \}$$
There are different ways to measure the distortion. The commonly used distortion functions are
$$d(X, \hat{X}) = |X - \hat{X}|^2 \quad \text{or} \quad d(X, \hat{X}) = \left(\log|X| - \log|\hat{X}|\right)^2$$
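The two distortion measures can be written down directly. This is a minimal sketch with hypothetical function names, intended only to show how the squared-error and log-spectral measures differ, for example in their sensitivity to phase.

```python
import math


def squared_error(x, x_hat):
    """Squared-error distortion |X - X_hat|^2 between two spectral coefficients."""
    return abs(x - x_hat) ** 2


def log_spectral_error(x, x_hat):
    """Log-spectral distortion (log|X| - log|X_hat|)^2; ignores phase."""
    return (math.log(abs(x)) - math.log(abs(x_hat))) ** 2


def apply_gain(gain, y):
    """Spectral estimate X_hat = G * Y for a single time-frequency bin."""
    return gain * y


# Usage: the same magnitude with a different phase is penalised by the
# squared error but not by the log-spectral measure.
clean = 1j
estimate = apply_gain(0.5, 2.0 + 0j)   # equals 1.0
se = squared_error(clean, estimate)
lse = log_spectral_error(clean, estimate)
```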

Spectral Enhancement Techniques cont.
The disadvantage of the above-mentioned algorithms is their difficulty in handling highly nonstationary noise.
[Figure: panels "Input Signal" and "OMLSA Only".]

The algorithm is based on the paper: A. Abramson and I. Cohen, "Enhancement of Speech Signals Under Multiple Hypotheses Using an Indicator for Transient Noise Presence," 2007.
Since the problem involves two different types of noise, the observed signal is defined as $y(n) = x(n) + d^s(n) + d^t(n)$, and $X_{lk}$, $Y_{lk}$, $D^s_{lk}$, $D^t_{lk}$ are the STFTs of $x(n)$, $y(n)$, $d^s(n)$, $d^t(n)$, respectively.

Since the motor noise is not always present, we define the following four hypotheses:
$H_{1s}: Y_{lk} = X_{lk} + D^s_{lk}$
$H_{1t}: Y_{lk} = X_{lk} + D^s_{lk} + D^t_{lk}$
$H_{0s}: Y_{lk} = D^s_{lk}$
$H_{0t}: Y_{lk} = D^s_{lk} + D^t_{lk}$
$H_1$: speech is more dominant than noise.
$H_0$: noise is more dominant than speech.

Let $\eta_j$, $j \in \{0, 1\}$, denote the detector decision in the time-frequency bin $(l, k)$:
$\eta_0$: the transient is a noise component.
$\eta_1$: the transient is a speech component.
Let $C_{10}$, $C_{01}$ denote the costs of false alarm and missed detection, respectively. The algorithm assumes an indicator signal for the motor noise in the time frame $l$.

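Let $A = |X|$, $R = |Y|$. The criterion for the estimation of the speech signal under the decision $\eta_j$ is
$$\hat{A}_j = \arg\min_{\hat{A}} \left\{ C_{1j}\, p(H_{1s} \cup H_{1t} \mid Y)\, E\left[ d(X, \hat{A}) \mid Y, H_{1s} \cup H_{1t} \right] + C_{0j}\, p(H_{0s} \cup H_{0t} \mid Y)\, d(G_{\min} R, \hat{A}) \right\}$$
where $d(x, y) = (\log x - \log y)^2$.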

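Based on the above definitions, the gain function is defined by $\hat{A} = G_j(\xi, \gamma)\,|Y|$, where
$$G_j(\xi, \gamma) = G_{LSA}(\xi, \gamma)^{a}\, G_{\min}^{1-a}$$
$$\gamma = \frac{|Y|^2}{\lambda_s + \lambda_t} \ \text{(a-posteriori SNR)}, \qquad \xi = \frac{\lambda_x}{\lambda_s + \lambda_t} \ \text{(a-priori SNR)}$$
When no motor noise exists (indicator = 0), we use the conventional OMLSA: $a = P(H_1)$.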
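The conventional OMLSA gain mentioned above is commonly written as a geometric weighting of the LSA gain and the lower bound by the speech presence probability, $G = G_{LSA}^{p}\, G_{\min}^{1-p}$. The sketch below assumes that form; the function name and the example values are hypothetical, and $G_{LSA}$ is taken as an input rather than computed.

```python
def omlsa_gain(g_lsa, g_min, p_speech):
    """Conventional OMLSA gain: G_LSA**p * G_min**(1 - p).

    g_lsa    -- LSA gain for this bin (assumed to be computed elsewhere)
    g_min    -- lower bound on the gain applied when speech is absent
    p_speech -- speech presence probability P(H1), in [0, 1]
    """
    if not 0.0 <= p_speech <= 1.0:
        raise ValueError("p_speech must lie in [0, 1]")
    return (g_lsa ** p_speech) * (g_min ** (1.0 - p_speech))


# Usage: the gain slides from g_min (speech surely absent)
# to g_lsa (speech surely present).
gains = [omlsa_gain(0.8, 0.1, p) for p in (0.0, 0.5, 1.0)]
```

Because the weighting is geometric, it is a linear interpolation on a dB scale, which keeps the attenuation bounded below by `g_min`.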

[Block diagram: $Y_{lk}$ enters the gain function computation ($G_j$), giving $\hat{X}_{lk}$, which passes through the ISTFT to give $\hat{x}(n)$.]

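[Block diagram. Input: $Y$. Blocks: MCRA (output $\hat{\lambda}_{d_s}$), motor noise estimate ($\hat{\lambda}_{d_t}$), speech variance estimate ($\hat{\lambda}_x$), $P(H_1)$ probability estimator, $G_{\min}$ computation, computation of $G_j$.]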

Parameters Setup:
Several SNRs of motor noise and speech were tested. For each recording, several $G_f$ values were considered. Different parameter sets were tried until the best-performing ones were found. The performance of the proposed approach was compared to that of the conventional OMLSA.

[Figures: results for seven recordings. Each shows the input signal and the output of the proposed algorithm at two $G_f$ values (-15/-20 dB, -15/-25 dB, -12/-20 dB, -15/-25 dB, -15/-25 dB, -15/-25 dB, -15/-20 dB); the first two also show an "OMLSA Only" panel.]

An algorithm for suppressing lens motor noise has been introduced. An optimal estimator is derived, assuming an indicator for motor noise presence in the time domain. An a-priori estimate of the motor noise spectrum is acquired. Substantial suppression of the motor noise is achieved without degrading the perceived quality of the desired signal. The proposed algorithm is computationally efficient.

Thanks to the Signal & Image Processing Lab for technical support throughout the work, to the Control & Robotics Lab for assistance in assembling the camera module together with an I/O control card, and to Kuti Avargel for all the guidance and academic support.

I. Cohen and B. Berdugo, "Speech Enhancement for Non-Stationary Noise Environments," Signal Processing, Vol. 81, No. 11, pp. 2403-2418, Nov. 2001.
I. Cohen and B. Berdugo, "Noise Estimation by Minima Controlled Recursive Averaging for Robust Speech Enhancement," IEEE Signal Processing Letters, Vol. 9, No. 1, pp. 12-15, Jan. 2002.
A. Abramson and I. Cohen, "Enhancement of Speech Signals Under Multiple Hypotheses Using an Indicator for Transient Noise Presence," Proc. 31th IEEE Internat.
A. Abramson and I. Cohen, "Simultaneous Detection and Estimation Approach for Speech Enhancement," IEEE Transactions on Audio, Speech, and Language Processing, Vol. 15, No. 8, pp. 2348-2359, Nov. 2007.

The a-priori estimate of the motor noise, $\hat{\lambda}_{t0}$, is obtained by averaging previously acquired recordings. The algorithm updates this initial estimate according to predetermined regions, $H_0$ and $H_1$. The result is the desired estimate $\hat{\lambda}_t$.
[Equations: update of $\hat{\lambda}_t(l+1, k)$ under $H_0$ and under $H_1$, in terms of $\hat{\lambda}_{t0}$, $|Y_{l,k}|^2$ and $\hat{\lambda}_s(l, k)$.]
The noise is classified by the criterion: a motor noise level higher than the speech level is assigned to $H_0$.

Region classification:
Method of classification:
Frequencies outside the speech band (> 4 kHz) are assumed to be in $H_0$.
High-amplitude harmonics in the motor noise estimate are classified as $H_0$ as well.
High-amplitude harmonics are determined by an empirical threshold.
The rest of the spectrum is classified as $H_1$.
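The classification rules above can be sketched as a per-bin labelling. The function name, the one-sided spectrum layout (bins spanning 0 to half the sampling rate) and the example threshold are assumptions made for illustration; the slides state only that the threshold is empirical.

```python
def classify_regions(motor_noise_psd, sample_rate, harmonic_threshold,
                     speech_band_hz=4000.0):
    """Label each frequency bin 'H0' (noise dominant) or 'H1' (speech dominant).

    motor_noise_psd    -- a-priori motor noise estimate, one value per bin,
                          bins assumed to span 0 .. sample_rate / 2
    harmonic_threshold -- empirical level above which a bin counts as a
                          high-amplitude motor harmonic (assumed value)
    """
    n_bins = len(motor_noise_psd)
    if n_bins < 2:
        raise ValueError("need at least two frequency bins")
    labels = []
    for k, level in enumerate(motor_noise_psd):
        freq_hz = k * (sample_rate / 2.0) / (n_bins - 1)
        out_of_band = freq_hz > speech_band_hz       # above the speech band
        strong_harmonic = level > harmonic_threshold  # motor harmonic
        labels.append("H0" if out_of_band or strong_harmonic else "H1")
    return labels


# Usage: 5 bins at 0, 2, 4, 6, 8 kHz with one strong harmonic at 2 kHz.
labels = classify_regions([0.1, 5.0, 0.1, 0.1, 0.1], 16000, 1.0)
```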

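In general, the speech spectral estimate is calculated by subtracting the motor noise estimate and the background noise estimate from the observed signal:
$$\hat{\lambda}_x(l, k) = \max\Big\{ \underbrace{\alpha\, G_{LSA}^2(l-1, k)\, |Y_{l-1,k}|^2}_{\text{previous frame estimate}} + \underbrace{(1 - \alpha)\left( |Y_{l,k}|^2 - \hat{\lambda}_s - \hat{\lambda}_t \right)}_{\text{current frame estimate}},\ \lambda_{\min} \Big\}$$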
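A minimal sketch of such an estimate, in the decision-directed style: a weighted sum of a previous-frame term and a current-frame term in which both noise estimates are subtracted from the observed power, floored at a small positive value. The smoothing factor `alpha`, the floor and all names are assumptions for the example, not values from the slides.

```python
def speech_variance_estimate(prev_gain, prev_y_mag2, y_mag2,
                             noise_s, noise_t, alpha=0.92, floor=1e-3):
    """Estimate the speech variance for one time-frequency bin.

    prev_gain   -- gain applied to this bin in the previous frame
    prev_y_mag2 -- |Y|^2 of the previous frame
    y_mag2      -- |Y|^2 of the current frame
    noise_s     -- stationary (background) noise variance estimate
    noise_t     -- transient (motor) noise variance estimate
    """
    # Previous frame estimate: power of the previously enhanced coefficient.
    previous = alpha * (prev_gain ** 2) * prev_y_mag2
    # Current frame estimate: observed power minus both noise estimates.
    current = (1.0 - alpha) * (y_mag2 - noise_s - noise_t)
    # The floor keeps the estimate positive when the subtraction goes negative.
    return max(previous + current, floor)


# Usage: a bin where the observed power exceeds the total noise power.
lam_x = speech_variance_estimate(1.0, 2.0, 4.0, 1.0, 1.0, alpha=0.5)
```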

The noise spectrum is estimated using the MCRA algorithm. Let $\hat{\lambda}_s(l, k)$ be the noise spectrum estimate and let $p'(l, k)$ denote the conditional speech presence probability. The update equation for $\hat{\lambda}_s$ is then
$$\hat{\lambda}_s(l+1, k) = \tilde{\alpha}_d(l, k)\, \hat{\lambda}_s(l, k) + \left[ 1 - \tilde{\alpha}_d(l, k) \right] |Y_{l,k}|^2$$
where
$$\tilde{\alpha}_d(l, k) = \alpha_d + (1 - \alpha_d)\, p'(l, k).$$
Let $S_r(l, k) = S(l, k) / S_{\min}(l, k)$ denote the ratio between the local energy of the noisy signal and its derived minimum. The decision rule is
$$S_r(l, k) \underset{H_0}{\overset{H_1}{\gtrless}} \delta$$
where $\delta$ is a threshold value.
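The MCRA recursion can be sketched per frame as follows: the noise estimate is averaged recursively with a smoothing factor that moves towards 1 (no update) as the speech presence probability rises. The default values of `alpha_d` and `delta` and the function names are assumptions for the example.

```python
def mcra_update(noise_psd, y_mag2, p_speech, alpha_d=0.95):
    """One-frame MCRA noise update, applied independently to each bin.

    noise_psd -- current noise estimate per bin
    y_mag2    -- |Y|^2 per bin for the current frame
    p_speech  -- conditional speech presence probability per bin
    """
    updated = []
    for lam, y2, p in zip(noise_psd, y_mag2, p_speech):
        # Time-varying smoothing factor: equals 1 when speech is surely present.
        alpha_tilde = alpha_d + (1.0 - alpha_d) * p
        updated.append(alpha_tilde * lam + (1.0 - alpha_tilde) * y2)
    return updated


def speech_present(local_energy, local_minimum, delta=5.0):
    """Decision rule: speech is declared present when S / S_min exceeds delta."""
    return local_energy / local_minimum > delta


# Usage: the estimate is frozen where speech is present (p = 1)
# and tracks |Y|^2 where it is absent (p = 0).
new_noise = mcra_update([1.0, 1.0], [3.0, 3.0], [1.0, 0.0], alpha_d=0.5)
```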

Let $G_f$ denote the constant attenuation under speech absence. In order to suppress the noise (stationary and transient) when speech is absent, $G_{\min}$ is chosen as
$$G_{\min} = \arg\min_{G} E\left\{ \left| G\,(D^s + D^t) - G_f D^s \right|^2 \right\}$$
which yields
$$G_{\min} = G_f\, \frac{\lambda_s}{\lambda_s + \lambda_t}$$
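A short numerical sketch of the gain floor under speech absence. It assumes a squared-error criterion, $E|G(D^s + D^t) - G_f D^s|^2$, with uncorrelated stationary and transient noise, for which the minimiser is $G_f \lambda_s / (\lambda_s + \lambda_t)$. That closed form and the function names are assumptions of this example.

```python
def db_to_linear(gain_db):
    """Convert an amplitude gain in dB to a linear factor."""
    return 10.0 ** (gain_db / 20.0)


def g_min(g_f_db, noise_s, noise_t):
    """Gain floor G_f * lambda_s / (lambda_s + lambda_t).

    g_f_db  -- constant attenuation under speech absence, in dB (e.g. -15)
    noise_s -- stationary noise variance estimate
    noise_t -- transient (motor) noise variance estimate
    """
    return db_to_linear(g_f_db) * noise_s / (noise_s + noise_t)


# Usage: with no motor noise the floor equals G_f; a strong transient
# pushes the floor lower, so the transient is attenuated further.
floor_quiet = g_min(-20.0, 1.0, 0.0)   # G_f itself
floor_motor = g_min(-20.0, 1.0, 3.0)   # four times lower
```

The effect is that the residual after enhancement stays close to the attenuated stationary background whether or not the motor is running.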

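Let
$$P(H_1) = \left\{ 1 + \frac{\hat{q}}{1 - \hat{q}}\, (1 + \xi)\, \exp(-\upsilon) \right\}^{-1}, \qquad \upsilon = \frac{\gamma\, \xi}{1 + \xi}$$
$$\hat{q}(l, k) = 1 - P_{local}(l, k)\, P_{global}(l, k)\, P_{frame}(l)$$
where $\hat{q}$ is the estimator for the a-priori signal absence probability. $\hat{q}$ is larger if either previous frames or recent neighboring frequency bins do not contain speech.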
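The speech presence probability can be sketched using the form commonly given for OMLSA, $p = \{1 + \frac{q}{1-q}(1+\xi)e^{-\upsilon}\}^{-1}$ with $\upsilon = \gamma\xi/(1+\xi)$. The function names and example values are hypothetical, and the three likelihood terms $P_{local}$, $P_{global}$, $P_{frame}$ are taken as given inputs.

```python
import math


def a_priori_absence(p_local, p_global, p_frame):
    """Estimated a-priori speech absence probability q = 1 - P_local*P_global*P_frame."""
    return 1.0 - p_local * p_global * p_frame


def speech_presence_probability(xi, gamma, q_hat):
    """Conditional speech presence probability P(H1) for one bin.

    xi    -- a-priori SNR
    gamma -- a-posteriori SNR
    q_hat -- a-priori speech absence probability, in [0, 1]
    """
    if q_hat >= 1.0:
        return 0.0  # speech is assumed absent with certainty
    v = gamma * xi / (1.0 + xi)
    return 1.0 / (1.0 + (q_hat / (1.0 - q_hat)) * (1.0 + xi) * math.exp(-v))


# Usage: a high a-posteriori SNR pulls the probability towards 1
# even with an uninformative prior (q = 0.5).
q = a_priori_absence(0.5, 1.0, 1.0)
p = speech_presence_probability(10.0, 20.0, q)
```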