Speech Enhancement Using Spectral Flatness Measure Based Spectral Subtraction


IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 7, Issue, Ver. I (Mar. - Apr. 7), PP 4-46 e-issn: 9 4, p-issn No. : 9 497 www.iosrjournals.org DOI:.979/4-7446

Supriya.P.Sarvade, Dr.Shridhar.K
(PG Student, Department of Electronics & Communication Engineering, Basaveshwar Engineering College, Bagalkot, Karnataka, India)
(Professor, Department of Electronics and Communication Engineering, Basaveshwar Engineering College, Bagalkot, Karnataka, India)

Abstract: This paper aims to reduce the background noise introduced into a speech signal during capture, storage, transmission and processing, using the spectral subtraction algorithm. Because colored noise corrupts the speech signal non-uniformly across frequency bands, the Multi-Band Spectral Subtraction (MBSS) approach is used, in which the amount of noise subtracted from the noisy speech signal in each band is set by a weighting factor. The choice of these weights determines the performance of the speech enhancement system. In this paper the weights are derived from the Spectral Flatness Measure (SFM) rather than from the conventional Signal to Noise Ratio (SNR) based rule, since SFM is able to distinguish a speech signal from a noise signal. Spectrograms and Mean Opinion Scores show that speech enhanced by the proposed SFM based MBSS has better perceptual quality and intelligibility than speech enhanced by the existing SNR based MBSS.

Keywords - Multi-Band Spectral Subtraction, Spectral Flatness Measure, Speech enhancement, SFM, MBSS.

I. Introduction
Speech is often corrupted by background noise, which degrades any subsequent processing of the speech signal. Hearing aids supported by speech enhancement algorithms help people with hearing loss understand speech in various noisy environments [7], and a great deal of research is being carried out in this direction. Speech intelligibility and quality are very important for people with hearing loss and can be improved by speech enhancement techniques [7,8].

The spectral subtraction method proposed by Boll [5] is a well-known single channel speech enhancement technique [,,], in which an estimate of the noise spectrum is subtracted from the noisy speech spectrum to obtain an estimate of clean speech. The estimate of the background noise spectrum is used to locate regions whose energy is higher than the background noise. The higher energy in these regions is due either to speech or to high energy noise components, and instantaneous energy alone cannot distinguish the two possibilities. Hence the conventional SNR based rule fails to differentiate whether the high energy in a bin is due to speech or to noise. For this reason, this paper exploits a spectral domain feature, the Spectral Flatness Measure, to discriminate between speech components and noise components.

The spectrum of a tone has more peaks and valleys than the flat spectrum of white noise. One way to determine whether a sound is tone-like or noise-like is therefore to measure how flat its spectrum is, which is what SFM provides. Experimental results show that speech enhanced by the proposed model has better noise cancellation, intelligibility and perceptual quality than speech enhanced by the traditional SNR based MBSS.

II. Spectral Flatness Measure (SFM)
Spectral flatness [6], or tonality coefficient, is the ratio of the geometric mean to the arithmetic mean of the power spectrum. The arithmetic mean is the average of N values, whereas the geometric mean is the Nth root of their product. Therefore SFM is given as:

\[ \mathrm{SFM} = \frac{\left( \prod_{n=0}^{N-1} x(n) \right)^{1/N}}{\frac{1}{N} \sum_{n=0}^{N-1} x(n)} \tag{1} \]

where x(n) is the magnitude of bin number n. If the power spectrum is flat (i.e. constant), its arithmetic and geometric means are equal and SFM equals one. For a sharp spectrum, one or two components are ones and the rest are all zero, which makes the geometric mean zero and in turn makes SFM zero. Hence SFM is zero for a pure tone and one for white noise. SFM is usually measured on a logarithmic scale, so its values are then negative or zero, with 0 corresponding to a perfectly flat spectrum.
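The definition above can be checked numerically. The following Python sketch is an illustration only, not the authors' implementation: the function name `spectral_flatness` and the small constant `eps` (used to keep the logarithm finite when a bin is exactly zero) are assumptions introduced here. It computes the ratio of geometric to arithmetic mean, and the same ratio in dB, for a flat spectrum and for a single-peak spectrum.

```python
import math


def spectral_flatness(power_spectrum, eps=1e-12):
    """Return (SFM, SFM in dB) for a sequence of non-negative power values."""
    n = len(power_spectrum)
    if n == 0:
        raise ValueError("power_spectrum must not be empty")
    # eps is an assumption (not from the paper): it keeps log() finite for zero bins.
    values = [max(p, eps) for p in power_spectrum]
    # Geometric mean computed in the log domain to avoid underflow of the product.
    geometric_mean = math.exp(sum(math.log(v) for v in values) / n)
    arithmetic_mean = sum(values) / n
    sfm = geometric_mean / arithmetic_mean
    return sfm, 10.0 * math.log10(sfm)


flat = [1.0] * 8            # noise-like: constant power in every bin
tone = [1.0] + [0.0] * 7    # tone-like: a single non-zero bin

sfm_flat, sfm_flat_db = spectral_flatness(flat)
sfm_tone, sfm_tone_db = spectral_flatness(tone)
print(sfm_flat, sfm_flat_db)
print(sfm_tone, sfm_tone_db)
```

For the flat input the ratio is one (0 dB). For the single-peak input the ratio is close to zero, and how negative the dB value becomes is limited only by the choice of `eps`.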

III. Proposed SFM Based Multi-Band Spectral Subtraction
Multi-band spectral subtraction, proposed by Kamath [4], is a simple way of removing background noise. It is very hard for any speech enhancement algorithm to perform uniformly well over all types of noise [], and hence algorithms are built under certain assumptions. Spectral subtraction assumes that the noise is additive and uncorrelated with the speech signal, so that an estimate of the noise can be subtracted from the noisy speech signal to obtain an estimate of clean speech. The noisy speech signal can be represented as the sum of clean speech and noise:

\[ y(n) = x(n) + d(n) \tag{2} \]

where y(n) is the noisy speech, x(n) is clean speech and d(n) is noise. Since the speech signal is non-stationary and changes rapidly, it is divided into short frames by windowing. Within each frame the signal is approximately stationary, which allows the Short Time Fourier Transform (STFT) to be applied for further processing. A Hamming window is preferred over a rectangular window because its smooth edges reduce distortion. Neglecting the cross spectral terms, which are products of the noise and clean speech spectral terms [9], the power spectrum of the noisy speech signal is approximately:

\[ |Y(k)|^2 \approx |X(k)|^2 + |D(k)|^2 \tag{3} \]

where |X(k)| is the magnitude spectrum of clean speech and |D(k)| is the magnitude spectrum of noise. An estimate of clean speech is then:

\[ |\hat{X}(k)|^2 = |Y(k)|^2 - |\hat{D}(k)|^2 \tag{4} \]

where $|\hat{D}(k)|^2$ is the estimated noise power spectrum. Because colored noise corrupts the speech signal in practice, multi-band spectral subtraction is implemented: each frame is divided into M bands of equal length, and the amount of noise subtracted from each band is set by a weighting factor $\alpha_i$. The estimate of clean speech in the i-th band is:

\[ |\hat{X}_i(k)|^2 = |Y_i(k)|^2 - \alpha_i \, |\hat{D}_i(k)|^2 \tag{5} \]

The improved spectral subtraction proposed by Berouti [1], in which the resulting spectrum is prevented from going below a spectral floor (minimum level), is given as:

\[ |\hat{X}_i(k)|^2 = \begin{cases} |Y_i(k)|^2 - \alpha_i \, |\hat{D}_i(k)|^2, & \text{if } |Y_i(k)|^2 - \alpha_i \, |\hat{D}_i(k)|^2 > \beta \, |\hat{D}_i(k)|^2 \\ \beta \, |\hat{D}_i(k)|^2, & \text{otherwise} \end{cases} \tag{6} \]

where $\beta$ is the spectral floor parameter.

In the proposed model the weighting factor $\alpha_i$ is driven by a noise-speech discriminating parameter, SFM, rather than by the traditional Signal to Noise Ratio. SFM in dB is given as:

\[ \mathrm{SFM}_{\mathrm{dB}} = 10 \log_{10} \left( \frac{G_m}{A_m} \right) \tag{7} \]

where $G_m$ and $A_m$ are the geometric and arithmetic means of the power spectrum respectively. This paper proposes an empirical relationship between SFM and the weighting factor $\alpha$. For a speech signal, an SFM of -6 dB represents a pure tone, for which a minimum amount of noise power should be subtracted from the input noisy signal; hence a small value of $\alpha$ was chosen up to SFM = -4 dB, as shown in Fig. 1. An SFM of 0 dB represents complete noise, and hence a maximum value of $\alpha$ = .5 was chosen. Applying a second order polynomial fit to these data points, the relation between SFM and the weighting factor of the i-th band is given as:

\[ \alpha_i = a \, \mathrm{SFM}_i^2 + b \, \mathrm{SFM}_i + c \tag{8} \]

where $\mathrm{SFM}_i$ is the SFM of the i-th band in dB and a, b and c are the coefficients of the fitted polynomial. The fitted curve is labelled in Fig. 1 as "y =.6*x +.6*x +.5".

Fig. 1. Relationship between SFM and weighting factor (plot title: Relationship between SFM and Alpha; x-axis: SFM in dB; y-axis: Alpha; legend: data, quadratic)
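The mapping from SFM to the weighting factor and the floored subtraction for one band can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the anchor points `SFM_POINTS_DB` and `ALPHA_POINTS`, the floor value `beta=0.01`, the clipping of alpha to the chosen range, and the choice of tying the floor to the noise estimate are all placeholders introduced here, because the corresponding values are not legible in this copy of the paper. `numpy.polyfit` is used for the second order polynomial fit.

```python
import numpy as np


def fit_alpha_curve(sfm_points_db, alpha_points):
    """Second order polynomial fit alpha = a*SFM^2 + b*SFM + c (SFM in dB)."""
    return np.polyfit(sfm_points_db, alpha_points, deg=2)


def alpha_from_sfm(sfm_db, coeffs, alpha_min, alpha_max):
    """Evaluate the fitted curve; clipping to the chosen range is an assumption."""
    return float(np.clip(np.polyval(coeffs, sfm_db), alpha_min, alpha_max))


def subtract_band(noisy_power, noise_power, alpha, beta):
    """Weighted power subtraction for one band with a spectral floor."""
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    cleaned = noisy_power - alpha * noise_power
    floor = beta * noise_power  # assumption: floor tied to the noise estimate
    return np.maximum(cleaned, floor)


# Placeholder anchor points (NOT the paper's values): small alpha for tonal
# bands, largest alpha for a flat, noise-like band at 0 dB.
SFM_POINTS_DB = [-60.0, -40.0, 0.0]
ALPHA_POINTS = [1.0, 1.0, 2.5]
coeffs = fit_alpha_curve(SFM_POINTS_DB, ALPHA_POINTS)

alpha_noise_like = alpha_from_sfm(0.0, coeffs, 1.0, 2.5)
alpha_tone_like = alpha_from_sfm(-60.0, coeffs, 1.0, 2.5)

# Two bins with equal noise power: the first has enough energy to survive the
# subtraction, the second falls below the floor and is replaced by it.
band = subtract_band([4.0, 1.0], [1.0, 1.0], alpha_noise_like, beta=0.01)
print(alpha_noise_like, alpha_tone_like, band)
```

A noise-like band receives the largest weight, so more of the noise estimate is removed from it, while a tonal band receives the smallest weight and is left largely intact.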

IV. Block Diagram of Proposed Model
The block diagram of the proposed model is shown in Fig. 2.

Fig. 2. Block diagram of proposed SFM based MBSS

The proposed SFM based MBSS can be implemented by the following steps:
1) Initially the speech signal is windowed. Since speech is a long signal, successive windows each of ms are taken along the length of the signal with an overlap of 50%, so that the de-emphasized part of one window becomes the middle of the next window.
2) A 4 point Fast Fourier Transform (FFT) is computed on each frame, which decomposes the signal into its magnitude and phase. FFT is a technique proposed in [14] that computes the coefficients of a discrete Fourier series much faster than was previously possible [,].
3) The average noise spectrum is computed from speech pause periods. In the proposed work the average of the first 5 frames i.e. ms is taken as the estimate of noise power.
4) The multiband concept is implemented by subdividing each frame into 6 bands of equal length.
5) The SFM of each band is computed using equation 7.
6) In [,5] it is shown that amplitude is more important than phase information for the quality and intelligibility of speech, and hence in the proposed model the phase of the signal is kept unchanged. An estimate of clean speech is obtained by subtracting an estimate of noise power from each band of the noisy speech magnitude as a function of the weighting factor, using equations 6 and 8.
7) The estimate of clean speech magnitude is combined with the unmodified phase and then transformed to the time domain by the Inverse Fast Fourier Transform (IFFT).
8) The framing is reversed using the Overlap and Add (OLA) method, and the enhanced speech is obtained.
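The eight steps above can be sketched end to end in Python with NumPy. This is a hedged illustration rather than the authors' implementation: the function names, the frame length of 256 samples, the 4 bands, the floor `beta`, the noise-tied floor, and the linear rule `placeholder_alpha` are all assumptions chosen for the example, since the corresponding settings are not legible in this copy. Only the structure (Hamming windowing with half-frame overlap, FFT, noise estimate from leading frames, per-band SFM, weighted subtraction with a floor, unchanged phase, IFFT, overlap-add) follows the steps.

```python
import numpy as np


def band_sfm_db(power, eps=1e-12):
    """SFM of one band in dB: 10*log10(geometric mean / arithmetic mean)."""
    power = np.maximum(power, eps)  # eps is an assumption to avoid log(0)
    return 10.0 * np.log10(np.exp(np.mean(np.log(power))) / np.mean(power))


def placeholder_alpha(sfm_db):
    """Hypothetical SFM-to-weight rule (NOT the paper's fit): clipped linear ramp."""
    return float(np.clip(2.5 + 1.5 * sfm_db / 40.0, 1.0, 2.5))


def sfm_mbss(noisy, frame_len=256, noise_frames=5, n_bands=4,
             beta=0.01, alpha_rule=placeholder_alpha):
    hop = frame_len // 2  # half-frame overlap
    window = np.hamming(frame_len)
    n_frames = 1 + int(np.ceil(max(len(noisy) - frame_len, 0) / hop))
    padded = np.zeros((n_frames - 1) * hop + frame_len)
    padded[:len(noisy)] = noisy

    # Steps 1-2: window each frame and take its FFT (magnitude and phase).
    frames = np.stack([padded[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectra = np.fft.rfft(frames, axis=1)
    power, phase = np.abs(spectra) ** 2, np.angle(spectra)

    # Step 3: average noise power from the leading frames (assumed speech-free).
    noise_power = power[:noise_frames].mean(axis=0)

    # Steps 4-6: per band, compute SFM, pick alpha, subtract with a spectral floor.
    clean_power = np.empty_like(power)
    bands = np.array_split(np.arange(power.shape[1]), n_bands)
    for t in range(n_frames):
        for idx in bands:
            alpha = alpha_rule(band_sfm_db(power[t, idx]))
            diff = power[t, idx] - alpha * noise_power[idx]
            clean_power[t, idx] = np.maximum(diff, beta * noise_power[idx])

    # Step 7: recombine with the unchanged noisy phase and return to time domain.
    clean_frames = np.fft.irfft(np.sqrt(clean_power) * np.exp(1j * phase),
                                n=frame_len, axis=1)

    # Step 8: overlap-add, normalised by the summed window so that unmodified
    # frames would reconstruct the input.
    out = np.zeros(len(padded))
    norm = np.zeros(len(padded))
    for t in range(n_frames):
        out[t * hop:t * hop + frame_len] += clean_frames[t]
        norm[t * hop:t * hop + frame_len] += window
    return (out / norm)[:len(noisy)]


# Synthetic test signal: a tone that starts after a noise-only lead-in.
rng = np.random.default_rng(0)
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 440 * t)
tone[:2000] = 0.0
noisy = tone + 0.3 * rng.standard_normal(fs)
enhanced = sfm_mbss(noisy)
```

The noise estimate here relies on the first few frames containing no speech, which is why the synthetic signal starts with a noise-only segment. Dividing by the summed window keeps the overall gain close to one in the interior of the signal; the first and last half-frames are covered by a single window and are less reliable.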

V. Results and Analysis
The proposed speech enhancement algorithm has been tested on different types of noisy speech samples taken from the NOIZEUS speech database. Performance evaluation of the system is done using both spectrogram analysis and subjective listening tests.

Mean Opinion Score (MOS) for different types of noise:
MOS is a measure of the overall quality of the system. Subjects were asked to rate the performance of the system on a predefined scale of 1 to 5, where 1 represents the lowest quality and 5 the highest quality.

Fig. 3. MOS for Additive White Gaussian Noise (AWGN) (bar chart of MOS against SNR for noisy speech, SNR based MBSS and SFM based MBSS)
Fig. 4. MOS for street noise (bar chart of MOS against SNR for noisy speech, SNR based MBSS and SFM based MBSS)
Fig. 5. MOS for babble noise (bar chart of MOS against SNR for noisy speech, SNR based MBSS and SFM based MBSS)

Fig. 3, 4 and 5 show the MOS computed from listening tests for different kinds of noise sources at different SNR levels. It is observed that MOS decreases with increase in SNR of the noisy speech samples. It is also evident that the MOS for the proposed model is higher than that for the traditional SNR based model.

Spectrogram Analysis for different types of noise:
Fig. 5. Spectrograms for AWGN: (a) dB SNR noisy speech; (b), (c) enhanced speech obtained using the conventional SNR based rule and the proposed SFM based rule respectively.
Fig. 6. Spectrograms for Street Noise: (a) dB SNR noisy speech; (b), (c) enhanced speech obtained using the conventional SNR based rule and the proposed SFM based rule respectively.
Fig. 7. Spectrograms for Babble Noise: (a) dB SNR noisy speech; (b), (c) enhanced speech obtained using the conventional SNR based rule and the proposed SFM based rule respectively.

From spectrogram analysis of the input noisy speech, the enhanced speech obtained from the traditional SNR based MBSS and the enhanced speech obtained from the proposed SFM based MBSS, it is evident that the performance of the proposed model is superior to that of the existing SNR based model. Performance of the proposed model is best for Additive White Gaussian Noise, since the model was designed under the assumption that additive noise corrupts the speech signal. Performance decreases for babble noise, since the frequency content and characteristics of babble noise are very similar to those of the speech signal of interest.

VI. Conclusion
This paper set out to preserve the perceptual quality of speech by exploiting one of the spectral characteristics of noise, SFM. From the results and analysis it can be concluded that the performance of the proposed SFM based MBSS is superior to that of the traditional SNR based MBSS. The proposed model gives better noise cancellation while preserving the perceptual quality of the speech signal with minimum distortion, and musical noise is nearly inaudible.

References
[1] M. Berouti, R. Schwartz, J. Makhoul, Enhancement of speech corrupted by acoustic noise, Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., pp. 8, April 1979.
[2] C.-T. Lin, Single-channel speech enhancement in variable noise-level environment, IEEE Trans. Syst. Man Cybernet. A () () 7 4.
[3] Radu Mihnea Udrea, Nicolae D. Vizireanu, Silviu Ciochina, An improved spectral subtraction method for speech enhancement using a perceptual weighting filter, Elsevier Digital Signal Processing 8, pp. 58-587, Aug 7.
[4] S. Kamath and P. C. Loizou, A multi-band spectral subtraction method for enhancing speech corrupted by colored noise, in Proceedings of Int. Conf. on Acoustics, Speech, and Signal Processing, Orlando, USA, May, vol. 4, pp. 46 464.
[5] S.F. Boll, Suppression of acoustic noise in speech using spectral subtraction, IEEE Trans. Acoust., Speech.
[6] A.H. Gray and J.D. Markel, A spectral-flatness measure for studying the autocorrelation method of linear prediction of speech analysis, IEEE Trans. Acoust. Speech Signal Process., 1974,, pp. 7 7.
[7] Dr. (Smt). S.D. Apte and Shridhar, Speech Enhancement in Hearing Aids Using Conjugate Symmetry of DFT and SNR-Perception Models, International Journal of Computer Applications, vol.,no., pp. 44-5,.
[8] Dr. (Mrs). S.D. Apte, Shridhar, Speech Enhancement in Hearing Aids Using Conjugate Symmetry Property of Short Time Fourier Transform, International Journal of Recent Trends in Engineering, vol., no. 5, pp. 46-5, November 9.
[9] Soumya Jolad, Shridhar, Speech Enhancement Using Spectral Subtraction Technique with Minimized Cross Spectral Components, International Journal of Research in Engineering and Technology, vol. 5, no., pp. 97-, March 6.
[10] Supriya.P.Sarvade, Dr.Shridhar. K and Varun.P.Sarvade, Multi-Band Spectral Subtraction for Speech Enhancement Using Sine Multitaper, IOSR Journal of VLSI and Signal Processing, vol. 6, issue 6, ver. II, pp. 7-76, Nov.-Dec. 6.
[11] Supriya.P.Sarvade, Dr.Shridhar. K and Varun.P.Sarvade, Radix- DIT-FFT Algorithm for Real Valued Sequence, International Journal of Emerging Trends in Science and Technology, vol., issue, pp. 54-56, Feb. 6.
[12] Supriya.P.Sarvade, Dr.Shridhar. K and Varun.P.Sarvade, Time Efficient Structure for DFT Filter Bank, International Journal of Emerging Trends in Science and Technology, vol., issue, pp. 479-4794, Nov. 6.
[13] J. S. Lim and A. V. Oppenheim, Enhancement and Bandwidth Compression of Noisy Speech, Proceedings of the IEEE, vol. 67, pp. 586 64, (1979).
[14] W. Cooley and J. W. Tukey, An algorithm for the machine calculation of complex Fourier series, Math. Comput., vol. 9, pp.97, 1965.
[15] P. C. Loizou, Speech Enhancement: Theory and Practice, 1st ed. Taylor and Francis, (7).