Frequency Domain Implementation of Advanced Speech Enhancement System on TMS320C6713DSK

Zeeshan Hashmi Khateeb
Student, M.Tech 4th Semester, Department of Instrumentation Technology, Dayananda Sagar College of Engineering, Bangalore, India

Gopalaiah
Research Scholar and Associate Professor, Department of Instrumentation Technology, Dayananda Sagar College of Engineering, Bangalore, India

Abstract-- Human speech communication typically takes place in complex acoustic backgrounds with environmental sound sources, competing voices and ambient noise. The presence of background noise can substantially degrade a speech communication system by reducing signal quality and intelligibility and by increasing listener fatigue. This creates a need for noise reduction in telecommunication, which has become a subject of intense research in recent years. This project focuses on the implementation of a speech enhancement system based on background noise suppression on the TMS320C6713 DSK, a floating-point DSP based development kit. The speech enhancement system is developed using various algorithms and evaluated in terms of signal-to-noise ratio and other performance measures. The proposed algorithm continuously evaluates the noise in the noisy speech signal using a channel SNR estimator. This makes a more precise SNR estimate available for the gain calculation, which improves speech quality while providing sufficient noise suppression. The software implementation of the algorithm is done in MATLAB; the hardware implementation is performed on the DSK, with the code written in Code Composer Studio. Listening tests are performed to determine the subjective quality and intelligibility of speech enhanced by this method.

I. INTRODUCTION

In modern hands-free speech communication environments there often occurs the situation that the speech signal is superposed by background noise (see Fig. 1). This is particularly the case if the speaker is not located as close as possible to the microphone. The speech signal intensity decreases with growing distance to the microphone. It is even possible that background noise sources are captured at a higher level than the speech signal. The noise distorts the speech and words are hardly intelligible. In order to improve the intelligibility and reduce the listener's stress by increasing the signal-to-noise ratio, a noise reduction procedure, also called a speech enhancement algorithm, is required.

Fig. 1. Speech signal superposed by background noise.

Historically, pre-processor single-channel speech enhancement algorithms have been considered in the context of robust speech coding (see Fig. 2). These algorithms are designed to operate in an environment where only the noisy signal is available, and they both facilitate the operation of the speech codec (coding and decoding) and improve the perceived sound quality at the end user. Acoustic background noise in mobile speech communication systems, while largely inevitable, can have a severely detrimental effect on speech intelligibility, so noise suppression is highly desirable in these systems. However, the process of reducing noise in a speech signal is associated with distortion of the processed signal, the severity of which is generally proportional to the amount of noise suppression. In this project, a TIA-127-B compliant narrowband speech enhancement (noise suppression) system is simulated using MATLAB and implemented on the TMS320C6713 DSK to demonstrate it in real time.

In a single-channel application, the noise suppression algorithm needs an additional module for the estimation of the noise and clean-speech statistics. The underlying idea in all these algorithms is that the noise statistics can be estimated from signal segments, either in the time domain or in the frequency domain, where the speech energy is either low or the speech signal does not exist at all.

The classical noise suppression scheme is based on the idea of spectral subtraction. It is widely used nowadays, mainly because of its simplicity. Spectral subtraction schemes are based on direct estimation of the short-time spectral magnitude of clean speech. A drawback of this approach is musical noise, which consists of tones with the same duration as the window length of the algorithm and with a different set of frequencies in every frame; it is a result of variability in the power spectrum.

Fig. 2. Configuration of Noise Suppression (NS) as a speech enhancement pre-processor for a speech codec.

II. PROBLEM STATEMENT

In modern hands-free speech communication environments there often occurs the situation that the speech signal is superposed by background noise, as shown in Fig. 1. This is particularly the case if the speaker is not located as close as possible to the microphone. The speech signal intensity decreases with growing distance to the microphone. It is even possible that background noise sources are captured at a higher level than the speech signal. The noise distorts the speech and words are hardly intelligible. In order to improve the intelligibility and reduce the listener's stress by increasing the signal-to-noise ratio, a noise reduction procedure, also called a speech enhancement algorithm, is required.

III. NOISE REDUCTION PRINCIPLES

The requirements of a noise reduction system in speech enhancement are:
- Naturalness and intelligibility of the enhanced signal
- Improvement of the signal-to-noise ratio
- Short signal delay
- Computational simplicity

The quality of the enhanced signal is a diverse issue; it may be characterised by the terms intelligibility and naturalness. There are several methods for performing noise reduction, but all of them can be regarded as a kind of filtering. In our application, speech and noise are mixed in one signal channel. They reside in the same frequency band and may have similar correlation properties. Consequently the filtering inevitably affects both the speech and the noise, and it is a very challenging task to distinguish between them. Sometimes speech components are detected as noise and are suppressed as well; especially fricatives and plosives are attenuated due to their noise-like properties. Furthermore, the residual noise should preserve the characteristics of the background noise in the recording environment. Typical single-channel noise reduction algorithms add a synthetic noise, also called musical noise, which sounds artificial and has a disturbing effect on the listener. Single-channel noise reduction algorithms rely on the fact that the statistical properties of speech are stationary only over short periods of time, whereas the noise can often be assumed to be stationary over much longer periods. Another aim of the algorithm design is the limitation of the signal delay, because of its annoying effect in dialogue situations.

The noise reduction algorithms can be split into two groups: time-domain algorithms and those utilising some kind of transform, e.g. the Fourier transform. Whereas the filter calculation for time-domain solutions generally relies on the usage of correlation estimates, there is a large variety of algorithms operating in the frequency domain.
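As a concrete point of reference for the frequency-domain family, the sketch below shows the classical power spectral subtraction gain rule mentioned in the introduction. This is a generic textbook formulation rather than the EVRC rule derived in Section IV; the function name, the spectral floor value and the assumption that noise_psd[] has been estimated during speech pauses are illustrative assumptions.

/* Classical power spectral subtraction, expressed as a per-bin gain:
   |S_hat(k)| = g[k] * |Y(k)|. noise_psd[] is assumed to have been estimated
   during speech pauses; the spectral floor limits musical noise. */
#include <math.h>

void spectral_subtraction_gain(const float noisy_psd[], const float noise_psd[],
                               float g[], int nbins)
{
    const float floor_gain = 0.1f;          /* spectral floor (assumed value) */
    for (int k = 0; k < nbins; k++) {
        /* Fraction of the noisy power attributed to speech in this bin. */
        float ratio = (noisy_psd[k] - noise_psd[k]) / (noisy_psd[k] + 1e-12f);
        float gain = (ratio > 0.0f) ? sqrtf(ratio) : 0.0f;
        g[k] = (gain > floor_gain) ? gain : floor_gain;
    }
}

The frame-to-frame variability of this gain in low-energy bins is exactly the source of the musical noise discussed above, which motivates the smoother SNR-driven gains of the EVRC system described next.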
IV. NOISE SUPPRESSION SYSTEM

The noise suppression algorithm used by the EVRC is based on the spectral subtraction technique, with the main emphasis placed on spectral weighting. Fig. 3 shows the general principle of such a system.

Fig. 3. General principle of the EVRC NS system.

First, the input signal y(n) is block-wise transformed from the time domain to the frequency domain, yielding the spectrum G(k). Second, a set of gain factors q(k) is calculated. The actual spectral subtraction takes the form of a multiplication of G(k) with the gain factors from the gain calculation, resulting in the enhanced spectrum Q(k). Lastly, this spectrum is transformed from the frequency domain back to the time domain, and the signal is block-wise reassembled to form the enhanced output y_NS(n).
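A minimal C sketch of the block processing principle of Fig. 3 is given below: transform one block, multiply it by the gain factors, and transform it back. This is an illustration rather than the TIA/EIA/IS-127-B reference code; the function names are assumptions, and a naive DFT is used only so that the example is self-contained, whereas a real-time build on the C6713 DSK would call an optimized 128-point FFT routine.

/* Block-wise spectral weighting: y(n) -> G(k) -> Q(k) = q(k) * G(k) -> y_NS(n). */
#include <math.h>
#include <string.h>

#define NFFT 128

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/* Naive in-place complex DFT (inverse = 0) / IDFT (inverse = 1), for illustration only. */
static void dft(float re[NFFT], float im[NFFT], int inverse)
{
    float tr[NFFT], ti[NFFT];
    double sign = inverse ? 1.0 : -1.0;
    for (int k = 0; k < NFFT; k++) {
        double sr = 0.0, si = 0.0;
        for (int n = 0; n < NFFT; n++) {
            double w = sign * 2.0 * M_PI * (double)k * (double)n / NFFT;
            sr += re[n] * cos(w) - im[n] * sin(w);
            si += re[n] * sin(w) + im[n] * cos(w);
        }
        tr[k] = (float)sr;
        ti[k] = (float)si;
    }
    double scale = inverse ? 1.0 / NFFT : 1.0;
    for (int k = 0; k < NFFT; k++) {
        re[k] = (float)(tr[k] * scale);
        im[k] = (float)(ti[k] * scale);
    }
}

/* Apply one block of the Fig. 3 principle. 'block' holds NFFT windowed
   time-domain samples; 'gain' holds the real, non-negative per-bin weights
   produced by the gain calculation. */
void spectral_weighting_block(float block[NFFT], const float gain[NFFT])
{
    float re[NFFT], im[NFFT];
    memcpy(re, block, sizeof(re));
    memset(im, 0, sizeof(im));

    dft(re, im, 0);                   /* time domain -> frequency domain: G(k) */
    for (int k = 0; k < NFFT; k++) {  /* spectral weighting: Q(k) = q(k)*G(k)  */
        re[k] *= gain[k];
        im[k] *= gain[k];
    }
    dft(re, im, 1);                   /* frequency domain -> time domain       */
    memcpy(block, re, sizeof(re));    /* the real part is the enhanced block   */
}

The gain factors themselves come from the SNR-based gain calculation described in the next section.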

A. TIA 127-B (Narrow Band) Speech Enhancement System

The fundamental concept of a frequency-domain solution is spectral weighting and block processing. The architecture of such a system is presented in Fig. 4. Since in a single- or multi-channel approach the estimation of the noise and the weighting function can only be derived in the frequency domain, the time-domain input signal has to be transformed. The transformations are performed by means of standard analysis and synthesis systems operating on a frame-by-frame basis. The system consists of three major components: the analysis/synthesis framework for the time-domain/frequency-domain transformation, the noise estimation, and the weighting function.

If the noise estimate equals the disturbing noise spectrum, the output signal spectrum Y(n, Ωi) will be very similar to the noiseless speech spectrum S(n, Ωi). Estimating the noise spectrum Nest(n, Ωi) is one of the major tasks of a noise cancelling system. Based on the assumption mentioned above that the noise part of the signal is stationary over longer periods of time than the speech part, an estimate of the noise is obtained by extracting slowly changing portions of the signal spectrum. The output frame is obtained by applying the inverse frequency transformation to the weighted enhanced spectrum Y(n, Ωi) and the noisy phase x(n, Ωi).

Fig. 4. TIA 127-B (Narrow Band) Speech Enhancement System.

A TIA/EIA/IS-127-B compliant speech enhancement system is the pre-processing block in the Enhanced Variable Rate Codec (EVRC), used to enhance the speech signal before it is encoded. Its main components are: 1. the High Pass System and 2. the Adaptive Noise Suppression System. The High Pass System comprises a 6th-order Butterworth filter implemented using three biquad sections. The Adaptive Noise Suppression System consists of the subsystems shown in Fig. 4.

The input to the noise suppressor are the noisy speech samples s(n), which have previously been high-pass filtered. These are passed through a pre-emphasis filter and transformed into the frequency-domain values G(k). In the frequency domain, a filtering operation is performed by multiplying G(k) by the scalar gain values Y(k) to yield H(k). The filtered spectral values H(k) are transformed back into the time domain and passed through a de-emphasis filter to provide the noise-suppressed speech samples s'(n) to the speech coder.

The channel energy estimator divides this spectrum into Nc channels and calculates an estimate of the signal energy in each one. The spectral deviation estimator calculates the difference between the current channel energies and an average long-term estimate. An estimated signal-to-noise ratio is calculated by the SNR estimator, using the channel energy and background noise estimates. The SNR estimate is used to calculate the voice metric, a weighted sum that provides an estimate of the signal "quality"; it is used mainly as an indication of whether or not the current frame contains speech. When the input signal is deemed to contain no speech, the background noise estimator is updated. Under some conditions the SNR estimates are changed by the SNR modifier. Based on the (modified) SNR estimates and the background noise, the gains for each channel are calculated by the channel gain calculator. These gains are then used to perform the filtering of the input signal.
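The channel energy and channel SNR estimation described above can be sketched in C as follows. The exact band edges, smoothing constants and quantization steps defined by TIA/EIA/IS-127-B are not reproduced; the smoothing factor, the band-table arguments and the function names are assumptions made for illustration.

/* Channel energy estimate: sum the bin energies of each of the Nc channels and
   smooth them over time. band_lo[i]..band_hi[i] give the FFT bins of channel i
   (assumed to be supplied; the standard's band edges are not reproduced here). */
#include <math.h>

#define NFFT 128
#define NC   16

void estimate_channel_energy(const float re[NFFT], const float im[NFFT],
                             const int band_lo[NC], const int band_hi[NC],
                             float E_ch[NC], int first_frame)
{
    const float ALPHA = 0.55f;                 /* smoothing factor (assumed value) */
    for (int i = 0; i < NC; i++) {
        float e = 0.0f;
        for (int k = band_lo[i]; k <= band_hi[i]; k++)
            e += re[k] * re[k] + im[k] * im[k];            /* energy in channel i */
        E_ch[i] = first_frame ? e : ALPHA * E_ch[i] + (1.0f - ALPHA) * e;
    }
}

/* Channel SNR estimate (in dB) from the channel energy and background noise
   energy estimates, as used by the voice metric and the gain calculation. */
void estimate_channel_snr(const float E_ch[NC], const float E_noise[NC],
                          float snr_db[NC])
{
    for (int i = 0; i < NC; i++) {
        float num = (E_ch[i]    > 1e-6f) ? E_ch[i]    : 1e-6f;
        float den = (E_noise[i] > 1e-6f) ? E_noise[i] : 1e-6f;
        snr_db[i] = 10.0f * log10f(num / den);
    }
}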
The overall gain factor for the current frame, γn, is calculated according to

\gamma_n = \max\left\{ \gamma_{\min},\ -10\,\log_{10}\!\left( \frac{1}{E_{\mathrm{floor}}} \sum_{i=0}^{N_c - 1} E_n(m, i) \right) \right\}    (1)

where γmin = -13 is the minimum overall gain, Efloor = 1 is the noise floor energy and En(m, i) is the estimated noise spectrum calculated during the previous frame. The dB-scale channel gains are calculated as

\gamma_{\mathrm{dB}}(i) = \mu_g \left( \sigma''(i) - \sigma_{\mathrm{th}} \right) + \gamma_n, \qquad 0 \le i < N_c    (2)

where μg = 0.39 is the gain slope and σth the SNR threshold, both constants. In Fig. 4 the gain curve for a single channel resulting from the following rule is plotted in comparison with the gain curve resulting from the spectral subtraction rule:

\sigma''(i) = \max\left( \sigma_{\mathrm{th}},\ \sigma(i) \right)    (3)

where σ(i) is the estimated channel SNR.
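A compact C sketch of the gain calculation of equations (1)-(3), together with the linear-scale conversion of equation (4) that follows, is shown below. The constants are taken from the text (γmin = -13 dB, Efloor = 1, μg = 0.39); the assumption of Nc = 16 channels, the function name and the use of the estimated channel SNR in dB as input are illustrative.

/* Per-frame channel gain calculation following equations (1)-(4). */
#include <math.h>

#define NC 16

void channel_gains(const float En_prev[NC],  /* estimated noise spectrum E_n(m, i), previous frame */
                   const float sigma[NC],    /* estimated channel SNR sigma(i), in dB              */
                   float sigma_th,           /* SNR threshold sigma_th, in dB                      */
                   float gain_lin[NC])       /* output: linear channel gains gamma_ch(i)           */
{
    const float gamma_min = -13.0f;          /* minimum overall gain (dB) */
    const float E_floor   = 1.0f;            /* noise floor energy        */
    const float mu_g      = 0.39f;           /* gain slope                */

    /* Equation (1): overall gain factor gamma_n for the current frame. */
    float noise_sum = 0.0f;
    for (int i = 0; i < NC; i++)
        noise_sum += En_prev[i];
    float gamma_n = -10.0f * log10f(noise_sum / E_floor);
    if (gamma_n < gamma_min)
        gamma_n = gamma_min;

    for (int i = 0; i < NC; i++) {
        /* Equation (3): limit the channel SNR from below by the threshold. */
        float s = (sigma[i] > sigma_th) ? sigma[i] : sigma_th;
        /* Equation (2): dB-scale channel gain. */
        float gamma_db = mu_g * (s - sigma_th) + gamma_n;
        /* Equation (4): convert to linear scale, limited to unity. */
        float g = powf(10.0f, gamma_db / 20.0f);
        gain_lin[i] = (g < 1.0f) ? g : 1.0f;
    }
}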

To simulate the single-channel behaviour of EVRC-NS, (3) was used. As is evident, the gain curve for EVRC-NS is quite different from that of spectral subtraction. The channel gains are converted to linear scale according to

\gamma_{\mathrm{ch}}(i) = \min\left\{ 1,\ 10^{\gamma_{\mathrm{dB}}(i)/20} \right\}, \qquad 0 \le i < N_c    (4)

In our implementation, the input speech is presented to the noise suppressor in frames of 80 samples (10 ms frames at 8 kHz sampling). These samples, along with 24 samples of the previous frame, are multiplied by a smoothed trapezoidal window and transformed into the frequency domain by a 128-point FFT. In the frequency domain, the spectral values are grouped together to form 16 unequal frequency bands (similar to critical bands) referred to as channels. A scalar gain value is computed for each channel and applied to all the spectral values corresponding to that channel, including both positive and negative frequencies. The filtered values Y(k) are transformed back into the time domain using a 128-point IFFT and overlap-added with the last 48 noise-suppressed samples of the previous frame. The first 80 samples are then released to the speech coder. It is seen that the noise suppressor essentially operates as a time-adaptive filter.
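The framing and overlap-add described above can be sketched as follows, using the figures stated in the text: 80 new samples per 10 ms frame at 8 kHz, 24 samples carried over from the previous input frame, a 128-point transform, and a 48-sample overlap-add with the output of the previous frame. The window coefficients and the spectral processing itself are not reproduced; process_block() below is only a placeholder for the window/FFT/gain/IFFT steps sketched earlier, and the buffer arrangement is an assumption for illustration.

/* Per-frame framing and overlap-add around the 128-point transform. */
#include <string.h>

#define FRAME   80      /* new input samples per 10 ms frame at 8 kHz      */
#define OVERLAP 24      /* input samples carried over from previous frame  */
#define NFFT    128     /* transform length (zero-padded from 104 samples) */
#define TAIL    48      /* output samples overlap-added with next frame    */

/* Placeholder: apply the analysis window; the 128-point FFT, channel gain
   multiplication and 128-point IFFT sketched earlier would go here. */
static void process_block(float buf[NFFT], const float win[FRAME + OVERLAP])
{
    for (int n = 0; n < FRAME + OVERLAP; n++)
        buf[n] *= win[n];
}

void noise_suppress_frame(const float in[FRAME], float out[FRAME],
                          const float win[FRAME + OVERLAP])
{
    static float prev_in[OVERLAP];     /* last 24 input samples of previous frame  */
    static float prev_out[TAIL];       /* last 48 output samples of previous frame */
    float buf[NFFT] = {0.0f};          /* zero-padded analysis/synthesis buffer    */

    /* Assemble 24 carried-over samples followed by the 80 new samples. */
    memcpy(buf, prev_in, sizeof(prev_in));
    memcpy(buf + OVERLAP, in, FRAME * sizeof(float));
    memcpy(prev_in, in + FRAME - OVERLAP, sizeof(prev_in));

    process_block(buf, win);           /* window, FFT, channel gains, IFFT */

    /* Overlap-add with the tail of the previous frame and release 80 samples. */
    for (int n = 0; n < TAIL; n++)
        buf[n] += prev_out[n];
    memcpy(out, buf, FRAME * sizeof(float));
    memcpy(prev_out, buf + FRAME, TAIL * sizeof(float));
}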
V. SIMULATION RESULT

A. Original speech sample corrupted by noise
B. Recovered speech sample obtained from the EVRC algorithm

Table I presents the results of correlation coefficient, segmental SNR, log spectral distance and vector quantization based minimum mean Euclidean distance values for various noise types and levels, obtained by using the EVRC TIA-127-B speech enhancement system.

TABLE I: Objective Measures for Various Noise Types

VI. CONCLUSION

A noise suppression algorithm based on EVRC TIA/EIA/IS-127-B has been proposed. The proposed algorithm continuously updates the noise estimate from the noisy speech in accordance with an estimated SNR. The spectral gain is modified with the SNR so that it better fits the new noise estimate, for higher speech quality.