Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm


International OPEN ACCESS Journal Of Modern Engineering Research (IJMER)

A.T. Rajamanickam, N.P. Subiramaniyam, A. Balamurugan*, and T.S. Yuvarani
Department of Electronics and Communication Systems, Nehru Arts and Science College, Coimbatore - 641 105, India

ABSTRACT: This paper presents a speech enhancement algorithm that uses a single microphone for an automatic speech recognition system. A speech signal received in an enclosed room is distorted by reflections from walls and other objects. This distortion, known as reverberation, degrades the fidelity and intelligibility of input speech in acoustic systems such as hands-free conference telephones and automatic speech recognition. In this project, we consider an important effect of reverberation on the speech signal, referred to as overlap masking: the energy of previous phonemes is smeared over time and overlaps the following phonemes. To reduce this effect, we introduce a one-microphone speech dereverberation algorithm based on spectral subtraction. After spectral subtraction, residual reverberation still fills some of the silent gaps right after high-intensity speech sections. Therefore, we employ a Voice Activity Detector (VAD) based on spectral entropy to find these silent gaps and then attenuate them. After this processing, the signal is encoded using DPCM coding.

KEYWORDS: Voice Activity Detector, reverberation, DPCM encoding, Spectral Subtraction.

I. INTRODUCTION

Signal Processing

Digital Signal Processing (DSP) is distinguished from other areas of computer science by the unique type of data it uses: signals. In most cases, these signals originate as sensory data from the real world: seismic vibrations, visual images, sound waves, etc. DSP is the mathematics, the algorithms and the techniques used to manipulate these signals after they have been converted into a digital form.
This includes a wide variety of goals, such as enhancement of visual images, recognition and generation of speech, and compression of data for storage and transmission.

Audio Processing

The two principal human senses are vision and hearing. Correspondingly, much of DSP is related to image and audio processing. DSP can provide several important functions during mix down, including filtering, signal addition and subtraction, and signal editing. One of the most interesting DSP applications in music preparation is artificial reverberation.

Speech Generation

Speech generation and recognition are used for communication between humans and machines. Two approaches are used for computer-generated speech: digital recording and vocal tract simulation. In digital recording, the voice of a human speaker is digitized and stored, usually in a compressed form. During playback, the stored data are uncompressed and converted back into an analog signal. This is the most common method of digital speech generation used today. Vocal tract simulators are more complicated; they try to mimic the physical mechanisms by which humans create speech.

Speech Recognition

Acoustic-phonetic recognition is based on distinguishing the phonemes of a language. First, the speech is analyzed and a set of phoneme hypotheses is made.

IJMER ISSN: 2249-6645 www.ijmer.com Vol. 5 Iss. 7 July 2015 51

These hypotheses correspond to the closest recognized phonemes in the order that they are introduced to the system. Next, the phoneme hypotheses are compared against stored words, and the word that best matches the hypotheses is picked.

Existing System

The existing system uses multiple microphones for signal input; that is, more than one microphone is used in a seminar hall or room. When several microphones are placed in a room, the signal is easily captured from all directions. After the noise is removed using spectral subtraction, some silent gaps remain in the signal.

Proposed System

In this system, we use a single microphone [2]. Reverberation in the signal is therefore much stronger than in a multi-microphone system. It is reduced by spectral subtraction, and the silent gaps are then handled by the Voice Activity Detector [3]. After processing, the output signal is encoded using DPCM at the transmitter and decoded at the receiver.

Problem Definition

Reverberation is an acoustical distortion that degrades the fidelity and intelligibility of the speech signal in a speech recognition system. This paper presents a speech enhancement algorithm that uses a single microphone for an automatic speech recognition system. The proposed algorithm is based on simple spectral subtraction.

Overview

The spectral subtraction method is a well-known noise reduction technique. Most implementations and variations of the basic technique advocate subtraction of the noise spectrum estimate over the entire speech spectrum. However, real-world noise is mostly colored and does not affect the speech signal uniformly over the entire spectrum. To improve system performance, we employ Voice Activity Detection (VAD) using spectral entropy [3]. VAD, also known as speech activity detection or speech detection, is a speech-processing technique that detects the presence or absence of human speech.
The main uses of VAD are in speech coding and speech recognition. It can facilitate speech processing, and can also be used to deactivate some processes during the non-speech sections of an audio session.

Reverberation degrades the fidelity and intelligibility of input speech in acoustic systems such as hands-free conference telephones and automatic speech recognition. To improve the performance of a speech recognition system, it is therefore necessary to investigate the application of signal processing techniques to speech enhancement. Here, we consider an important effect of reverberation on the speech signal, referred to as overlap masking. To reduce this effect, we introduce a one-microphone speech dereverberation algorithm based on spectral subtraction, which has been widely used in speech enhancement [2]. After spectral subtraction, residual reverberation still fills some of the silent gaps right after high-intensity speech sections. To further improve system performance by reducing this residual reverberation, we employ a Voice Activity Detector (VAD) based on spectral entropy and then attenuate these silent gaps. After this processing, the signal is encoded using DPCM coding.
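As a rough illustration of the spectral-entropy VAD and gap attenuation described above, the following Python sketch may help. It is not the authors' implementation. The function names, the non-overlapping 256-sample frames, the relative threshold of 0.8 and the attenuation gain of 0.1 are all assumptions chosen for the example. It also assumes the common convention that speech-like frames have a peaky spectrum (low entropy) while noise-like gaps have a flat spectrum (high entropy).

```python
import numpy as np

def spectral_entropy(frame, eps=1e-12):
    """Shannon entropy (in bits) of the normalised power spectrum of one frame."""
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2 + eps  # eps keeps all-zero frames valid
    prob = power / power.sum()                        # treat the spectrum as a probability mass
    return float(-np.sum(prob * np.log2(prob)))

def vad_attenuate(signal, frame_len=256, rel_threshold=0.8, gap_gain=0.1):
    """Flag frames as speech/non-speech by spectral entropy and attenuate the gaps.

    rel_threshold and gap_gain are hypothetical tuning values, not taken from the paper.
    Samples after the last complete frame are left unchanged.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    max_entropy = np.log2(frame_len // 2 + 1)         # entropy of a perfectly flat spectrum
    out = signal.copy()
    is_speech = np.zeros(n_frames, dtype=bool)
    for i in range(n_frames):
        seg = slice(i * frame_len, (i + 1) * frame_len)
        # Low entropy: energy concentrated in a few bins, treated here as speech.
        is_speech[i] = spectral_entropy(signal[seg]) < rel_threshold * max_entropy
        if not is_speech[i]:
            out[seg] *= gap_gain                      # attenuate the silent gap
    return out, is_speech
```

In practice the threshold would be tuned on recorded data; a fixed fraction of the maximum entropy is used here only to keep the sketch short.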

Block Diagram

Figure: Block Diagram of Speech Enhancement Algorithm

The block diagram of the speech enhancement algorithm is shown in the figure. Prior to speech recognition, the input speech signal is pre-processed by spectral subtraction and by VAD-based reverberation reduction in the silent gaps [2]. The received speech signal x(n) is decomposed using the Short-Time Fourier Transform (STFT) [1]. The time-domain analysis window is a Hamming window, and the overlap between two successive windows is set to 50%. The Power Spectral Density (PSD) of the reverberation is then estimated from the autocorrelation function of the received signal x(n). The square root of this estimate is subtracted from the magnitude spectrum of the reverberated signal, yielding an estimate of the magnitude spectrum of the dereverberated signal. In practice this is realized by a short-term spectral attenuation, which is equivalent to spectral subtraction.

One problem with the spectrally subtracted speech signal is that residual reverberation still fills some of the silent gaps right after high-intensity speech sections. It is therefore necessary to employ VAD techniques to identify and then attenuate these silent gaps. In this paper we use a VAD based on spectral entropy, which makes more correct decisions for silent gaps than the typical energy-threshold feature.

Voice Activity Detection

The basic function of a VAD algorithm is to extract some measured features or quantities from the input signal and to compare these values with thresholds, usually derived from the characteristics of the noise and speech signals. A voice-active decision is then made if the measured values exceed the thresholds.

Algorithm Overview

The typical design of a VAD algorithm is as follows:
1. There may first be a noise reduction stage, e.g. via spectral subtraction.
2. Then some features or quantities are calculated from a section of the input signal.
3. A classification rule is applied to classify the section as speech or non-speech; often this rule checks whether a value exceeds a threshold.

The Process Of Echo Cancellation

An echo canceller is basically a device that detects and removes the echo of the signal from the far end after it has echoed on the local end's equipment. In circuit-switched long-distance networks, echo cancellers reside in the metropolitan Central Offices that connect to the long-distance network. These echo cancellers remove electrical echoes made noticeable by delay in the long-distance network. An echo canceller consists of three main functional components: an adaptive filter, a doubletalk detector, and a non-linear processor.
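The spectral-subtraction front end from the block diagram (Hamming-windowed STFT with 50% overlap, followed by magnitude subtraction) can be sketched in Python roughly as below. This is a simplified stand-in, not the paper's method. The paper estimates the reverberation PSD from the autocorrelation of x(n), whereas this sketch simply averages the magnitude spectrum of the first few frames, which are assumed to contain no speech. The function name, the number of estimation frames and the spectral floor are hypothetical choices.

```python
import numpy as np

def spectral_subtraction(x, frame_len=256, noise_frames=5, floor=0.01):
    """Magnitude spectral subtraction with a Hamming window and 50% overlap.

    Assumes len(x) >= frame_len and that the first `noise_frames` frames hold
    only the interference to be removed (an assumption of this sketch).
    """
    x = np.asarray(x, dtype=float)
    hop = frame_len // 2                               # 50% overlap
    window = np.hamming(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop

    # Analysis: windowed frames -> short-time spectra.
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spec = np.fft.rfft(frames, axis=1)
    mag, phase = np.abs(spec), np.angle(spec)

    # Interference magnitude estimate from the leading frames.
    noise_mag = mag[:noise_frames].mean(axis=0)

    # Subtract, keeping a small spectral floor so magnitudes stay non-negative.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)

    # Synthesis: reuse the original phase, then weighted overlap-add.
    clean = np.fft.irfft(clean_mag * np.exp(1j * phase), n=frame_len, axis=1)
    out = np.zeros(len(x))
    norm = np.zeros(len(x))
    for i in range(n_frames):
        seg = slice(i * hop, i * hop + frame_len)
        out[seg] += clean[i] * window
        norm[seg] += window ** 2
    return out / np.maximum(norm, 1e-8)                # undo the window weighting
```

Dividing by the accumulated squared window means that, if nothing is subtracted, the input is reconstructed exactly wherever at least one frame covers a sample. The floor limits how far a bin can be attenuated, which is one common way to reduce musical-noise artifacts.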

Enhancement Of Noisy Speech

One of the accepted conventional techniques for noise suppression is spectral subtraction, in which the noise power spectrum is estimated in the pauses between speech segments and subtracted from the power spectrum of the signal [2]. The enhanced signal is then reconstructed by an overlap-add inverse Fourier transform using the modified magnitude and the original noisy phase of the signal spectrum.

Differential Pulse Code Modulation

Differential pulse code modulation (DPCM) is a method of converting an analog signal to a digital one: the analog signal is sampled, and the difference between the actual sample value and its predicted value is quantized and then encoded to form a digital value. The central idea of DPCM is to code a difference. It relies on the fact that most source signals show significant correlation between successive samples, so the encoder can exploit the redundancy in sample values to achieve a lower bit rate.

Outputs

Figure: Main Window
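The DPCM principle described in the Differential Pulse Code Modulation section can be illustrated with a minimal Python sketch. It is not the coder used in the paper, whose predictor and quantizer are not specified. The sketch assumes the simplest possible predictor (the previously reconstructed sample) and a uniform quantizer with a hypothetical step size of 0.05 and no clipping. The function names are illustrative only.

```python
import numpy as np

def dpcm_encode(x, step=0.05):
    """Encode samples as quantised differences from the predicted value.

    The prediction is the previous *reconstructed* sample, so the encoder
    tracks exactly what the decoder will produce (closed-loop DPCM).
    """
    codes = np.empty(len(x), dtype=int)
    prediction = 0.0
    for n, sample in enumerate(x):
        codes[n] = int(np.round((sample - prediction) / step))  # quantise the difference
        prediction += codes[n] * step                           # decoder-side reconstruction
    return codes

def dpcm_decode(codes, step=0.05):
    """Rebuild the signal by accumulating the de-quantised differences."""
    return np.cumsum(codes) * step

# Usage: a slowly varying signal yields small difference codes.
fs = 8000
t = np.arange(800) / fs
x = np.sin(2 * np.pi * 100 * t)
codes = dpcm_encode(x)
x_hat = dpcm_decode(codes)
```

Because successive samples of this signal are strongly correlated, the difference codes span a much smaller range than direct quantization of the samples with the same step would, which is where the bit-rate saving comes from. The closed loop keeps the reconstruction error within half a quantization step.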

Figure: Input
Figure: Input With Dialog

Figure: SS And VAD
Figure: Speech Recognition

Figure: PCM With Dialog
Figure: DPCM With Dialog

II. CONCLUSION

The proposed dereverberation method for a speech recognition system was designed using spectral subtraction and a VAD algorithm. We tested this method by comparing it with the previous method in terms of reverberation reduction and speech recognition scores. The proposed method performs better than the previous method, which uses energy-detection features.

REFERENCES

[1] E.A.P. Habets, "Single-Channel Speech Dereverberation based on Spectral Subtraction," in Proc. ProRISC 2004, the 15th Annual Workshop on Circuits, Systems and Signal Processing, Veldhoven, Netherlands, pp. 250-254, 2004.
[2] Mingyang Wu and DeLiang Wang, "A two-stage algorithm for one-microphone reverberant speech enhancement," IEEE Trans. Speech Audio Process., vol. 14, no. 3, pp. 774-784, 2006.
[3] R. V. Prasad, R. Muralishankar and S. Vijay, "Voice Activity Detection for VoIP-An Information Theoretic Approach," in Proc. IEEE Int. Conf. Telecommunications, pp. 1-6, 2006.