Speech Compression based on Psychoacoustic Model and A General Approach for Filter Bank Design using Optimization


The International Arab Conference on Information Technology (ACIT)

Speech Compression based on Psychoacoustic Model and A General Approach for Filter Bank Design using Optimization

Mourad Talbi, Chafik Barnoussi, Cherif Adnane
Laboratory of Signal Processing, Electronics Department, Faculty of Sciences of Tunis, Tunisia
mouradtalbi96@yahoo.fr, Chafik.Barnoussi@gmail.com, adnane.cher@fst.rnu.tn

Abstract: In this paper we propose a new speech compression technique based on the application of a psychoacoustic model combined with a general approach for filter bank design using optimization. This technique is a modified version of a compression technique using MDCT (Modified Discrete Cosine Transform) filter banks of 32 filters each and a psychoacoustic model. The two techniques are evaluated and compared with each other by computing the number of bits before and after compression. They are tested on different speech signals, and the obtained simulation results show that the proposed technique outperforms the second technique in terms of compressed file size. In terms of speech quality, the output speech signals of the proposed compression system are of good quality, as confirmed by SNR, PSNR, NRMSE and PESQ computation.

Keywords: speech compression, psychoacoustic model, filter bank design, optimization, bits before/after compression.

1. Introduction

The rapidly increasing number of mobile users and the explosive growth of the Internet have made speech compression an important research issue in digital speech processing. The essential purpose of speech compression is to represent the digital speech waveform with a minimum number of bits while preserving its perceptual quality [, ]. Speech compression is essential either for reducing memory storage requirements or for reducing transmission bandwidth requirements, without harming the speech quality.
For example, digital cellular phones use compression techniques to compress the speech signal in real time over general switched telephone networks. Speech compression is also needed for reducing the storage requirements of voice messages or for mail forwarding of voice messages. All these applications depend on the efficiency of the speech compression technique. Consequently, different techniques [3] have been developed in the past to meet the rising demand for better speech compression algorithms. Speech is a natural way for humans to convey information and emotion from one person to another or from one place to another. It is typically classified as unvoiced, voiced, or a mixture of the two. Unvoiced sounds are produced when the vocal cords are too slack or too tense to vibrate periodically, while voiced sounds are generated when the vocal cords are held together [4]. A detailed study of speech production is given in [5] and the references therein. Like other digital data compression techniques, speech compression techniques can be classified into two categories: lossy compression and lossless compression. Lossless compression is frequently performed by waveform coding techniques. In these techniques [5, 4, 6] the actual shape of the signal produced by the microphone and its associated circuits is conserved. The most popular waveform coding technique is pulse code modulation (PCM). Other lossless techniques, such as differential quantization and adaptive PCM, compress speech signals by localizing redundancy and reducing it through the quantization process. All such techniques require simple signal processing and lead to minimal distortion with small compression [6, 7, 8]. A detailed study of these techniques is presented in [6, 7], [], [4], [9] and the references therein. Concerning lossy compression, the compressed data is a close approximation of the original data rather than an exact copy.
However, it leads to a much higher compression ratio than lossless compression. The literature review reveals that considerable progress has been made on lossy compression techniques such as sub-band coding [4], linear predictive coding (LPC) [3] and the turning point algorithm [5]. In these techniques, more sophisticated signal processing is used. LPC is a robust tool extensively employed for the analysis of ECG and speech signals in various respects, such as adaptive filtering, spectral estimation and data compression [, 3, 4]. Different efficient techniques [, 4, 5] based on LPC have been reported in the literature. In sub-band decomposition, the spectral information is divided into a set of signals that can then be encoded using a variety of techniques. Based on sub-band decomposition, different techniques have been devised for speech compression [6, 7, 8, 9]. During the last decade, the Wavelet Transform, and more precisely the Discrete Wavelet Transform, has emerged as a robust and powerful tool for extracting and analyzing information from non-stationary signals because of the time-varying nature of these signals. Non-stationary signals are characterized by transitory drifts, trends and numerous abrupt changes. The wavelet's localization feature, along with its time-frequency resolution properties, makes it appropriate for analyzing non-stationary signals such as speech []. Indeed, many wavelet and wavelet packet techniques have been developed for compressing speech signals [, , 3, 4, 5]. In this context, optimized wavelet filters have been developed for speech

compression, and the filter coefficients are derived from linear optimization employing different windows [6]. In this paper, we have replaced the MDCT (Modified Discrete Cosine Transform) filter banks of 32 filters each, used in the compression system proposed by Alex et al. [7], by a non-uniform filter bank designed using optimization [8]. The rest of the paper is organized as follows. A background on the psychoacoustic model is provided in the following section. Section 3 presents the quantization schemes, Section 4 deals with filter banks, and Section 5 outlines the proposed speech compression scheme. Section 6 describes the performance evaluation criteria, Section 7 gives results and discussion, and finally we give our conclusion.

2. Background on the Psychoacoustic Model

The psychoacoustic model is based on extensive research on human perception. This research has demonstrated that the average human does not hear all frequencies in the same way. Effects due to the limitations of the human sensory system and to different sounds in the environment lead to facts that can be exploited to remove unnecessary data contained in an audio signal [7]. The two principal properties of the human auditory system that make up the psychoacoustic model are auditory masking and the absolute threshold of hearing. Each of them provides a way of determining which signal portions are indiscernible or inaudible to the average human, and can therefore be eliminated from a signal [7].

2.1. Absolute Threshold of Hearing

Human hearing covers frequencies in the range from 20 Hz to 20,000 Hz. However, this does not mean that all frequencies are heard in the same manner. We can suppose that humans hear the frequencies that make up speech better than others; this is a good guess [7]. Moreover, we can also hypothesize that hearing a tone becomes more difficult as its frequency nears either of the extremes. One other observation forms the basis for modeling.
Because humans hear lower frequencies, like those making up speech, better than others, like high frequencies around 20 kHz, the ear probably has a better ability to detect differences in pitch at lower frequencies than at high ones. For example, a human has an easier time telling the difference between 500 Hz and 600 Hz than determining whether something is 7,000 Hz or 8,000 Hz [7]. Many studies have led to the conclusion that the frequency range from 20 Hz to 20,000 Hz can be broken up into critical bandwidths, which are non-linear, non-uniform and dependent on the sound being heard. Signals within one critical bandwidth are hard for a human observer to separate [7]. A more uniform measure of frequency based on critical bandwidths is the Bark. From the remarks above, we would expect a Bark bandwidth to be larger at high frequencies and smaller at low ones; indeed, this is the case. The Bark frequency scale can be approximately expressed as follows:

z(f) = 13 arctan(0.00076 f) + 3.5 arctan((f / 7500)^2)   [Barks]   (1)

To determine the effect of frequency on hearing ability, researchers played a sinusoidal tone at a very low power. The power was slowly raised until the subject could hear the tone. This level is the threshold at which the tone becomes audible. The process was repeated at many frequencies across the human auditory range, with many subjects. The experimental data can be modeled by equation (2):

ATH(f) = 3.64 (f/1000)^(-0.8) - 6.5 exp(-0.6 (f/1000 - 3.3)^2) + 10^(-3) (f/1000)^4   (dB SPL)   (2)

where f designates the frequency in Hertz. Therefore, the following conclusion can be drawn for the purposes of compression: if a signal has frequency components with power levels that fall below the absolute threshold of hearing, then these components can be removed, as the average listener will not be able to hear those frequencies anyway [7].

2.2. Auditory Masking

Humans do not have the ability to hear minute differences in frequency. For example, it is very difficult to distinguish a 1,000 Hz signal from one that is 1,001 Hz.
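As a numerical sketch, the Bark mapping and the absolute-threshold curve above can be implemented as follows. The constants are the standard Zwicker/Terhardt-style fits that the formulas appear to use, so treat them as assumptions rather than the exact values of the original implementation:

```python
import math

def bark(f):
    """Map a frequency f in Hz to the Bark scale."""
    return 13.0 * math.atan(0.00076 * f) + 3.5 * math.atan((f / 7500.0) ** 2)

def ath(f):
    """Absolute threshold of hearing in dB SPL (Terhardt-style fit)."""
    k = f / 1000.0  # frequency in kHz
    return (3.64 * k ** -0.8
            - 6.5 * math.exp(-0.6 * (k - 3.3) ** 2)
            + 1e-3 * k ** 4)

# The Bark value grows monotonically with frequency, and the threshold
# curve dips below 0 dB SPL in the most sensitive 2-5 kHz region while
# rising steeply toward both ends of the audible range.
```

Any spectral component whose level falls below ath(f) at its frequency can be discarded, which is exactly the pruning step the compression scheme performs.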
This becomes even more difficult if the two signals are playing at the same time. Moreover, the 1,000 Hz signal would also affect a human's ability to hear a signal at 1,010 Hz, or 1,100 Hz, or 990 Hz. This concept is known as masking. If the 1,000 Hz signal is strong, it will mask signals at nearby frequencies, making them inaudible to the listener. For a masked signal to be heard, its power needs to be increased to a level greater than a threshold determined by the frequency and strength of the masker tone.

2.2.1. Tone Maskers

To determine whether a frequency component is a tone, one must know whether it has been held constant for a period of time, as well as whether it is a sharp peak in the frequency spectrum, which indicates that it is above the ambient noise of the signal. Whether a certain frequency is a tone (masker) can be determined with the definition given in [7].

2.2.2. Noise Maskers

When a signal is not a tone, it is necessarily noise. Therefore, all frequency components that are not part of a tone's neighborhood are taken and treated as noise. Combining such components into maskers, though, takes a little more thought. Because humans have difficulty discriminating signals within a critical band, the noise found within each of the bands can be combined to obtain one mask. The idea is thus to take all frequency components within a critical band that do not fit within tone neighborhoods, add them together, and place them at the geometric-mean location within the critical band. This is repeated for all critical bands [7].

2.2.3. Masking Effect

The determined maskers affect not only the frequencies within a critical band, but also those in surrounding bands. Studies show that this masking spreads with an approximate slope of +25 dB/Bark before and -10 dB/Bark after the masker. The spreading can be described as a function of

the masker location j, the maskee location i, the power spectrum P_TM at j, and the distance between the maskee and masker locations in Barks [7]. There is a slight difference in the resulting mask depending on whether the masker is noise or a tone. Consequently, one can model the masks by the following equations, with the same variables as described above.

For noise maskers:

T_NM(i, j) = P_NM(j) - 0.175 z(j) + SF(i, j) - 2.025   (dB SPL)   (3)

For tone maskers:

T_TM(i, j) = P_TM(j) - 0.275 z(j) + SF(i, j) - 6.025   (dB SPL)   (4)

Obviously, if there are multiple tone and noise maskers, the overall effect is a little harder to determine. In their work, Alex et al. [7] suppose that the effects are power-additive. This is a reasonable supposition to make, but note that there is definitely an interplay that can occur between maskers that would lower or raise thresholds [7].

3. Quantization Simulation

Alex et al. [7] developed two different quantization techniques for performing the audio compression. The first technique, named full range quantization, requires a predefined range that includes all possible input values. Since this technique gives a noticeable degradation of sound quality, they decided to develop a different quantization technique. The second is a dynamic technique, named narrow range quantization, which determines the quantization range and the step size based on the current set of input data. The inputs to be quantized range over [-1, 1], and each input is quantized with 16 bits (65,536 distinct values over [-1, 1]) [7].

4. Filter Bank

A filter bank is an array of band-pass filters that spans the entire audible frequency spectrum. Figure 1 illustrates a filter bank with M bands.

Figure 1. Filter banks.

The bank serves to isolate different frequency components in a signal. This is useful since some frequencies are deemed more important than others through the use of the psychoacoustic model. Magnitudes at these important frequencies need to be coded with a fine resolution.
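The masking thresholds of equations (3) and (4) can be sketched as follows. The exact spreading function SF(i, j) used in [7] is not reproduced in the text, so a simple two-slope function with roughly +25 dB/Bark before and -10 dB/Bark after the masker is assumed here:

```python
def spread(dz):
    """Assumed two-slope spreading function SF(i, j) in dB, where
    dz = z(i) - z(j) is the maskee-minus-masker distance in Barks.
    The true SF of [7] may differ in shape."""
    return 25.0 * dz if dz < 0 else -10.0 * dz

def noise_mask(p_nm, z_j, dz):
    """Noise-masking threshold T_NM(i, j) of equation (3), in dB SPL."""
    return p_nm - 0.175 * z_j + spread(dz) - 2.025

def tone_mask(p_tm, z_j, dz):
    """Tone-masking threshold T_TM(i, j) of equation (4), in dB SPL."""
    return p_tm - 0.275 * z_j + spread(dz) - 6.025
```

For equal masker power and position, the noise masker yields the higher threshold (it masks more), and both thresholds fall off as the maskee moves away from the masker in Barks.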
Small differences at these frequencies are significant, and a coding scheme that preserves these differences must be used. On the other hand, frequencies that are less important do not have to be exact. A coarser coding scheme can be employed, although some of the finer details will be lost in the coding. We can obtain different coding resolutions by using fewer bits to encode less significant frequencies and more bits to encode important frequencies. The filter bank thus permits different parts of the signal to be encoded with different numbers of bits, resulting in a compressed data-stream representation of the signal. In practice, there are two sets of filter banks. The first set of filters is named the analysis filter bank. The input signal passes through each of these filters and is then quantized with the proper number of bits, as determined by the psychoacoustic model. The signal then needs to be reconstructed from the quantized individual components. This is performed through a bank of synthesis filters. Finally, all the outputs of the synthesis filters are added together to reconstruct the final compressed output signal. There is one final point to make. Once the signal is passed through each filter in the analysis bank, it is down-sampled by the number of filters in the bank, because redundant information is present in each of the signals output from the filters. The decimation does not result in any loss of information, but it does shift the frequency of the signal. After quantization, the signal is up-sampled to restore the frequency content to its original scale. Figure 2 illustrates the analysis and synthesis filter bank setup [7].

Figure 2. Analysis and Synthesis Filter Bank Setup.

4.1. Filter Bank Design Considerations

A tradeoff exists between coarse and fine frequency resolution. No single tradeoff is optimal for all signals.
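The analysis-decimation / interpolation-synthesis chain just described can be sketched for a toy two-channel case. The Haar filters below are chosen only because they give perfect reconstruction in a few lines; they are not the bank used in [7], and the quantization step between analysis and synthesis is omitted:

```python
def convolve(x, h):
    """Plain FIR filtering (full linear convolution)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

s = 2 ** -0.5
h0, h1 = [s, s], [s, -s]    # analysis: low-pass / high-pass (Haar)
g0, g1 = [s, s], [-s, s]    # matching synthesis filters

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# analysis: filter, then keep every 2nd sample (decimation by 2)
sub0 = convolve(x, h0)[1::2]
sub1 = convolve(x, h1)[1::2]

# (per-band quantization, driven by the psychoacoustic model, goes here)

def upsample(v):
    """Insert a zero after each sample (interpolation by 2)."""
    out = []
    for c in v:
        out += [c, 0.0]
    return out

# synthesis: upsample, filter, and sum the band outputs
y = [a + b for a, b in zip(convolve(upsample(sub0), g0),
                           convolve(upsample(sub1), g1))]
# for this toy bank, y[:len(x)] equals x exactly
```

The 32-band analysis/synthesis setup of Figure 2 follows exactly this pattern, with decimation and interpolation by 32 instead of 2.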
Take, for instance, castanets and a piccolo, two musical instruments with very different qualities. A harmonic piccolo calls for fine frequency resolution and coarse time resolution. This is because a piccolo plays within a small range, consequently necessitating more filters per bank in order to sufficiently capture all tones. The opposite is true for castanets, which are localized in time but widely dispersed in frequency. In this case, one would want to use fewer filters per bank. Furthermore, many signals are non-stationary and need the coder to make adaptive decisions regarding the optimal time-frequency tradeoff. For their purposes, Alex et al. [7] employed non-adaptive filter banks that are commonly used in audio applications.

5. The Proposed Method

Alex et al. [7] implemented a compression scheme that uses psychoacoustic modeling to determine which portions of the audio signal can be removed without perceptible loss of sound quality to the human ear. In their compression system, the original signal is run through cosine-modulated perfect reconstruction filter banks having 32 filters in each bank. The MDCT filter banks of 32 filters each used by Alex et al. [7] are defined as follows:

h_k(n) = w(n) cos((2n + M + 1)(2k + 1) π / (4M))   (5)

g_k(n) = h_k(L - 1 - n)   (6)

with k = 0, ..., M - 1, n = 0, ..., L - 1, L = 2M, M = 32 and w(n) = sin((n + 0.5) π / (2M)).
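Reading equations (5) and (6) as a standard cosine-modulated (MDCT-style) bank with a sine window — an assumption, since the printed formulas are partly garbled — the filters can be generated as:

```python
import math

M = 32        # filters per bank, as in Alex et al.
L = 2 * M     # filter length

def window(n):
    """Sine window w(n) = sin((n + 0.5) * pi / (2M)) of equation (5)."""
    return math.sin((n + 0.5) * math.pi / (2 * M))

# analysis filters, equation (5):
# h_k(n) = w(n) * cos((2n + M + 1)(2k + 1) * pi / (4M))
h = [[window(n) * math.cos((2 * n + M + 1) * (2 * k + 1) * math.pi / (4 * M))
      for n in range(L)]
     for k in range(M)]

# synthesis filters, equation (6): time-reversed analysis filters
g = [[h[k][L - 1 - n] for n in range(L)] for k in range(M)]
```

Each of the M = 32 analysis filters is a cosine-modulated copy of the same 64-tap window, and each synthesis filter is simply its time reversal.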

The signal is divided by the filter banks into distinct frequency components and then quantized with a variable number of bits, based on the results of the psychoacoustic model. Alex et al. analyzed this compressed version of the signal and, by using different quantization schemes, obtained 3 to 75 percent compression of the original signal. This difference is due to the overhead needed for decoding the quantized signal in each scheme. A simplified block diagram of their scheme is given in Figure 3 [7].

Table 1. Coefficients of the analysis and synthesis impulse responses (h0, h1, g0, g1) of the filter bank designed using optimization.

Figure 3. Encoding/Decoding systems.

In this paper, we have modified the compression system of Alex et al. [7] by replacing the MDCT (Modified Discrete Cosine Transform) filter banks of 32 filters each by a uniform/non-uniform filter bank designed using optimization [8]. The goal is to design M analysis and synthesis FIR filters so that the analysis filters satisfy some frequency specifications and the filter bank (almost) meets the perfect reconstruction (PR) conditions. Both goals are achieved by minimizing the following performance index [8]:

J = w1 (PR error) + w2 (frequency specification error)   (7)

where w1 and w2 are optional weights. The algorithm can design both uniform (critically/over-sampled) and non-uniform filter banks [8]. Figure 4 illustrates the non-uniform filter bank used:

Figure 4. Analysis-Synthesis optimized filterbank.

Here H0(z), H1(z), G0(z) and G1(z) are respectively the z-transforms of the impulse responses h0(n), h1(n), g0(n) and g1(n) of the analysis and synthesis filters. These impulse responses are obtained by minimizing the performance index given by (7). We have therefore replaced the impulse responses h_k and g_k associated with the MDCT filter banks of 32 filters each, given by (5) and (6), by h0, h1, g0 and g1. Table 1 reports the coefficients of these impulse responses obtained from optimization.
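A minimal sketch of the PR-error term of the performance index (7), for the two-channel case of Figure 4: the distortion transfer function T(z) = (G0(z)H0(z) + G1(z)H1(z)) / 2 should reduce to a pure delay. The weights and the frequency-specification term of [8], and the aliasing term, are omitted here:

```python
def conv(a, b):
    """Polynomial (impulse-response) product, i.e. linear convolution."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def pr_error(h0, h1, g0, g1):
    """Squared deviation of T(z) = (G0 H0 + G1 H1)/2 from a pure delay:
    the dominant tap should be 1 and every other tap should be 0."""
    t = [(a + b) / 2.0 for a, b in zip(conv(g0, h0), conv(g1, h1))]
    d = max(range(len(t)), key=lambda i: abs(t[i]))  # dominant tap = delay
    return ((abs(t[d]) - 1.0) ** 2
            + sum(v * v for i, v in enumerate(t) if i != d))

# J of equation (7) would then be
#   J = w1 * pr_error(h0, h1, g0, g1) + w2 * freq_spec_error(h0, h1)
# with freq_spec_error measuring deviation from the desired responses.
```

An optimizer driving pr_error toward zero (together with the frequency-specification term) yields the near-PR filters whose coefficients are reported in Table 1.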
h0 and h1 are designed for analysis, and g0 and g1 are designed for synthesis.

6. Performance Evaluation

In this section, we present the objective criteria used for the evaluation and comparison of the proposed speech compression technique and that of Alex et al. [7]. These criteria are the number of bits before and after compression, SNR, PSNR, NRMSE and PESQ. The output speech quality is objectively evaluated for the proposed and the conventional speech compression techniques in the case of narrow range quantization. These objective criteria are defined as follows:

o Signal-to-noise ratio (SNR):

SNR = 10 log10( Σ_n s²(n) / Σ_n (s(n) - ŝ(n))² )   (8)

o Peak signal-to-noise ratio (PSNR):

PSNR = 10 log10( N X² / ||s - ŝ||² )   (9)

where X = max_n |s(n)|.

o Normalized root mean square error (NRMSE):

NRMSE = sqrt( Σ_n (s(n) - ŝ(n))² / Σ_n (s(n) - µ_s)² )   (10)

Here s(n) and ŝ(n) represent respectively the original and the reconstructed signal, N is the number of samples per signal, and µ_s is the mean of the speech signal s(n).

o Perceptual evaluation of speech quality (PESQ): The PESQ algorithm is an objective quality measure adopted as ITU-T recommendation P.862. It is an objective measurement tool conceived to predict the results of a subjective Mean Opinion Score (MOS) test. It has been shown [9] that PESQ is more reliable and correlates better with MOS than traditional objective speech measures.

6.1. File Format and Comparison

To determine compression ratios for our compression schemes, we first have to determine the number of bytes that each file

takes. We have used the same computation rules for file sizes (original file size, 16-bit compression, 8-bit compression, full range compression, and narrow range compression) as used in [7].

Table 3. Results (bits before / bits after compression) obtained from the research work of Alex et al. [7].

7. Results and Discussion

Figure 6 illustrates an example of a reconstructed speech signal obtained by applying the proposed speech compression technique and the technique of Alex et al. [7].

Figure 6. (a) Original speech signal, (b) reconstructed speech signal using the compression scheme of Alex et al. [7], (c) reconstructed speech signal using the proposed compression scheme.

Figure 6 clearly shows that the proposed technique yields a reconstructed speech signal of good quality, as can be seen by comparison with the original speech signal, whereas the compression technique proposed by Alex et al. [7] introduces some degradation in the reconstructed speech signal. Table 2 reports the results concerning bits before and after compression using the proposed technique and the technique of Alex et al. [7]. The two techniques are applied to three different speech signals.

Table 2. Bits before compression and bits after compression.

These results clearly show that the proposed technique outperforms the technique of Alex et al. [7] in terms of output file size: the output files of the proposed compression system are smaller than those of the system of Alex et al. [7]. Table 3 reports the results (bytes before and after compression) obtained from the application of the technique of Alex et al. [7] to a number of audio signals and sine-wave signals. According to Alex et al. [7], two findings can be stated: for full range, we have the smallest file and the worst sound quality; for narrow range, we have better sound quality and a larger file.
In this work, for the tested speech signals, we can state the following two findings: for full range, we obtain the smallest file and better sound quality; for narrow range, we obtain a completely degraded sound quality and a larger file. To solve the problem of speech degradation when using narrow range in the proposed technique and in the technique of Alex et al. [7], we have multiplied the psychoacoustic model threshold by an adjustment factor α. We have selected α equal to 3, a choice based on simulation results. Figures 7 and 8 clearly show that by multiplying the psychoacoustic model threshold by the factor α, we obtain an output speech signal with good quality.

Figure 7: (a) Original speech signal, (b) degraded output speech signal obtained from the compression system of Alex et al. [7] without multiplying the threshold by α, (c) output speech signal from the compression system of Alex et al. [7] with the threshold multiplied by α.
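For reference, the SNR, PSNR and NRMSE measures reported in the result tables can be computed as below; these are the standard definitions matching the variables stated in Section 6 (PESQ requires the full ITU-T P.862 algorithm and is not sketched):

```python
import math

def snr(s, s_hat):
    """Signal-to-noise ratio in dB, equation (8)."""
    num = sum(v * v for v in s)
    den = sum((a - b) ** 2 for a, b in zip(s, s_hat))
    return 10.0 * math.log10(num / den)

def psnr(s, s_hat):
    """Peak signal-to-noise ratio in dB, equation (9), with X = max |s(n)|."""
    n = len(s)
    peak = max(abs(v) for v in s)
    den = sum((a - b) ** 2 for a, b in zip(s, s_hat))
    return 10.0 * math.log10(n * peak * peak / den)

def nrmse(s, s_hat):
    """Normalized root mean square error, equation (10)."""
    mu = sum(s) / len(s)
    num = sum((a - b) ** 2 for a, b in zip(s, s_hat))
    den = sum((a - mu) ** 2 for a in s)
    return math.sqrt(num / den)
```

Higher SNR/PSNR and lower NRMSE indicate a reconstructed signal closer to the original, which is how the tables below rank the two compression systems.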

Table 5. PESQ values of the reconstructed speech signals in the case of narrow range.

Figure 8: (a) Original speech signal, (b) degraded output speech signal obtained from the proposed compression system without multiplying the threshold by α, (c) output speech signal from the proposed compression system with the threshold multiplied by α.

Table 4 reports the results obtained from the application of the two techniques to the three speech signals, for narrow range, with and without multiplying the psychoacoustic model threshold by the factor α. These results are the numbers of bits before and after speech compression. Tables 4 and 5 show that the performances of the proposed speech compression system and of the system of Alex et al. [7] are improved, and that the outputs of these systems have good quality, when the psychoacoustic model threshold is multiplied by the factor α. The output speech signals of the proposed speech compression system exhibit a small constant delay with respect to the original speech signals in the case of narrow range. To solve this problem, we suppressed this delay and obtained the results reported in Table 6.

Table 6. SNR, PSNR, PESQ and NRMSE of the method of Alex et al. and of the proposed speech compression technique for narrow range quantization.

Table 4. Bits after compression for narrow range in the two cases, with and without multiplying by the factor α.

References

[1] Xie, N., Dong, G., & Zhang, T. (2011). Using lossless data compression in data storage systems: not for saving space. IEEE Transactions on Computers, 60(3),

[2] Gibson, J. D. (2005). Speech coding methods, standards, and applications. IEEE Circuits and Systems Magazine, 5(4),
[3] Junejo, N., Ahmed, N., Unar, M. A., & Rajput, A. Q. K. (2005). Speech and image compression using discrete wavelet transform. In IEEE symposium on advances in wired and wireless communication (pp ).
[4] Agbinya, J. I. (1996). Discrete wavelet transform techniques in speech processing. In IEEE TENCON digital signal processing applications proceedings (pp ). New York: IEEE.
[5] Arif, M., & Anand, R. S. (). Turning point algorithm for speech signal compression. International Journal of Speech Technology. doi:.7/s
[6] Gersho, A. (1992). Speech coding. In A. N. Ince (Ed.), Digital speech processing (pp. 73 ). Boston: Kluwer Academic.
[7] Gersho, A. (1994). Advances in speech and audio compression. Proceedings of the IEEE, 82(6),
[8] Shlomot, E., Cuperman, V., & Gersho, A. (1998). Combined harmonic and waveform coding of speech at low bit rates. In IEEE conference on acoustics, speech and signal processing (ICASSP 98) (Vol. , pp ).
[9] Shlomot, E., Cuperman, V., & Gersho, A. (2001). Hybrid coding: combined harmonic and waveform coding of speech at 4 kb/s. IEEE Transactions on Speech and Audio Processing, 9(6),
[10] Junejo, N., Ahmed, N., Unar, M. A., & Rajput, A. Q. K. (2005). Speech and image compression using discrete wavelet transform. In IEEE symposium on advances in wired and wireless communication (pp ).
[11] Zois, E. N., & Anastassopoulos, V. (2000). Morphological waveform coding for writer identification. Pattern Recognition, 33(3),
[12] Laskar, R. H., Banerjee, K., Talukdar, F. A., & Sreenivasa Rao, K. (). A pitch synchronous approach to design voice conversion system using source-filter correlation. International Journal of Speech Technology, 15,
[13] Shahin, I. M. A. (). Speaker identification investigation and analysis in unbiased and biased emotional talking environments. International Journal of Speech Technology, 15,
[14] Vankateswaran, P., Sanyal, A., Das, S., Nandi, R., & Sanyal, S. K. (2009). An efficient time domain speech compression algorithm based on LPC and sub-band coding techniques. Journal of Communication, 4(6),
[15] Magboun, H. M., Ali, N., Osman, M. A., & Alfandi, S. A. (2010). Multimedia speech compression techniques. In IEEE international conference on computing science and information technology (ICCSIT) (Vol. 9, pp ).
[16] Osman, M. A., Ali, N., Magboud, H. M., & Alfandi, S. A. (2010). Speech compression using LPC and wavelet. In IEEE international conference on computer engineering and technology (ICCET) (Vol. 7, pp. 9 99).
[17] McCauley, J., Ming, J., Stewart, D., & Hanna, P. (2005). Subband correlation and robust speech recognition. IEEE Transactions on Speech and Audio Processing, 13(5),
[18] Ramchandran, K., Vetterli, M., & Herley, C. (1996). Wavelets, subband coding, and best bases. Proceedings of the IEEE, 84(4),
[19] Gershikov, E., & Porat, M. (2007). On color transforms and bit allocation for optimal subband image compression. Signal Processing: Image Communication,
[20] Shao, Y., & Chang, C. H. (). Bayesian separation with sparsity promotion in perceptual wavelet domain for speech enhancement and hybrid speech recognition. IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans, 4(),
[21] Satt, A., & Malah, D. (1989). Design of uniform DFT filter banks optimized for subband coding of speech. IEEE Transactions on Acoustics, Speech, and Signal Processing, 37(),
[22] Joseph, S. M. (). Spoken digit compression using wavelet packet. In IEEE international conference on signal and image processing (ICSIP) (pp ).
[23] Mallat, S. G. (1989). A theory for multiresolution signal decomposition: the wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11,
[24] Fgee, E. B., Philips, W. J., & Robertson, W. (1999). Comparing audio compression using wavelets with other audio compression schemes. Proceedings IEEE Electrical and Computer Engineering,
[25] Dusan, S., Flanagan, J. L., Karve, A., & Balaraman, M. (2007). Speech compression using polynomial approximation. IEEE Transactions on Audio, Speech, and Language Processing, 15,


More information

Overview of Code Excited Linear Predictive Coder

Overview of Code Excited Linear Predictive Coder Overview of Code Excited Linear Predictive Coder Minal Mulye 1, Sonal Jagtap 2 1 PG Student, 2 Assistant Professor, Department of E&TC, Smt. Kashibai Navale College of Engg, Pune, India Abstract Advances

More information

Audio and Speech Compression Using DCT and DWT Techniques

Audio and Speech Compression Using DCT and DWT Techniques Audio and Speech Compression Using DCT and DWT Techniques M. V. Patil 1, Apoorva Gupta 2, Ankita Varma 3, Shikhar Salil 4 Asst. Professor, Dept.of Elex, Bharati Vidyapeeth Univ.Coll.of Engg, Pune, Maharashtra,

More information

Evaluation of Audio Compression Artifacts M. Herrera Martinez

Evaluation of Audio Compression Artifacts M. Herrera Martinez Evaluation of Audio Compression Artifacts M. Herrera Martinez This paper deals with subjective evaluation of audio-coding systems. From this evaluation, it is found that, depending on the type of signal

More information

EC 6501 DIGITAL COMMUNICATION UNIT - II PART A

EC 6501 DIGITAL COMMUNICATION UNIT - II PART A EC 6501 DIGITAL COMMUNICATION 1.What is the need of prediction filtering? UNIT - II PART A [N/D-16] Prediction filtering is used mostly in audio signal processing and speech processing for representing

More information

Wideband Speech Coding & Its Application

Wideband Speech Coding & Its Application Wideband Speech Coding & Its Application Apeksha B. landge. M.E. [student] Aditya Engineering College Beed Prof. Amir Lodhi. Guide & HOD, Aditya Engineering College Beed ABSTRACT: Increasing the bandwidth

More information

Mel Spectrum Analysis of Speech Recognition using Single Microphone

Mel Spectrum Analysis of Speech Recognition using Single Microphone International Journal of Engineering Research in Electronics and Communication Mel Spectrum Analysis of Speech Recognition using Single Microphone [1] Lakshmi S.A, [2] Cholavendan M [1] PG Scholar, Sree

More information

speech signal S(n). This involves a transformation of S(n) into another signal or a set of signals

speech signal S(n). This involves a transformation of S(n) into another signal or a set of signals 16 3. SPEECH ANALYSIS 3.1 INTRODUCTION TO SPEECH ANALYSIS Many speech processing [22] applications exploits speech production and perception to accomplish speech analysis. By speech analysis we extract

More information

Accurate Delay Measurement of Coded Speech Signals with Subsample Resolution

Accurate Delay Measurement of Coded Speech Signals with Subsample Resolution PAGE 433 Accurate Delay Measurement of Coded Speech Signals with Subsample Resolution Wenliang Lu, D. Sen, and Shuai Wang School of Electrical Engineering & Telecommunications University of New South Wales,

More information

Comparative Analysis between DWT and WPD Techniques of Speech Compression

Comparative Analysis between DWT and WPD Techniques of Speech Compression IOSR Journal of Engineering (IOSRJEN) ISSN: 225-321 Volume 2, Issue 8 (August 212), PP 12-128 Comparative Analysis between DWT and WPD Techniques of Speech Compression Preet Kaur 1, Pallavi Bahl 2 1 (Assistant

More information

EE482: Digital Signal Processing Applications

EE482: Digital Signal Processing Applications Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu EE482: Digital Signal Processing Applications Spring 2014 TTh 14:30-15:45 CBC C222 Lecture 14 Quiz 04 Review 14/04/07 http://www.ee.unlv.edu/~b1morris/ee482/

More information

Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm

Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm International OPEN ACCESS Journal Of Modern Engineering Research (IJMER) Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm A.T. Rajamanickam, N.P.Subiramaniyam, A.Balamurugan*,

More information

Speech Compression Using Wavelet Transform

Speech Compression Using Wavelet Transform IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 19, Issue 3, Ver. VI (May - June 2017), PP 33-41 www.iosrjournals.org Speech Compression Using Wavelet Transform

More information

Dilpreet Singh 1, Parminder Singh 2 1 M.Tech. Student, 2 Associate Professor

Dilpreet Singh 1, Parminder Singh 2 1 M.Tech. Student, 2 Associate Professor A Novel Approach for Waveform Compression Dilpreet Singh 1, Parminder Singh 2 1 M.Tech. Student, 2 Associate Professor CSE Department, Guru Nanak Dev Engineering College, Ludhiana Abstract Waveform Compression

More information

MODIFIED DCT BASED SPEECH ENHANCEMENT IN VEHICULAR ENVIRONMENTS

MODIFIED DCT BASED SPEECH ENHANCEMENT IN VEHICULAR ENVIRONMENTS MODIFIED DCT BASED SPEECH ENHANCEMENT IN VEHICULAR ENVIRONMENTS 1 S.PRASANNA VENKATESH, 2 NITIN NARAYAN, 3 K.SAILESH BHARATHWAAJ, 4 M.P.ACTLIN JEEVA, 5 P.VIJAYALAKSHMI 1,2,3,4,5 SSN College of Engineering,

More information

Multimedia Signal Processing: Theory and Applications in Speech, Music and Communications

Multimedia Signal Processing: Theory and Applications in Speech, Music and Communications Brochure More information from http://www.researchandmarkets.com/reports/569388/ Multimedia Signal Processing: Theory and Applications in Speech, Music and Communications Description: Multimedia Signal

More information

Speech Enhancement Using Spectral Flatness Measure Based Spectral Subtraction

Speech Enhancement Using Spectral Flatness Measure Based Spectral Subtraction IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 7, Issue, Ver. I (Mar. - Apr. 7), PP 4-46 e-issn: 9 4, p-issn No. : 9 497 www.iosrjournals.org Speech Enhancement Using Spectral Flatness Measure

More information

SPEECH COMPRESSION USING WAVELETS

SPEECH COMPRESSION USING WAVELETS SPEECH COMPRESSION USING WAVELETS HATEM ELAYDI Electrical & Computer Engineering Department Islamic University of Gaza Gaza, Palestine helaydi@mail.iugaza.edu MUSTAFA I. JABER Electrical & Computer Engineering

More information

Chapter 2: Digitization of Sound

Chapter 2: Digitization of Sound Chapter 2: Digitization of Sound Acoustics pressure waves are converted to electrical signals by use of a microphone. The output signal from the microphone is an analog signal, i.e., a continuous-valued

More information

Speech Compression Using Voice Excited Linear Predictive Coding

Speech Compression Using Voice Excited Linear Predictive Coding Speech Compression Using Voice Excited Linear Predictive Coding Ms.Tosha Sen, Ms.Kruti Jay Pancholi PG Student, Asst. Professor, L J I E T, Ahmedabad Abstract : The aim of the thesis is design good quality

More information

Audio Compression using the MLT and SPIHT

Audio Compression using the MLT and SPIHT Audio Compression using the MLT and SPIHT Mohammed Raad, Alfred Mertins and Ian Burnett School of Electrical, Computer and Telecommunications Engineering University Of Wollongong Northfields Ave Wollongong

More information

APPLICATIONS OF DSP OBJECTIVES

APPLICATIONS OF DSP OBJECTIVES APPLICATIONS OF DSP OBJECTIVES This lecture will discuss the following: Introduce analog and digital waveform coding Introduce Pulse Coded Modulation Consider speech-coding principles Introduce the channel

More information

FPGA implementation of DWT for Audio Watermarking Application

FPGA implementation of DWT for Audio Watermarking Application FPGA implementation of DWT for Audio Watermarking Application Naveen.S.Hampannavar 1, Sajeevan Joseph 2, C.B.Bidhul 3, Arunachalam V 4 1, 2, 3 M.Tech VLSI Students, 4 Assistant Professor Selection Grade

More information

Speech Enhancement in Presence of Noise using Spectral Subtraction and Wiener Filter

Speech Enhancement in Presence of Noise using Spectral Subtraction and Wiener Filter Speech Enhancement in Presence of Noise using Spectral Subtraction and Wiener Filter 1 Gupteswar Sahu, 2 D. Arun Kumar, 3 M. Bala Krishna and 4 Jami Venkata Suman Assistant Professor, Department of ECE,

More information

Department of Electronics and Communication Engineering 1

Department of Electronics and Communication Engineering 1 UNIT I SAMPLING AND QUANTIZATION Pulse Modulation 1. Explain in detail the generation of PWM and PPM signals (16) (M/J 2011) 2. Explain in detail the concept of PWM and PAM (16) (N/D 2012) 3. What is the

More information

Speech Coding Technique And Analysis Of Speech Codec Using CS-ACELP

Speech Coding Technique And Analysis Of Speech Codec Using CS-ACELP Speech Coding Technique And Analysis Of Speech Codec Using CS-ACELP Monika S.Yadav Vidarbha Institute of Technology Rashtrasant Tukdoji Maharaj Nagpur University, Nagpur, India monika.yadav@rediffmail.com

More information

Chapter 9 Image Compression Standards

Chapter 9 Image Compression Standards Chapter 9 Image Compression Standards 9.1 The JPEG Standard 9.2 The JPEG2000 Standard 9.3 The JPEG-LS Standard 1IT342 Image Compression Standards The image standard specifies the codec, which defines how

More information

Audio Watermarking Scheme in MDCT Domain

Audio Watermarking Scheme in MDCT Domain Santosh Kumar Singh and Jyotsna Singh Electronics and Communication Engineering, Netaji Subhas Institute of Technology, Sec. 3, Dwarka, New Delhi, 110078, India. E-mails: ersksingh_mtnl@yahoo.com & jsingh.nsit@gmail.com

More information

Introduction to Audio Watermarking Schemes

Introduction to Audio Watermarking Schemes Introduction to Audio Watermarking Schemes N. Lazic and P. Aarabi, Communication over an Acoustic Channel Using Data Hiding Techniques, IEEE Transactions on Multimedia, Vol. 8, No. 5, October 2006 Multimedia

More information

Detection, localization, and classification of power quality disturbances using discrete wavelet transform technique

Detection, localization, and classification of power quality disturbances using discrete wavelet transform technique From the SelectedWorks of Tarek Ibrahim ElShennawy 2003 Detection, localization, and classification of power quality disturbances using discrete wavelet transform technique Tarek Ibrahim ElShennawy, Dr.

More information

NOISE SHAPING IN AN ITU-T G.711-INTEROPERABLE EMBEDDED CODEC

NOISE SHAPING IN AN ITU-T G.711-INTEROPERABLE EMBEDDED CODEC NOISE SHAPING IN AN ITU-T G.711-INTEROPERABLE EMBEDDED CODEC Jimmy Lapierre 1, Roch Lefebvre 1, Bruno Bessette 1, Vladimir Malenovsky 1, Redwan Salami 2 1 Université de Sherbrooke, Sherbrooke (Québec),

More information

RECENTLY, there has been an increasing interest in noisy

RECENTLY, there has been an increasing interest in noisy IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 52, NO. 9, SEPTEMBER 2005 535 Warped Discrete Cosine Transform-Based Noisy Speech Enhancement Joon-Hyuk Chang, Member, IEEE Abstract In

More information

Speech Enhancement using Wiener filtering

Speech Enhancement using Wiener filtering Speech Enhancement using Wiener filtering S. Chirtmay and M. Tahernezhadi Department of Electrical Engineering Northern Illinois University DeKalb, IL 60115 ABSTRACT The problem of reducing the disturbing

More information

Nonuniform multi level crossing for signal reconstruction

Nonuniform multi level crossing for signal reconstruction 6 Nonuniform multi level crossing for signal reconstruction 6.1 Introduction In recent years, there has been considerable interest in level crossing algorithms for sampling continuous time signals. Driven

More information

Introduction of Audio and Music

Introduction of Audio and Music 1 Introduction of Audio and Music Wei-Ta Chu 2009/12/3 Outline 2 Introduction of Audio Signals Introduction of Music 3 Introduction of Audio Signals Wei-Ta Chu 2009/12/3 Li and Drew, Fundamentals of Multimedia,

More information

Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 4: 7 Feb A. Faulkner.

Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 4: 7 Feb A. Faulkner. Perception of pitch BSc Audiology/MSc SHS Psychoacoustics wk 4: 7 Feb 2008. A. Faulkner. See Moore, BCJ Introduction to the Psychology of Hearing, Chapter 5. Or Plack CJ The Sense of Hearing Lawrence Erlbaum,

More information

10 Speech and Audio Signals

10 Speech and Audio Signals 0 Speech and Audio Signals Introduction Speech and audio signals are normally converted into PCM, which can be stored or transmitted as a PCM code, or compressed to reduce the number of bits used to code

More information

Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 5: 12 Feb A. Faulkner.

Perception of pitch. Definitions. Why is pitch important? BSc Audiology/MSc SHS Psychoacoustics wk 5: 12 Feb A. Faulkner. Perception of pitch BSc Audiology/MSc SHS Psychoacoustics wk 5: 12 Feb 2009. A. Faulkner. See Moore, BCJ Introduction to the Psychology of Hearing, Chapter 5. Or Plack CJ The Sense of Hearing Lawrence

More information

ScienceDirect. Unsupervised Speech Segregation Using Pitch Information and Time Frequency Masking

ScienceDirect. Unsupervised Speech Segregation Using Pitch Information and Time Frequency Masking Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 46 (2015 ) 122 126 International Conference on Information and Communication Technologies (ICICT 2014) Unsupervised Speech

More information

Transcoding of Narrowband to Wideband Speech

Transcoding of Narrowband to Wideband Speech University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 2005 Transcoding of Narrowband to Wideband Speech Christian H. Ritz University

More information

IN RECENT YEARS, there has been a great deal of interest

IN RECENT YEARS, there has been a great deal of interest IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL 12, NO 1, JANUARY 2004 9 Signal Modification for Robust Speech Coding Nam Soo Kim, Member, IEEE, and Joon-Hyuk Chang, Member, IEEE Abstract Usually,

More information

techniques are means of reducing the bandwidth needed to represent the human voice. In mobile

techniques are means of reducing the bandwidth needed to represent the human voice. In mobile 8 2. LITERATURE SURVEY The available radio spectrum for the wireless radio communication is very limited hence to accommodate maximum number of users the speech is compressed. The speech compression techniques

More information

Speech Coding using Linear Prediction

Speech Coding using Linear Prediction Speech Coding using Linear Prediction Jesper Kjær Nielsen Aalborg University and Bang & Olufsen jkn@es.aau.dk September 10, 2015 1 Background Speech is generated when air is pushed from the lungs through

More information

EC 2301 Digital communication Question bank

EC 2301 Digital communication Question bank EC 2301 Digital communication Question bank UNIT I Digital communication system 2 marks 1.Draw block diagram of digital communication system. Information source and input transducer formatter Source encoder

More information

A Closed-loop Multimode Variable Bit Rate Characteristic Waveform Interpolation Coder

A Closed-loop Multimode Variable Bit Rate Characteristic Waveform Interpolation Coder A Closed-loop Multimode Variable Bit Rate Characteristic Waveform Interpolation Coder Jing Wang, Jingg Kuang, and Shenghui Zhao Research Center of Digital Communication Technology,Department of Electronic

More information

High capacity robust audio watermarking scheme based on DWT transform

High capacity robust audio watermarking scheme based on DWT transform High capacity robust audio watermarking scheme based on DWT transform Davod Zangene * (Sama technical and vocational training college, Islamic Azad University, Mahshahr Branch, Mahshahr, Iran) davodzangene@mail.com

More information

Wavelet-based image compression

Wavelet-based image compression Institut Mines-Telecom Wavelet-based image compression Marco Cagnazzo Multimedia Compression Outline Introduction Discrete wavelet transform and multiresolution analysis Filter banks and DWT Multiresolution

More information

Efficient Image Compression Technique using JPEG2000 with Adaptive Threshold

Efficient Image Compression Technique using JPEG2000 with Adaptive Threshold Efficient Image Compression Technique using JPEG2000 with Adaptive Threshold Md. Masudur Rahman Mawlana Bhashani Science and Technology University Santosh, Tangail-1902 (Bangladesh) Mohammad Motiur Rahman

More information

REAL-TIME BROADBAND NOISE REDUCTION

REAL-TIME BROADBAND NOISE REDUCTION REAL-TIME BROADBAND NOISE REDUCTION Robert Hoeldrich and Markus Lorber Institute of Electronic Music Graz Jakoministrasse 3-5, A-8010 Graz, Austria email: robert.hoeldrich@mhsg.ac.at Abstract A real-time

More information

Advances in Applied and Pure Mathematics

Advances in Applied and Pure Mathematics Enhancement of speech signal based on application of the Maximum a Posterior Estimator of Magnitude-Squared Spectrum in Stationary Bionic Wavelet Domain MOURAD TALBI, ANIS BEN AICHA 1 mouradtalbi196@yahoo.fr,

More information

Signal Processing for Speech Applications - Part 2-1. Signal Processing For Speech Applications - Part 2

Signal Processing for Speech Applications - Part 2-1. Signal Processing For Speech Applications - Part 2 Signal Processing for Speech Applications - Part 2-1 Signal Processing For Speech Applications - Part 2 May 14, 2013 Signal Processing for Speech Applications - Part 2-2 References Huang et al., Chapter

More information

Speech Compression for Better Audibility Using Wavelet Transformation with Adaptive Kalman Filtering

Speech Compression for Better Audibility Using Wavelet Transformation with Adaptive Kalman Filtering Speech Compression for Better Audibility Using Wavelet Transformation with Adaptive Kalman Filtering P. Sunitha 1, Satya Prasad Chitneedi 2 1 Assoc. Professor, Department of ECE, Pragathi Engineering College,

More information

Simulation of Conjugate Structure Algebraic Code Excited Linear Prediction Speech Coder

Simulation of Conjugate Structure Algebraic Code Excited Linear Prediction Speech Coder COMPUSOFT, An international journal of advanced computer technology, 3 (3), March-204 (Volume-III, Issue-III) ISSN:2320-0790 Simulation of Conjugate Structure Algebraic Code Excited Linear Prediction Speech

More information

Speech Synthesis; Pitch Detection and Vocoders

Speech Synthesis; Pitch Detection and Vocoders Speech Synthesis; Pitch Detection and Vocoders Tai-Shih Chi ( 冀泰石 ) Department of Communication Engineering National Chiao Tung University May. 29, 2008 Speech Synthesis Basic components of the text-to-speech

More information

A COMPARATIVE ANALYSIS OF DCT AND DWT BASED FOR IMAGE COMPRESSION ON FPGA

A COMPARATIVE ANALYSIS OF DCT AND DWT BASED FOR IMAGE COMPRESSION ON FPGA International Journal of Applied Engineering Research and Development (IJAERD) ISSN:2250 1584 Vol.2, Issue 1 (2012) 13-21 TJPRC Pvt. Ltd., A COMPARATIVE ANALYSIS OF DCT AND DWT BASED FOR IMAGE COMPRESSION

More information

New algorithm for QMF Banks Design and Its Application in Speech Compression using DWT

New algorithm for QMF Banks Design and Its Application in Speech Compression using DWT 86 The International Arab Journal of Information Technology, Vol. 1, No.1, January 015 New algorithm for QMF Banks Design and Its Application in Speech Compression using DWT Noureddine Aloui, Chafik Barnoussi

More information

(Refer Slide Time: 3:11)

(Refer Slide Time: 3:11) Digital Communication. Professor Surendra Prasad. Department of Electrical Engineering. Indian Institute of Technology, Delhi. Lecture-2. Digital Representation of Analog Signals: Delta Modulation. Professor:

More information

TE 302 DISCRETE SIGNALS AND SYSTEMS. Chapter 1: INTRODUCTION

TE 302 DISCRETE SIGNALS AND SYSTEMS. Chapter 1: INTRODUCTION TE 302 DISCRETE SIGNALS AND SYSTEMS Study on the behavior and processing of information bearing functions as they are currently used in human communication and the systems involved. Chapter 1: INTRODUCTION

More information

Spatial Audio Transmission Technology for Multi-point Mobile Voice Chat

Spatial Audio Transmission Technology for Multi-point Mobile Voice Chat Audio Transmission Technology for Multi-point Mobile Voice Chat Voice Chat Multi-channel Coding Binaural Signal Processing Audio Transmission Technology for Multi-point Mobile Voice Chat We have developed

More information

Chapter 4. Digital Audio Representation CS 3570

Chapter 4. Digital Audio Representation CS 3570 Chapter 4. Digital Audio Representation CS 3570 1 Objectives Be able to apply the Nyquist theorem to understand digital audio aliasing. Understand how dithering and noise shaping are done. Understand the

More information

NOISE ESTIMATION IN A SINGLE CHANNEL

NOISE ESTIMATION IN A SINGLE CHANNEL SPEECH ENHANCEMENT FOR CROSS-TALK INTERFERENCE by Levent M. Arslan and John H.L. Hansen Robust Speech Processing Laboratory Department of Electrical Engineering Box 99 Duke University Durham, North Carolina

More information

Golomb-Rice Coding Optimized via LPC for Frequency Domain Audio Coder

Golomb-Rice Coding Optimized via LPC for Frequency Domain Audio Coder Golomb-Rice Coding Optimized via LPC for Frequency Domain Audio Coder Ryosue Sugiura, Yutaa Kamamoto, Noboru Harada, Hiroazu Kameoa and Taehiro Moriya Graduate School of Information Science and Technology,

More information

EEE 309 Communication Theory

EEE 309 Communication Theory EEE 309 Communication Theory Semester: January 2016 Dr. Md. Farhad Hossain Associate Professor Department of EEE, BUET Email: mfarhadhossain@eee.buet.ac.bd Office: ECE 331, ECE Building Part 05 Pulse Code

More information

HCS 7367 Speech Perception

HCS 7367 Speech Perception HCS 7367 Speech Perception Dr. Peter Assmann Fall 212 Power spectrum model of masking Assumptions: Only frequencies within the passband of the auditory filter contribute to masking. Detection is based

More information

Frequency Domain Analysis for Noise Suppression Using Spectral Processing Methods for Degraded Speech Signal in Speech Enhancement

Frequency Domain Analysis for Noise Suppression Using Spectral Processing Methods for Degraded Speech Signal in Speech Enhancement Frequency Domain Analysis for Noise Suppression Using Spectral Processing Methods for Degraded Speech Signal in Speech Enhancement 1 Zeeshan Hashmi Khateeb, 2 Gopalaiah 1,2 Department of Instrumentation

More information

An objective method for evaluating data hiding in pitch gain and pitch delay parameters of the AMR codec

An objective method for evaluating data hiding in pitch gain and pitch delay parameters of the AMR codec An objective method for evaluating data hiding in pitch gain and pitch delay parameters of the AMR codec Akira Nishimura 1 1 Department of Media and Cultural Studies, Tokyo University of Information Sciences,

More information

Speech Enhancement Based On Noise Reduction

Speech Enhancement Based On Noise Reduction Speech Enhancement Based On Noise Reduction Kundan Kumar Singh Electrical Engineering Department University Of Rochester ksingh11@z.rochester.edu ABSTRACT This paper addresses the problem of signal distortion

More information

I D I A P R E S E A R C H R E P O R T. June published in Interspeech 2008

I D I A P R E S E A R C H R E P O R T. June published in Interspeech 2008 R E S E A R C H R E P O R T I D I A P Spectral Noise Shaping: Improvements in Speech/Audio Codec Based on Linear Prediction in Spectral Domain Sriram Ganapathy a b Petr Motlicek a Hynek Hermansky a b Harinath

More information

Performance study of Text-independent Speaker identification system using MFCC & IMFCC for Telephone and Microphone Speeches

Performance study of Text-independent Speaker identification system using MFCC & IMFCC for Telephone and Microphone Speeches Performance study of Text-independent Speaker identification system using & I for Telephone and Microphone Speeches Ruchi Chaudhary, National Technical Research Organization Abstract: A state-of-the-art

More information

International Journal of Modern Trends in Engineering and Research e-issn No.: , Date: 2-4 July, 2015

International Journal of Modern Trends in Engineering and Research   e-issn No.: , Date: 2-4 July, 2015 International Journal of Modern Trends in Engineering and Research www.ijmter.com e-issn No.:2349-9745, Date: 2-4 July, 2015 Analysis of Speech Signal Using Graphic User Interface Solly Joy 1, Savitha

More information

ELEC9344:Speech & Audio Processing. Chapter 13 (Week 13) Professor E. Ambikairajah. UNSW, Australia. Auditory Masking

ELEC9344:Speech & Audio Processing. Chapter 13 (Week 13) Professor E. Ambikairajah. UNSW, Australia. Auditory Masking ELEC9344:Speech & Audio Processing Chapter 13 (Week 13) Auditory Masking Anatomy of the ear The ear divided into three sections: The outer Middle Inner ear (see next slide) The outer ear is terminated

More information

Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter

Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter Ching-Ta Lu, Kun-Fu Tseng 2, Chih-Tsung Chen 2 Department of Information Communication, Asia University, Taichung, Taiwan, ROC

More information

Encoding a Hidden Digital Signature onto an Audio Signal Using Psychoacoustic Masking

Encoding a Hidden Digital Signature onto an Audio Signal Using Psychoacoustic Masking The 7th International Conference on Signal Processing Applications & Technology, Boston MA, pp. 476-480, 7-10 October 1996. Encoding a Hidden Digital Signature onto an Audio Signal Using Psychoacoustic

More information

arxiv: v1 [cs.it] 9 Mar 2016

arxiv: v1 [cs.it] 9 Mar 2016 A Novel Design of Linear Phase Non-uniform Digital Filter Banks arxiv:163.78v1 [cs.it] 9 Mar 16 Sakthivel V, Elizabeth Elias Department of Electronics and Communication Engineering, National Institute

More information

Speech/Music Change Point Detection using Sonogram and AANN

Speech/Music Change Point Detection using Sonogram and AANN International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 6, Number 1 (2016), pp. 45-49 International Research Publications House http://www. irphouse.com Speech/Music Change

More information

MPEG-4 Structured Audio Systems

MPEG-4 Structured Audio Systems MPEG-4 Structured Audio Systems Mihir Anandpara The University of Texas at Austin anandpar@ece.utexas.edu 1 Abstract The MPEG-4 standard has been proposed to provide high quality audio and video content

More information

Wavelet Speech Enhancement based on the Teager Energy Operator

Wavelet Speech Enhancement based on the Teager Energy Operator Wavelet Speech Enhancement based on the Teager Energy Operator Mohammed Bahoura and Jean Rouat ERMETIS, DSA, Université du Québec à Chicoutimi, Chicoutimi, Québec, G7H 2B1, Canada. Abstract We propose

More information

Distributed Speech Recognition Standardization Activity

Distributed Speech Recognition Standardization Activity Distributed Speech Recognition Standardization Activity Alex Sorin, Ron Hoory, Dan Chazan Telecom and Media Systems Group June 30, 2003 IBM Research Lab in Haifa Advanced Speech Enabled Services ASR App

More information

Modulator Domain Adaptive Gain Equalizer for Speech Enhancement

Modulator Domain Adaptive Gain Equalizer for Speech Enhancement Modulator Domain Adaptive Gain Equalizer for Speech Enhancement Ravindra d. Dhage, Prof. Pravinkumar R.Badadapure Abstract M.E Scholar, Professor. This paper presents a speech enhancement method for personal

More information

ECE 556 BASICS OF DIGITAL SPEECH PROCESSING. Assıst.Prof.Dr. Selma ÖZAYDIN Spring Term-2017 Lecture 2

ECE 556 BASICS OF DIGITAL SPEECH PROCESSING. Assıst.Prof.Dr. Selma ÖZAYDIN Spring Term-2017 Lecture 2 ECE 556 BASICS OF DIGITAL SPEECH PROCESSING Assıst.Prof.Dr. Selma ÖZAYDIN Spring Term-2017 Lecture 2 Analog Sound to Digital Sound Characteristics of Sound Amplitude Wavelength (w) Frequency ( ) Timbre

More information

General outline of HF digital radiotelephone systems

General outline of HF digital radiotelephone systems Rec. ITU-R F.111-1 1 RECOMMENDATION ITU-R F.111-1* DIGITIZED SPEECH TRANSMISSIONS FOR SYSTEMS OPERATING BELOW ABOUT 30 MHz (Question ITU-R 164/9) Rec. ITU-R F.111-1 (1994-1995) The ITU Radiocommunication

More information

Introduction to Wavelet Transform. Chapter 7 Instructor: Hossein Pourghassem

Introduction to Wavelet Transform. Chapter 7 Instructor: Hossein Pourghassem Introduction to Wavelet Transform Chapter 7 Instructor: Hossein Pourghassem Introduction Most of the signals in practice, are TIME-DOMAIN signals in their raw format. It means that measured signal is a

More information

YOUR WAVELET BASED PITCH DETECTION AND VOICED/UNVOICED DECISION

YOUR WAVELET BASED PITCH DETECTION AND VOICED/UNVOICED DECISION American Journal of Engineering and Technology Research Vol. 3, No., 03 YOUR WAVELET BASED PITCH DETECTION AND VOICED/UNVOICED DECISION Yinan Kong Department of Electronic Engineering, Macquarie University

More information

Downloaded from 1

Downloaded from  1 VII SEMESTER FINAL EXAMINATION-2004 Attempt ALL questions. Q. [1] How does Digital communication System differ from Analog systems? Draw functional block diagram of DCS and explain the significance of

More information

COMBINING ADVANCED SINUSOIDAL AND WAVEFORM MATCHING MODELS FOR PARAMETRIC AUDIO/SPEECH CODING

COMBINING ADVANCED SINUSOIDAL AND WAVEFORM MATCHING MODELS FOR PARAMETRIC AUDIO/SPEECH CODING 17th European Signal Processing Conference (EUSIPCO 29) Glasgow, Scotland, August 24-28, 29 COMBINING ADVANCED SINUSOIDAL AND WAVEFORM MATCHING MODELS FOR PARAMETRIC AUDIO/SPEECH CODING Alexey Petrovsky

More information

THE STATISTICAL ANALYSIS OF AUDIO WATERMARKING USING THE DISCRETE WAVELETS TRANSFORM AND SINGULAR VALUE DECOMPOSITION

THE STATISTICAL ANALYSIS OF AUDIO WATERMARKING USING THE DISCRETE WAVELETS TRANSFORM AND SINGULAR VALUE DECOMPOSITION THE STATISTICAL ANALYSIS OF AUDIO WATERMARKING USING THE DISCRETE WAVELETS TRANSFORM AND SINGULAR VALUE DECOMPOSITION Mr. Jaykumar. S. Dhage Assistant Professor, Department of Computer Science & Engineering

More information