Speech Coding using Linear Prediction

Size: px
Start display at page:

Download "Speech Coding using Linear Prediction"

Transcription

1 Speech Coding using Linear Prediction Jesper Kjær Nielsen Aalborg University and Bang & Olufsen September 10, Background Speech is generated when air is pushed from the lungs through the vocal tract. The produced speech is typically divided into voiced and unvoiced sounds. If the vocal chords in the vocal tract vibrate rapidly when air is pushed through them, the sound is voiced. Examples of voiced sounds are vocals such as a, o, and i. On the other hand, if the vocal chords are constantly open, the sound is unvoiced. Examples of unvoiced sounds are p, s, and q. Much more about human speech production can be found in [1]. The two kinds of speech sounds can be used as sources in the mathematical model of speech production in Fig. 1. In the model, either an impulse train or a white noise process is filtered through an all-pole filter with system response H 1 (z). The impulse train is the source signal for the voiced speech sounds, and the white noise is the source signal for the unvoiced speech sounds. The all-pole filter models how the cavities in the throat, mouth, and nose as well as the position of the tongue change the spectral envelope of the pulse train and the white noise process. For example, if a person is having a cold, the cavities in the nose change, and this changes the spectral envelope of the produced speech. A method called linear prediction [?,2 4] can be used to estimate the filter coefficient of this all-pole filter from a recorded segment for speech, and this is exploited in many practical applications such as speech compression in digital cellular technology and voice over IP. To understand why the mathematical model in Fig. 1 is so useful for speech compression, consider a simple telephone call between Peter and Karen, say. When Peter is speaking, the microphone in the telephone converts the pressure variations (Peter s voice) into voltage variations. Both the pressures and the voltages are continuous in time and in amplitude. In the telephone, an analogue-to-digital converter (ADC) measures the value of the voltage at a uniform rate called the sampling frequency. One such value is typically called a sample. For telephone applications, this sampling frequency is typically 8000 Hz. Moreover, the measured voltage is also rounded to the nearest value on a grid so that the measured voltages can be represented with a fixed number of bits. This rounding is called quantisation. If we assume that 8 bits are used to represent the value 1

2 Pulse train : Voiced speech : Unvoiced speech H -1 (z) Speech x(n) White noise Figure 1: A popular speech model. of each measured value of the voltage, the total bitrate b is b = 8 khz 8 bits = 64 kbit/s. (1) Thus, Peter s telephone has so far converted the pressure variations generated by Peter s voice into a sampled and quantised signal which we denote as x(n) where n is the time index or sample number. To simplify things, the speech production is often modelled so that it generates x(n) directly. This is also done in Fig. 1. In principle, Peter s phone could now transmit the sampled and quantised speech waveform to Karen s phone at a bitrate of 64 kbit/s. However, the model in Fig. 1 can be used to compress the speech data so that the bitrate is reduced by a significant amount. The compression can be performed in the following way. 1. Divide the speech signal into small segments of 20 ms, say. 2. For each segment, estimate the filter coefficients of the all-pole filter from the segment of speech. Typically, approximately 12 filter coefficients are used in the filter. 3. Determine whether the speech segment is voiced or unvoiced. 4. If the segment is unvoiced, estimate the variance of the white noise process responsible for generating the speech signal. If the signal is voiced, estimate the amplitude and the distance between the pulses in the pulse train. 5. Transmit the estimated filter coefficients, the speech type, and the source signal parameters. 2

3 At a sampling frequency of 20 khz, a speech segment of 20 ms corresponds to 160 samples. For a resolution of 8 bits/sample, 1280 bits must therefore be used to represent this speech segment if no compression is used. If we instead compress the 12 filter coefficients and the source signal parameters (also with 8 bit), only 112 bits are required for the speech segment. Thus, we achieve a compression factor of more than a factor of 10. The compression scheme described above is illustrated in Fig. 2. Figure 2: Block diagram of a generic speech coding application. In practical telephone applications, the source signals are encoded in a more complex way than described above. However, state-of-the-art speech coders such as code excited linear prediction (CELP) coders are based on the same principles and the model as above and achieve a bit rate of approximate 4.8 kbit/s. 2 Problem In the project, we will focus on estimating the filter coefficients of the all-pole filter from the speech signal so that the residual energy e 2 (n) averaged over the whole segment of data is as small as possible. This problem is the so-called linear prediction problem, and we can formulate it mathematically in the following way. Assume that a speech segment of data consists in N data points which we model as x(n) = a 1 x(n 1) + a 2 x(n 2) + + a p x(n p) + e(n) (2) p = a i x(n i) + e(n) (3) i=1 = h T (n)a + e(n) (4) 3

4 for n = 0, 1,..., N 1 where ( ) T denotes matrix transposition and a = [ a 1 a 2 a p ] T (5) h(n) = [ x(n 1) x(n 2) x(n p) ] T. (6) When the residual e(n) is a white and a wide sense stationary (WSS) process, the speech signal is modelled as a so-called autoregressive random process of order p. This means that if the speech signal was indeed an autoregressive process, the output of the filter with the system response H(z) would be white and WSS, and this is the main idea behind speech compression based on linear prediction. As illustrated in Fig. 2 and described above, the speech signal is not transmitted directly. Instead, the filter coefficients a and the residuals are transmitted as this can be performed at a much lower bit rate than by direcly transmitting the speech signal. From a mathematical perspectively, the autoregressive model can be formulated as a linear normal model. In matrix notation, the linear model can be written as where x = Ha + e (7) x = [ x(0) x(1) x(n 1) ] T e = [ e(0) e(1) e(n 1) ] T H = [ h(0) h(1) h(n 1) ] T. (10) Under some technical conditions on the matrix H, minimising the two-norm of e w.r.t. a leads to the so-called least squares estimate of a given by â = (H T H) 1 H T x. (11) Moreover, this estimate can be shown to be the so-called conditional maximum likelihood estimate. From an engineering perspective, the least squares estimate above might be problematic since the estimated filter coefficients in â are not guaranteed to produce a stable all-pole filter. That is, some of the poles of the system response H 1 (z) might be outside the unit circle. This can be compensated for in various ways by reflecting the problematic poles around the unit circle or by defining the matrix H so that H T H has a so-called Toeplitz structure. 2.1 Packet-loss concealment Returning to the example of Peter and Karen s telephone conversation, the estimated filter coefficients and source signal parameters from each segment are packaged in a speech packet and transmitted from one phone to the other. In Fig. 2, this is illustrated by the two antennas in the transmission part of the figure. Unfortunately, speech packets might be lost, corrupted, or delayed (8) (9) 4

5 in the transmission, and this will produce an audible click when the receiving telephone plays back the received speech. However, since correlation exists between the adjacent speech segments, the content of a missing speech packet can to some extend be predicted from adjacent speech packets. This prediction problem is called packet-loss concealment and can also be a part of the project. 3 Data Set The data set is based on a secret track of male speech and consists of three files, all sampled at 8000 Hz. - unvoicedsegment.wav: 120 ms of unvoiced speech. Specifically, the segment is the s-sound from the word score. - voicedsegment.wav: 260 ms of voiced speech. Specifically, the segment is the ore-sound from the word score. - sentences.wav: A sentence of approximately 5 seconds of speech containing a mixture of voiced and unvoiced speech. The unvoiced and voiced segments are taken from the word score in this file. If you listen carefully to the last file, you will hear to audible clicks. These clicks are caused by two missing audio segments of 20 ms. These segments are located directly after the voiced and unvoiced segments, respectively, and can be approximated using packet loss concealment. In the file sentences.wav, the missing packets are from sample number to and from sample number to In MATLAB, an audio file can be loaded via [data, samplingfreq] = audioread( filename.wav ); 4 Tools from Courses During the project, the following tools, which you will learn about in the courses on the semester, are needed. Solving a 2-norm optimisation problems. Understand the differences between and similarities of the maximum likelihood estimator and the least square estimator. Model signals as autoregressive processes. Analyse linear normal models. 5

6 References [1] L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals, 1st ed. Prentice-Hall Inc, [2] J. E. Markel and A. H. Gray, Linear prediction of speech. Springer-Verlag New York, Inc., [3] S. Haykin, Adaptive Filter Theory, 4th ed. Prentice Hall, Sep [4] J. R. Deller, J. G. Proakis, and J. H. L. Hansen, Discrete-time processing of speech signals. IEEE New York, NY, USA, [5] D. Giacobello, Sparsity in linear predictive coding of speech, Ph.D. dissertation, Aalborg University,

Adaptive Filters Linear Prediction

Adaptive Filters Linear Prediction Adaptive Filters Gerhard Schmidt Christian-Albrechts-Universität zu Kiel Faculty of Engineering Institute of Electrical and Information Engineering Digital Signal Processing and System Theory Slide 1 Contents

More information

Analysis/synthesis coding

Analysis/synthesis coding TSBK06 speech coding p.1/32 Analysis/synthesis coding Many speech coders are based on a principle called analysis/synthesis coding. Instead of coding a waveform, as is normally done in general audio coders

More information

EE482: Digital Signal Processing Applications

EE482: Digital Signal Processing Applications Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu EE482: Digital Signal Processing Applications Spring 2014 TTh 14:30-15:45 CBC C222 Lecture 12 Speech Signal Processing 14/03/25 http://www.ee.unlv.edu/~b1morris/ee482/

More information

Communications Theory and Engineering

Communications Theory and Engineering Communications Theory and Engineering Master's Degree in Electronic Engineering Sapienza University of Rome A.A. 2018-2019 Speech and telephone speech Based on a voice production model Parametric representation

More information

Vocoder (LPC) Analysis by Variation of Input Parameters and Signals

Vocoder (LPC) Analysis by Variation of Input Parameters and Signals ISCA Journal of Engineering Sciences ISCA J. Engineering Sci. Vocoder (LPC) Analysis by Variation of Input Parameters and Signals Abstract Gupta Rajani, Mehta Alok K. and Tiwari Vebhav Truba College of

More information

EC 6501 DIGITAL COMMUNICATION UNIT - II PART A

EC 6501 DIGITAL COMMUNICATION UNIT - II PART A EC 6501 DIGITAL COMMUNICATION 1.What is the need of prediction filtering? UNIT - II PART A [N/D-16] Prediction filtering is used mostly in audio signal processing and speech processing for representing

More information

APPLICATIONS OF DSP OBJECTIVES

APPLICATIONS OF DSP OBJECTIVES APPLICATIONS OF DSP OBJECTIVES This lecture will discuss the following: Introduce analog and digital waveform coding Introduce Pulse Coded Modulation Consider speech-coding principles Introduce the channel

More information

speech signal S(n). This involves a transformation of S(n) into another signal or a set of signals

speech signal S(n). This involves a transformation of S(n) into another signal or a set of signals 16 3. SPEECH ANALYSIS 3.1 INTRODUCTION TO SPEECH ANALYSIS Many speech processing [22] applications exploits speech production and perception to accomplish speech analysis. By speech analysis we extract

More information

Speech Compression Using Voice Excited Linear Predictive Coding

Speech Compression Using Voice Excited Linear Predictive Coding Speech Compression Using Voice Excited Linear Predictive Coding Ms.Tosha Sen, Ms.Kruti Jay Pancholi PG Student, Asst. Professor, L J I E T, Ahmedabad Abstract : The aim of the thesis is design good quality

More information

Overview of Code Excited Linear Predictive Coder

Overview of Code Excited Linear Predictive Coder Overview of Code Excited Linear Predictive Coder Minal Mulye 1, Sonal Jagtap 2 1 PG Student, 2 Assistant Professor, Department of E&TC, Smt. Kashibai Navale College of Engg, Pune, India Abstract Advances

More information

Wideband Speech Coding & Its Application

Wideband Speech Coding & Its Application Wideband Speech Coding & Its Application Apeksha B. landge. M.E. [student] Aditya Engineering College Beed Prof. Amir Lodhi. Guide & HOD, Aditya Engineering College Beed ABSTRACT: Increasing the bandwidth

More information

17. Delta Modulation

17. Delta Modulation 7. Delta Modulation Introduction So far, we have seen that the pulse-code-modulation (PCM) technique converts analogue signals to digital format for transmission. For speech signals of 3.2kHz bandwidth,

More information

Robust Linear Prediction Analysis for Low Bit-Rate Speech Coding

Robust Linear Prediction Analysis for Low Bit-Rate Speech Coding Robust Linear Prediction Analysis for Low Bit-Rate Speech Coding Nanda Prasetiyo Koestoer B. Eng (Hon) (1998) School of Microelectronic Engineering Faculty of Engineering and Information Technology Griffith

More information

Speech Enhancement: Reduction of Additive Noise in the Digital Processing of Speech

Speech Enhancement: Reduction of Additive Noise in the Digital Processing of Speech Speech Enhancement: Reduction of Additive Noise in the Digital Processing of Speech Project Proposal Avner Halevy Department of Mathematics University of Maryland, College Park ahalevy at math.umd.edu

More information

Mel Spectrum Analysis of Speech Recognition using Single Microphone

Mel Spectrum Analysis of Speech Recognition using Single Microphone International Journal of Engineering Research in Electronics and Communication Mel Spectrum Analysis of Speech Recognition using Single Microphone [1] Lakshmi S.A, [2] Cholavendan M [1] PG Scholar, Sree

More information

Digital Speech Processing and Coding

Digital Speech Processing and Coding ENEE408G Spring 2006 Lecture-2 Digital Speech Processing and Coding Spring 06 Instructor: Shihab Shamma Electrical & Computer Engineering University of Maryland, College Park http://www.ece.umd.edu/class/enee408g/

More information

Speech Coding Technique And Analysis Of Speech Codec Using CS-ACELP

Speech Coding Technique And Analysis Of Speech Codec Using CS-ACELP Speech Coding Technique And Analysis Of Speech Codec Using CS-ACELP Monika S.Yadav Vidarbha Institute of Technology Rashtrasant Tukdoji Maharaj Nagpur University, Nagpur, India monika.yadav@rediffmail.com

More information

Voice Excited Lpc for Speech Compression by V/Uv Classification

Voice Excited Lpc for Speech Compression by V/Uv Classification IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 6, Issue 3, Ver. II (May. -Jun. 2016), PP 65-69 e-issn: 2319 4200, p-issn No. : 2319 4197 www.iosrjournals.org Voice Excited Lpc for Speech

More information

University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005

University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005 University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005 Lecture 5 Slides Jan 26 th, 2005 Outline of Today s Lecture Announcements Filter-bank analysis

More information

HST.582J / 6.555J / J Biomedical Signal and Image Processing Spring 2007

HST.582J / 6.555J / J Biomedical Signal and Image Processing Spring 2007 MIT OpenCourseWare http://ocw.mit.edu HST.582J / 6.555J / 16.456J Biomedical Signal and Image Processing Spring 2007 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

Speech Synthesis using Mel-Cepstral Coefficient Feature

Speech Synthesis using Mel-Cepstral Coefficient Feature Speech Synthesis using Mel-Cepstral Coefficient Feature By Lu Wang Senior Thesis in Electrical Engineering University of Illinois at Urbana-Champaign Advisor: Professor Mark Hasegawa-Johnson May 2018 Abstract

More information

Cepstrum alanysis of speech signals

Cepstrum alanysis of speech signals Cepstrum alanysis of speech signals ELEC-E5520 Speech and language processing methods Spring 2016 Mikko Kurimo 1 /48 Contents Literature and other material Idea and history of cepstrum Cepstrum and LP

More information

Exam in 1TT850, 1E275. Modulation, Demodulation and Coding course

Exam in 1TT850, 1E275. Modulation, Demodulation and Coding course Exam in 1TT850, 1E275 Modulation, Demodulation and Coding course EI, TF, IT programs 16th of August 2004, 14:00-19:00 Signals and systems, Uppsala university Examiner Sorour Falahati office: 018-471 3071

More information

Speech Enhancement using Wiener filtering

Speech Enhancement using Wiener filtering Speech Enhancement using Wiener filtering S. Chirtmay and M. Tahernezhadi Department of Electrical Engineering Northern Illinois University DeKalb, IL 60115 ABSTRACT The problem of reducing the disturbing

More information

Proceedings of Meetings on Acoustics

Proceedings of Meetings on Acoustics Proceedings of Meetings on Acoustics Volume 19, 213 http://acousticalsociety.org/ ICA 213 Montreal Montreal, Canada 2-7 June 213 Signal Processing in Acoustics Session 2pSP: Acoustic Signal Processing

More information

Cellular systems & GSM Wireless Systems, a.a. 2014/2015

Cellular systems & GSM Wireless Systems, a.a. 2014/2015 Cellular systems & GSM Wireless Systems, a.a. 2014/2015 Un. of Rome La Sapienza Chiara Petrioli Department of Computer Science University of Rome Sapienza Italy 2 Voice Coding 3 Speech signals Voice coding:

More information

SGN Audio and Speech Processing

SGN Audio and Speech Processing Introduction 1 Course goals Introduction 2 SGN 14006 Audio and Speech Processing Lectures, Fall 2014 Anssi Klapuri Tampere University of Technology! Learn basics of audio signal processing Basic operations

More information

(i) Understanding the basic concepts of signal modeling, correlation, maximum likelihood estimation, least squares and iterative numerical methods

(i) Understanding the basic concepts of signal modeling, correlation, maximum likelihood estimation, least squares and iterative numerical methods Tools and Applications Chapter Intended Learning Outcomes: (i) Understanding the basic concepts of signal modeling, correlation, maximum likelihood estimation, least squares and iterative numerical methods

More information

techniques are means of reducing the bandwidth needed to represent the human voice. In mobile

techniques are means of reducing the bandwidth needed to represent the human voice. In mobile 8 2. LITERATURE SURVEY The available radio spectrum for the wireless radio communication is very limited hence to accommodate maximum number of users the speech is compressed. The speech compression techniques

More information

Comparison of CELP speech coder with a wavelet method

Comparison of CELP speech coder with a wavelet method University of Kentucky UKnowledge University of Kentucky Master's Theses Graduate School 2006 Comparison of CELP speech coder with a wavelet method Sriram Nagaswamy University of Kentucky, sriramn@gmail.com

More information

E : Lecture 8 Source-Filter Processing. E : Lecture 8 Source-Filter Processing / 21

E : Lecture 8 Source-Filter Processing. E : Lecture 8 Source-Filter Processing / 21 E85.267: Lecture 8 Source-Filter Processing E85.267: Lecture 8 Source-Filter Processing 21-4-1 1 / 21 Source-filter analysis/synthesis n f Spectral envelope Spectral envelope Analysis Source signal n 1

More information

Multimedia Signal Processing: Theory and Applications in Speech, Music and Communications

Multimedia Signal Processing: Theory and Applications in Speech, Music and Communications Brochure More information from http://www.researchandmarkets.com/reports/569388/ Multimedia Signal Processing: Theory and Applications in Speech, Music and Communications Description: Multimedia Signal

More information

Chapter IV THEORY OF CELP CODING

Chapter IV THEORY OF CELP CODING Chapter IV THEORY OF CELP CODING CHAPTER IV THEORY OF CELP CODING 4.1 Introduction Wavefonn coders fail to produce high quality speech at bit rate lower than 16 kbps. Source coders, such as LPC vocoders,

More information

Speech Processing. Undergraduate course code: LASC10061 Postgraduate course code: LASC11065

Speech Processing. Undergraduate course code: LASC10061 Postgraduate course code: LASC11065 Speech Processing Undergraduate course code: LASC10061 Postgraduate course code: LASC11065 All course materials and handouts are the same for both versions. Differences: credits (20 for UG, 10 for PG);

More information

Speech Synthesis; Pitch Detection and Vocoders

Speech Synthesis; Pitch Detection and Vocoders Speech Synthesis; Pitch Detection and Vocoders Tai-Shih Chi ( 冀泰石 ) Department of Communication Engineering National Chiao Tung University May. 29, 2008 Speech Synthesis Basic components of the text-to-speech

More information

Page 0 of 23. MELP Vocoder

Page 0 of 23. MELP Vocoder Page 0 of 23 MELP Vocoder Outline Introduction MELP Vocoder Features Algorithm Description Parameters & Comparison Page 1 of 23 Introduction Traditional pitched-excited LPC vocoders use either a periodic

More information

Simulation of Conjugate Structure Algebraic Code Excited Linear Prediction Speech Coder

Simulation of Conjugate Structure Algebraic Code Excited Linear Prediction Speech Coder COMPUSOFT, An international journal of advanced computer technology, 3 (3), March-204 (Volume-III, Issue-III) ISSN:2320-0790 Simulation of Conjugate Structure Algebraic Code Excited Linear Prediction Speech

More information

Microcomputer Systems 1. Introduction to DSP S

Microcomputer Systems 1. Introduction to DSP S Microcomputer Systems 1 Introduction to DSP S Introduction to DSP s Definition: DSP Digital Signal Processing/Processor It refers to: Theoretical signal processing by digital means (subject of ECE3222,

More information

Surveillance Transmitter of the Future. Abstract

Surveillance Transmitter of the Future. Abstract Surveillance Transmitter of the Future Eric Pauer DTC Communications Inc. Ronald R Young DTC Communications Inc. 486 Amherst Street Nashua, NH 03062, Phone; 603-880-4411, Fax; 603-880-6965 Elliott Lloyd

More information

NOTICE WARNING CONCERNING COPYRIGHT RESTRICTIONS: The copyright law of the United States (title 17, U.S. Code) governs the making of photocopies or

NOTICE WARNING CONCERNING COPYRIGHT RESTRICTIONS: The copyright law of the United States (title 17, U.S. Code) governs the making of photocopies or NOTICE WARNING CONCERNING COPYRIGHT RESTRICTIONS: The copyright law of the United States (title 17, U.S. Code) governs the making of photocopies or other reproductions of copyrighted material. Any copying

More information

COMP 546, Winter 2017 lecture 20 - sound 2

COMP 546, Winter 2017 lecture 20 - sound 2 Today we will examine two types of sounds that are of great interest: music and speech. We will see how a frequency domain analysis is fundamental to both. Musical sounds Let s begin by briefly considering

More information

Waveform Encoding - PCM. BY: Dr.AHMED ALKHAYYAT. Chapter Two

Waveform Encoding - PCM. BY: Dr.AHMED ALKHAYYAT. Chapter Two Chapter Two Layout: 1. Introduction. 2. Pulse Code Modulation (PCM). 3. Differential Pulse Code Modulation (DPCM). 4. Delta modulation. 5. Adaptive delta modulation. 6. Sigma Delta Modulation (SDM). 7.

More information

Pitch Period of Speech Signals Preface, Determination and Transformation

Pitch Period of Speech Signals Preface, Determination and Transformation Pitch Period of Speech Signals Preface, Determination and Transformation Mohammad Hossein Saeidinezhad 1, Bahareh Karamsichani 2, Ehsan Movahedi 3 1 Islamic Azad university, Najafabad Branch, Saidinezhad@yahoo.com

More information

AN AUTOREGRESSIVE BASED LFM REVERBERATION SUPPRESSION FOR RADAR AND SONAR APPLICATIONS

AN AUTOREGRESSIVE BASED LFM REVERBERATION SUPPRESSION FOR RADAR AND SONAR APPLICATIONS AN AUTOREGRESSIVE BASED LFM REVERBERATION SUPPRESSION FOR RADAR AND SONAR APPLICATIONS MrPMohan Krishna 1, AJhansi Lakshmi 2, GAnusha 3, BYamuna 4, ASudha Rani 5 1 Asst Professor, 2,3,4,5 Student, Dept

More information

SILK Speech Codec. TDP 10/11 Xavier Anguera I Ciro Gracia

SILK Speech Codec. TDP 10/11 Xavier Anguera I Ciro Gracia SILK Speech Codec TDP 10/11 Xavier Anguera I Ciro Gracia SILK Codec Audio codec desenvolupat per Skype (Febrer 2009) Previament usaven el codec SVOPC (Sinusoidal Voice Over Packet Coder): LPC analysis.

More information

Audio /Video Signal Processing. Lecture 1, Organisation, A/D conversion, Sampling Gerald Schuller, TU Ilmenau

Audio /Video Signal Processing. Lecture 1, Organisation, A/D conversion, Sampling Gerald Schuller, TU Ilmenau Audio /Video Signal Processing Lecture 1, Organisation, A/D conversion, Sampling Gerald Schuller, TU Ilmenau Gerald Schuller gerald.schuller@tu ilmenau.de Organisation: Lecture each week, 2SWS, Seminar

More information

Digital Signal Processing

Digital Signal Processing COMP ENG 4TL4: Digital Signal Processing Notes for Lecture #27 Tuesday, November 11, 23 6. SPECTRAL ANALYSIS AND ESTIMATION 6.1 Introduction to Spectral Analysis and Estimation The discrete-time Fourier

More information

Enhanced Waveform Interpolative Coding at 4 kbps

Enhanced Waveform Interpolative Coding at 4 kbps Enhanced Waveform Interpolative Coding at 4 kbps Oded Gottesman, and Allen Gersho Signal Compression Lab. University of California, Santa Barbara E-mail: [oded, gersho]@scl.ece.ucsb.edu Signal Compression

More information

10 Speech and Audio Signals

10 Speech and Audio Signals 0 Speech and Audio Signals Introduction Speech and audio signals are normally converted into PCM, which can be stored or transmitted as a PCM code, or compressed to reduce the number of bits used to code

More information

International Journal of Modern Trends in Engineering and Research e-issn No.: , Date: 2-4 July, 2015

International Journal of Modern Trends in Engineering and Research   e-issn No.: , Date: 2-4 July, 2015 International Journal of Modern Trends in Engineering and Research www.ijmter.com e-issn No.:2349-9745, Date: 2-4 July, 2015 Analysis of Speech Signal Using Graphic User Interface Solly Joy 1, Savitha

More information

SGN Audio and Speech Processing

SGN Audio and Speech Processing SGN 14006 Audio and Speech Processing Introduction 1 Course goals Introduction 2! Learn basics of audio signal processing Basic operations and their underlying ideas and principles Give basic skills although

More information

Auditory modelling for speech processing in the perceptual domain

Auditory modelling for speech processing in the perceptual domain ANZIAM J. 45 (E) ppc964 C980, 2004 C964 Auditory modelling for speech processing in the perceptual domain L. Lin E. Ambikairajah W. H. Holmes (Received 8 August 2003; revised 28 January 2004) Abstract

More information

Digital Signal Representation of Speech Signal

Digital Signal Representation of Speech Signal Digital Signal Representation of Speech Signal Mrs. Smita Chopde 1, Mrs. Pushpa U S 2 1,2. EXTC Department, Mumbai University Abstract Delta modulation is a waveform coding techniques which the data rate

More information

Signal Analysis. Peak Detection. Envelope Follower (Amplitude detection) Music 270a: Signal Analysis

Signal Analysis. Peak Detection. Envelope Follower (Amplitude detection) Music 270a: Signal Analysis Signal Analysis Music 27a: Signal Analysis Tamara Smyth, trsmyth@ucsd.edu Department of Music, University of California, San Diego (UCSD November 23, 215 Some tools we may want to use to automate analysis

More information

OF HIGH QUALITY AUDIO SIGNALS

OF HIGH QUALITY AUDIO SIGNALS COMPRESSION OF HIGH QUALITY AUDIO SIGNALS 1. Description of the problem Fairlight Instruments, who brought the problem to the MISG, have developed a high quality "Computer Musical Instrument" (CMI) which

More information

General outline of HF digital radiotelephone systems

General outline of HF digital radiotelephone systems Rec. ITU-R F.111-1 1 RECOMMENDATION ITU-R F.111-1* DIGITIZED SPEECH TRANSMISSIONS FOR SYSTEMS OPERATING BELOW ABOUT 30 MHz (Question ITU-R 164/9) Rec. ITU-R F.111-1 (1994-1995) The ITU Radiocommunication

More information

Detection, Interpolation and Cancellation Algorithms for GSM burst Removal for Forensic Audio

Detection, Interpolation and Cancellation Algorithms for GSM burst Removal for Forensic Audio >Bitzer and Rademacher (Paper Nr. 21)< 1 Detection, Interpolation and Cancellation Algorithms for GSM burst Removal for Forensic Audio Joerg Bitzer and Jan Rademacher Abstract One increasing problem for

More information

Determination of instants of significant excitation in speech using Hilbert envelope and group delay function

Determination of instants of significant excitation in speech using Hilbert envelope and group delay function Determination of instants of significant excitation in speech using Hilbert envelope and group delay function by K. Sreenivasa Rao, S. R. M. Prasanna, B.Yegnanarayana in IEEE Signal Processing Letters,

More information

Super-Wideband Fine Spectrum Quantization for Low-rate High-Quality MDCT Coding Mode of The 3GPP EVS Codec

Super-Wideband Fine Spectrum Quantization for Low-rate High-Quality MDCT Coding Mode of The 3GPP EVS Codec Super-Wideband Fine Spectrum Quantization for Low-rate High-Quality DCT Coding ode of The 3GPP EVS Codec Presented by Srikanth Nagisetty, Hiroyuki Ehara 15 th Dec 2015 Topics of this Presentation Background

More information

Applications of Music Processing

Applications of Music Processing Lecture Music Processing Applications of Music Processing Christian Dittmar International Audio Laboratories Erlangen christian.dittmar@audiolabs-erlangen.de Singing Voice Detection Important pre-requisite

More information

Chapter 2: Digitization of Sound

Chapter 2: Digitization of Sound Chapter 2: Digitization of Sound Acoustics pressure waves are converted to electrical signals by use of a microphone. The output signal from the microphone is an analog signal, i.e., a continuous-valued

More information

NOISE ESTIMATION IN A SINGLE CHANNEL

NOISE ESTIMATION IN A SINGLE CHANNEL SPEECH ENHANCEMENT FOR CROSS-TALK INTERFERENCE by Levent M. Arslan and John H.L. Hansen Robust Speech Processing Laboratory Department of Electrical Engineering Box 99 Duke University Durham, North Carolina

More information

Fundamental frequency estimation of speech signals using MUSIC algorithm

Fundamental frequency estimation of speech signals using MUSIC algorithm Acoust. Sci. & Tech. 22, 4 (2) TECHNICAL REPORT Fundamental frequency estimation of speech signals using MUSIC algorithm Takahiro Murakami and Yoshihisa Ishida School of Science and Technology, Meiji University,,

More information

Speech Enhancement in Presence of Noise using Spectral Subtraction and Wiener Filter

Speech Enhancement in Presence of Noise using Spectral Subtraction and Wiener Filter Speech Enhancement in Presence of Noise using Spectral Subtraction and Wiener Filter 1 Gupteswar Sahu, 2 D. Arun Kumar, 3 M. Bala Krishna and 4 Jami Venkata Suman Assistant Professor, Department of ECE,

More information

Audio Signal Compression using DCT and LPC Techniques

Audio Signal Compression using DCT and LPC Techniques Audio Signal Compression using DCT and LPC Techniques P. Sandhya Rani#1, D.Nanaji#2, V.Ramesh#3,K.V.S. Kiran#4 #Student, Department of ECE, Lendi Institute Of Engineering And Technology, Vizianagaram,

More information

EEE 309 Communication Theory

EEE 309 Communication Theory EEE 309 Communication Theory Semester: January 2016 Dr. Md. Farhad Hossain Associate Professor Department of EEE, BUET Email: mfarhadhossain@eee.buet.ac.bd Office: ECE 331, ECE Building Part 05 Pulse Code

More information

The Channel Vocoder (analyzer):

The Channel Vocoder (analyzer): Vocoders 1 The Channel Vocoder (analyzer): The channel vocoder employs a bank of bandpass filters, Each having a bandwidth between 100 Hz and 300 Hz. Typically, 16-20 linear phase FIR filter are used.

More information

Isolated Digit Recognition Using MFCC AND DTW

Isolated Digit Recognition Using MFCC AND DTW MarutiLimkar a, RamaRao b & VidyaSagvekar c a Terna collegeof Engineering, Department of Electronics Engineering, Mumbai University, India b Vidyalankar Institute of Technology, Department ofelectronics

More information

Improving Sound Quality by Bandwidth Extension

Improving Sound Quality by Bandwidth Extension International Journal of Scientific & Engineering Research, Volume 3, Issue 9, September-212 Improving Sound Quality by Bandwidth Extension M. Pradeepa, M.Tech, Assistant Professor Abstract - In recent

More information

COMPRESSIVE SAMPLING OF SPEECH SIGNALS. Mona Hussein Ramadan. BS, Sebha University, Submitted to the Graduate Faculty of

COMPRESSIVE SAMPLING OF SPEECH SIGNALS. Mona Hussein Ramadan. BS, Sebha University, Submitted to the Graduate Faculty of COMPRESSIVE SAMPLING OF SPEECH SIGNALS by Mona Hussein Ramadan BS, Sebha University, 25 Submitted to the Graduate Faculty of Swanson School of Engineering in partial fulfillment of the requirements for

More information

6/29 Vol.7, No.2, February 2012

6/29 Vol.7, No.2, February 2012 Synthesis Filter/Decoder Structures in Speech Codecs Jerry D. Gibson, Electrical & Computer Engineering, UC Santa Barbara, CA, USA gibson@ece.ucsb.edu Abstract Using the Shannon backward channel result

More information

Voice Activity Detection

Voice Activity Detection Voice Activity Detection Speech Processing Tom Bäckström Aalto University October 2015 Introduction Voice activity detection (VAD) (or speech activity detection, or speech detection) refers to a class

More information

TRANSIENT NOISE REDUCTION BASED ON SPEECH RECONSTRUCTION

TRANSIENT NOISE REDUCTION BASED ON SPEECH RECONSTRUCTION TRANSIENT NOISE REDUCTION BASED ON SPEECH RECONSTRUCTION Jian Li 1,2, Shiwei Wang 1,2, Renhua Peng 1,2, Chengshi Zheng 1,2, Xiaodong Li 1,2 1. Communication Acoustics Laboratory, Institute of Acoustics,

More information

EEE482F: Problem Set 1

EEE482F: Problem Set 1 EEE482F: Problem Set 1 1. A digital source emits 1.0 and 0.0V levels with a probability of 0.2 each, and +3.0 and +4.0V levels with a probability of 0.3 each. Evaluate the average information of the source.

More information

Aspiration Noise during Phonation: Synthesis, Analysis, and Pitch-Scale Modification. Daryush Mehta

Aspiration Noise during Phonation: Synthesis, Analysis, and Pitch-Scale Modification. Daryush Mehta Aspiration Noise during Phonation: Synthesis, Analysis, and Pitch-Scale Modification Daryush Mehta SHBT 03 Research Advisor: Thomas F. Quatieri Speech and Hearing Biosciences and Technology 1 Summary Studied

More information

Speech Recognition using FIR Wiener Filter

Speech Recognition using FIR Wiener Filter Speech Recognition using FIR Wiener Filter Deepak 1, Vikas Mittal 2 1 Department of Electronics & Communication Engineering, Maharishi Markandeshwar University, Mullana (Ambala), INDIA 2 Department of

More information

Multiplexing Concepts and Introduction to BISDN. Professor Richard Harris

Multiplexing Concepts and Introduction to BISDN. Professor Richard Harris Multiplexing Concepts and Introduction to BISDN Professor Richard Harris Objectives Define what is meant by multiplexing and demultiplexing Identify the main types of multiplexing Space Division Time Division

More information

Pattern Recognition. Part 6: Bandwidth Extension. Gerhard Schmidt

Pattern Recognition. Part 6: Bandwidth Extension. Gerhard Schmidt Pattern Recognition Part 6: Gerhard Schmidt Christian-Albrechts-Universität zu Kiel Faculty of Engineering Institute of Electrical and Information Engineering Digital Signal Processing and System Theory

More information

Reading: Johnson Ch , Ch.5.5 (today); Liljencrants & Lindblom; Stevens (Tues) reminder: no class on Thursday.

Reading: Johnson Ch , Ch.5.5 (today); Liljencrants & Lindblom; Stevens (Tues) reminder: no class on Thursday. L105/205 Phonetics Scarborough Handout 7 10/18/05 Reading: Johnson Ch.2.3.3-2.3.6, Ch.5.5 (today); Liljencrants & Lindblom; Stevens (Tues) reminder: no class on Thursday Spectral Analysis 1. There are

More information

EE 225D LECTURE ON MEDIUM AND HIGH RATE CODING. University of California Berkeley

EE 225D LECTURE ON MEDIUM AND HIGH RATE CODING. University of California Berkeley University of California Berkeley College of Engineering Department of Electrical Engineering and Computer Sciences Professors : N.Morgan / B.Gold EE225D Spring,1999 Medium & High Rate Coding Lecture 26

More information

A NEW FEATURE VECTOR FOR HMM-BASED PACKET LOSS CONCEALMENT

A NEW FEATURE VECTOR FOR HMM-BASED PACKET LOSS CONCEALMENT A NEW FEATURE VECTOR FOR HMM-BASED PACKET LOSS CONCEALMENT L. Koenig (,2,3), R. André-Obrecht (), C. Mailhes (2) and S. Fabre (3) () University of Toulouse, IRIT/UPS, 8 Route de Narbonne, F-362 TOULOUSE

More information

INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET)

INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET) INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET) Proceedings of the 2 nd International Conference on Current Trends in Engineering and Management ICCTEM -214 ISSN

More information

Different Approaches of Spectral Subtraction Method for Speech Enhancement

Different Approaches of Spectral Subtraction Method for Speech Enhancement ISSN 2249 5460 Available online at www.internationalejournals.com International ejournals International Journal of Mathematical Sciences, Technology and Humanities 95 (2013 1056 1062 Different Approaches

More information

Singing Voice Detection. Applications of Music Processing. Singing Voice Detection. Singing Voice Detection. Singing Voice Detection

Singing Voice Detection. Applications of Music Processing. Singing Voice Detection. Singing Voice Detection. Singing Voice Detection Detection Lecture usic Processing Applications of usic Processing Christian Dittmar International Audio Laboratories Erlangen christian.dittmar@audiolabs-erlangen.de Important pre-requisite for: usic segmentation

More information

Introduction to cochlear implants Philipos C. Loizou Figure Captions

Introduction to cochlear implants Philipos C. Loizou Figure Captions http://www.utdallas.edu/~loizou/cimplants/tutorial/ Introduction to cochlear implants Philipos C. Loizou Figure Captions Figure 1. The top panel shows the time waveform of a 30-msec segment of the vowel

More information

Speech Signal Analysis

Speech Signal Analysis Speech Signal Analysis Hiroshi Shimodaira and Steve Renals Automatic Speech Recognition ASR Lectures 2&3 14,18 January 216 ASR Lectures 2&3 Speech Signal Analysis 1 Overview Speech Signal Analysis for

More information

Adaptive Feedback Cancellation With Band-Limited LPC Vocoder in Digital Hearing Aids

Adaptive Feedback Cancellation With Band-Limited LPC Vocoder in Digital Hearing Aids Downloaded from orbit.dtu.dk on: Dec 15, 2017 Adaptive Feedback Cancellation With Band-Limited LPC Vocoder in Digital Hearing Aids Guilin, Ma; Gran, Fredrik; Jacobsen, Finn; Agerkvist, Finn T. Published

More information

EE482: Digital Signal Processing Applications

EE482: Digital Signal Processing Applications Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu EE482: Digital Signal Processing Applications Spring 2014 TTh 14:30-15:45 CBC C222 Lecture 14 Quiz 04 Review 14/04/07 http://www.ee.unlv.edu/~b1morris/ee482/

More information

RECENTLY, there has been an increasing interest in noisy

RECENTLY, there has been an increasing interest in noisy IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 52, NO. 9, SEPTEMBER 2005 535 Warped Discrete Cosine Transform-Based Noisy Speech Enhancement Joon-Hyuk Chang, Member, IEEE Abstract In

More information

Lecture 4 Biosignal Processing. Digital Signal Processing and Analysis in Biomedical Systems

Lecture 4 Biosignal Processing. Digital Signal Processing and Analysis in Biomedical Systems Lecture 4 Biosignal Processing Digital Signal Processing and Analysis in Biomedical Systems Contents - Preprocessing as first step of signal analysis - Biosignal acquisition - ADC - Filtration (linear,

More information

I D I A P R E S E A R C H R E P O R T. June published in Interspeech 2008

I D I A P R E S E A R C H R E P O R T. June published in Interspeech 2008 R E S E A R C H R E P O R T I D I A P Spectral Noise Shaping: Improvements in Speech/Audio Codec Based on Linear Prediction in Spectral Domain Sriram Ganapathy a b Petr Motlicek a Hynek Hermansky a b Harinath

More information

Spectral estimation using higher-lag autocorrelation coefficients with applications to speech recognition

Spectral estimation using higher-lag autocorrelation coefficients with applications to speech recognition Spectral estimation using higher-lag autocorrelation coefficients with applications to speech recognition Author Shannon, Ben, Paliwal, Kuldip Published 25 Conference Title The 8th International Symposium

More information

ECE 556 BASICS OF DIGITAL SPEECH PROCESSING. Assıst.Prof.Dr. Selma ÖZAYDIN Spring Term-2017 Lecture 2

ECE 556 BASICS OF DIGITAL SPEECH PROCESSING. Assıst.Prof.Dr. Selma ÖZAYDIN Spring Term-2017 Lecture 2 ECE 556 BASICS OF DIGITAL SPEECH PROCESSING Assıst.Prof.Dr. Selma ÖZAYDIN Spring Term-2017 Lecture 2 Analog Sound to Digital Sound Characteristics of Sound Amplitude Wavelength (w) Frequency ( ) Timbre

More information

Digital Audio. Lecture-6

Digital Audio. Lecture-6 Digital Audio Lecture-6 Topics today Digitization of sound PCM Lossless predictive coding 2 Sound Sound is a pressure wave, taking continuous values Increase / decrease in pressure can be measured in amplitude,

More information

Robust Voice Activity Detection Based on Discrete Wavelet. Transform

Robust Voice Activity Detection Based on Discrete Wavelet. Transform Robust Voice Activity Detection Based on Discrete Wavelet Transform Kun-Ching Wang Department of Information Technology & Communication Shin Chien University kunching@mail.kh.usc.edu.tw Abstract This paper

More information

Implementation of attractive Speech Quality for Mixed Excited Linear Prediction

Implementation of attractive Speech Quality for Mixed Excited Linear Prediction IOSR Journal of Electrical and Electronics Engineering (IOSR-JEEE) e-issn: 2278-1676,p-ISSN: 2320-3331, Volume 9, Issue 2 Ver. I (Mar Apr. 2014), PP 07-12 Implementation of attractive Speech Quality for

More information

Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm

Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm International OPEN ACCESS Journal Of Modern Engineering Research (IJMER) Speech Enhancement Based On Spectral Subtraction For Speech Recognition System With Dpcm A.T. Rajamanickam, N.P.Subiramaniyam, A.Balamurugan*,

More information

Adaptive time scale modification of speech for graceful degrading voice quality in congested networks

Adaptive time scale modification of speech for graceful degrading voice quality in congested networks Adaptive time scale modification of speech for graceful degrading voice quality in congested networks Prof. H. Gokhan ILK Ankara University, Faculty of Engineering, Electrical&Electronics Eng. Dept 1 Contact

More information

Speech Coding in the Frequency Domain

Speech Coding in the Frequency Domain Speech Coding in the Frequency Domain Speech Processing Advanced Topics Tom Bäckström Aalto University October 215 Introduction The speech production model can be used to efficiently encode speech signals.

More information

WARPED FILTER DESIGN FOR THE BODY MODELING AND SOUND SYNTHESIS OF STRING INSTRUMENTS

WARPED FILTER DESIGN FOR THE BODY MODELING AND SOUND SYNTHESIS OF STRING INSTRUMENTS NORDIC ACOUSTICAL MEETING 12-14 JUNE 1996 HELSINKI WARPED FILTER DESIGN FOR THE BODY MODELING AND SOUND SYNTHESIS OF STRING INSTRUMENTS Helsinki University of Technology Laboratory of Acoustics and Audio

More information