Evaluation of MELP Quality and Principles

Marcus Ek, Lars Pääjärvi, Martin Sehlstedt

Luleå University of Technology in cooperation with Ericsson Erisoft AB, T/RV

30th May 2000


Abstract

This report presents an investigation and evaluation of a Mixed Excitation Linear Prediction (MELP) speech codec operating at 2400 bps. An introduction to speech coding methodologies with emphasis on LPC-based algorithms is presented. Modifications to the codec, Texas Instruments' MELP 2.4, have been made in order to test different parts of the algorithm and thereby find their impact on the quality of synthesised speech. Subjective listening tests show that the quantization of the Fourier magnitudes, gain, and the LPC coefficients reduces the quality of speech. The quantization of the Fourier magnitudes has the largest impact on the output speech. Adding more bandpass filters, to enhance the frequency resolution in the mixed excitation, does not improve the speech quality, while decreasing the number of filters makes the quality significantly worse.

Preface

This report is the result of a project conducted at the division T/RV at Ericsson Erisoft in Luleå during the spring of 2000. The project is a compulsory course at Luleå University of Technology for the Master of Science degree in Signal Processing, at the Department of Computer Science and Electrical Engineering. The aim was to evaluate the U.S. federal standard 2400 bps MELP speech coder and document key elements in the codec. The first part (chapters 2 and 3) is an introduction to speech coding and is suitable reading for people without any previous knowledge of the subject. The second part (chapters 4 and 5) describes the evaluation and investigation process, and the third part (chapters 6-8) presents results, methods and conclusions. Parts of the report are written with consideration for people who may want to follow up the work, and therefore flags, commands etc. have been included. We would like to thank our supervisors Nicklas Sandgren and Jonas Svedberg for providing help and for the time they have taken from their own work to assist us. We also thank Stefan Håkansson for letting us use facilities and technology at the division, and of course our supervisor Johan Carlson at the university.

Contents

1 Introduction
2 Fundamentals of Speech
  2.1 Speech Production
  2.2 Speech Perception
3 Speech Coding
  3.1 Speech Coders
    3.1.1 Waveform Coders
    3.1.2 Parametric Coders
    3.1.3 Hybrid Coders
  3.2 Algorithmic Methods
    3.2.1 Quantization
    3.2.2 Linear Prediction
    3.2.3 Line Spectrum Frequencies
    3.2.4 Pitch Prediction
  3.3 Performance
    3.3.1 Objective Evaluation
    3.3.2 Subjective Evaluation
4 The MELP Speech Codec
  4.1 LPC-model
    4.1.1 Linear prediction
    4.1.2 Encoding
    4.1.3 Decoding
  4.2 Bandpass Filters and Mixed Excitation
  4.3 Aperiodic Flag
  4.4 Adaptive Spectral Enhancement
  4.5 Pulse Dispersion Filter
  4.6 Fourier Magnitude
  4.7 The MELP Codec Flowchart
    4.7.1 The Encoder
    4.7.2 The Decoder
5 Investigation of the MELP Speech Codec
  5.1 Highpass Filter
  5.2 Adaptive Spectral Enhancement
  5.3 Pulse Dispersion Filter
  5.4 Bandpass Filters
  5.5 Synthesis Pulse Train
  5.6 Generation of MATLAB Data
  5.7 Quantization
    5.7.1 Bandpass Voicing Strength
    5.7.2 Gain
    5.7.3 LPC Coefficients
    5.7.4 Pitch
    5.7.5 Fourier Magnitudes
    5.7.6 Jitter
6 Test Methods
  6.1 Objective Testing
    6.1.1 Segmental SNR
  6.2 Subjective Testing
7 Tests and Results
  7.1 Quantization
  7.2 Bandpass Filters
  7.3 Synthesis Pulse Train
  7.4 Jitter
  7.5 Comparison of Performance
8 Conclusions and Comments
9 Further Investigation
A Appendix A
  A.1 Useful UNIX commands
    A.1.1 rlogin
    A.1.2 projlogin
    A.1.3 setenv
    A.1.4 echo <string>
  A.2 Useful commands in trstud
    A.2.1 DATPLAY
    A.2.2 DATPLAYBOY
    A.2.3 SCREAM
    A.2.4 make
    A.2.5 TRANSF
    A.2.6 CORR
    A.2.7 SNR
    A.2.8 SNRSEG
    A.2.9 PSQM
B Flowcharts
C Lowpass and Highpass Filters
D Original and Matlab Bandpass Filters
E Bandpass Filter Sets (Analysis-Synthesis)
F Speech Signal Samples

1 Introduction

In wireless communication systems it is desirable to compress the speech signal before transmission, that is, to reduce the amount of data that has to be sent over the channel. The phrase 'speech coding' most often refers to the technique that uses coding to compress a speech signal so that less bandwidth is needed for transmission, while the quality of the speech signal is preserved. This is an essential part of, for example, cellular systems, which aim to accommodate as many users as possible within a limited bandwidth. The coding and decoding of the speech signal is performed by a so-called "speech codec". It is important that the speech signal can be reconstructed with as high fidelity as possible at the receiver. One goal for this project was to document key elements in the band excitation mixing/decisions and propose improvements to the band mixing. A second goal was to evaluate the mixed excitation encoding principle by comparing the unquantized MELP to other standards. The first thing to do was to get acquainted with the field of study, that is, to learn about speech coding in general and the MELP codec in particular. This was done by reading several technical articles and reports. The article written by McCree and Barnwell III [1] was presented to the other project groups. The next stage of the project was to search the WWW for available source code. The American Department of Defense (DoD) fortunately had one version of the MELP codec on their homepage; otherwise the codec would have had to be implemented. The task at hand was then to examine the algorithm from the American DoD to identify key elements for later modification.

2 Fundamentals of Speech

2.1 Speech Production

Speech sounds are formed when air from the lungs passes through the vocal tract, see Fig. 1. The vocal tract can be modelled as a spectral shaping filter, and the frequency response depends on its size and shape as described by Abrahamsson in [8]. The shape can be altered using the tongue and the mouth.

Figure 1: The vocal tract [8].

Speech signals can be partitioned into two main groups, voiced and unvoiced speech. Voiced speech is produced when the vocal cords are closed. Air from the lungs builds up a pressure behind the cords, in the beginning of the vocal tract, see Fig. 1. When the pressure becomes high enough the cords are forced open. As a consequence the vocal cords begin to vibrate harmonically. The fundamental period of the vibrations is called the pitch period, and it depends on the tension and length of the vocal cords. The pitch is therefore different for male, female and child speakers. In general women and children have shorter cords than men and consequently have a higher pitch frequency (higher voice). The pitch frequency is typically between 50 and 400 Hz. The effect of the opening and closing of the vocal cords is that the air passing into the rest of the vocal tract has the characteristics of a quasi-periodic pulse train, see Fig. 2a. During unvoiced speech the vocal cords are open and the air passes undisturbed into the rest of the vocal tract. Because of this there is no periodicity in unvoiced speech segments and they are very much noise-like, see Fig. 3a. Unvoiced speech can therefore be approximated by white random noise. After the vocal cords the speech signal consists of either a series of pulses or random noise, depending on whether the speech is voiced or unvoiced. The signal is passed into the rest of the vocal tract, which imposes its own frequency response on the signal. This can be seen as the envelope of the power spectrum, see Fig. 2b and 3b. The peaks in the envelope of the spectrum are called formants and are located at the resonance frequencies of the vocal tract.

Figure 2: Voiced speech a) time domain, b) frequency domain.

Figure 3: Unvoiced speech a) time domain, b) frequency domain.

2.2 Speech Perception

There is a limit to the sensitivity of the ear. An audible signal can become inaudible if a stronger signal is present at a frequency near the original signal. This phenomenon is called masking and is often used in speech coding to avoid spending coding effort on signal components that the ear cannot hear anyway. Masking is often incorporated as a part of the quantization process. There will always be a difference between the input and output signal of the codec, since information in the speech signal is lost in the coding process. This difference is called the coding noise and is to a great extent due to quantization.

A larger coding noise power can be tolerated near the formant frequencies, where the masking is more efficient, and less noise power can be tolerated in spectral regions where there is less energy in the speech signal, e.g. between the formants. One therefore often tries to correlate the spectrum of the coding noise with the spectrum of the original speech signal, that is, to shape the coding noise spectrum so that it is increased in the formant regions and decreased otherwise. This operation is called noise shaping. There are also other features of the codec that make use of the limitations of the human ear. The main idea is that there is no reason to code and transfer components of the signal that the human ear will filter out anyway.
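The report gives no formula for noise shaping, but a common realisation in LPC-based coders (an illustration of the idea, not the method used in this report) is a perceptual weighting filter W(z) = A(z/g1)/A(z/g2) built from the LP polynomial A(z); the MATLAB sketch below assumes example coefficients.

  % Perceptual weighting sketch (illustrative, not from this report):
  % W(z) = A(z/g1)/A(z/g2) concentrates the coding noise near the formants.
  a  = [1 -1.2 0.8 -0.3];            % example A(z) coefficients (assumed)
  g1 = 0.9; g2 = 0.6;                % typical bandwidth expansion factors
  num = a .* g1.^(0:length(a)-1);    % coefficients of A(z/g1)
  den = a .* g2.^(0:length(a)-1);    % coefficients of A(z/g2)
  % weighted_error = filter(num, den, error_signal);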

3 Speech Coding

The purpose of speech coding is to reduce the bit rate of digital speech signals and to reconstruct the speech signal in the decoder with as high quality as possible. One way of reducing the bit rate is to remove the redundancy in the speech signal by employing the knowledge about speech production and perception mentioned earlier. This leads to models which only parameterise perceptually relevant information. A speech coder always consists of a COder and a DECoder (a so-called codec). The encoder takes the original speech signal and produces a bit stream. The bit stream passes some sort of channel on its way to the decoder. The decoder constructs an approximation of the original signal, often referred to as the reconstructed signal or the synthesised speech. Most speech coders use a bandwidth of 3.2 kHz (200 Hz - 3.4 kHz) and a sampling frequency of 8 kHz.

3.1 Speech Coders

There exist three main types of speech coders: waveform coders, parametric coders and hybrid coders. There is no fixed boundary between the first two types. Some algorithms use properties from both waveform and parametric coders and are called hybrid coders, which can be considered a third type. The three types of coders differ in bit rate and in the quality of the reconstructed speech. Waveform coders are mainly used at high bit rates (more than 16 kbit/s), parametric coders are used at low bit rates (less than 4 kbps), and hybrid coders (4-16 kbps) are used at intermediate bit rates, see Fig. 4.

Figure 4: Quality vs. Bit Rate.

3.1.1 Waveform Coders

Waveform coders are a class of coders that attempt to reproduce the input signal's waveform, that is, to make the reconstructed signal resemble the original signal. In these coders the objective is to minimize a criterion which measures the difference between the input and the output signal. A criterion often used is the mean square difference. Waveform coders provide very high quality speech at high bit rates but cannot be used to code speech at very low bit rates.

Figure 5: Vocoder Speech Production Model [8].

3.1.2 Parametric Coders

In parametric coding, the signal is represented by a set of model parameters. These parameters are quantized and transferred without consideration of the original signal. Periodic update of the model parameters requires fewer bits than a direct representation of the speech signal, and consequently parametric coders can operate at low bit rates. However, the resulting speech tends to sound synthetic. When coding speech signals one tries to base the model parameters on the physiological structure of the human speech-production system. This means building structures that try to imitate the vocal tract and the vocal cords [8]. An example of a parametric coder is the vocoder (VOice CODER). The speech production model used by the vocoder is shown in Fig. 5. If the speech is voiced the model excites a linear system by a series of periodic pulses. If the sound is unvoiced the model produces noise. Parametric coders attempt to produce a signal that sounds like the original speech, whether or not the time waveform resembles the original. The speech is analysed at the transmitter to determine the model parameters. This information is then quantized and transmitted to the receiver, where the speech is reconstructed.

3.1.3 Hybrid Coders

The idea of a hybrid coder is to combine these two techniques to achieve a high quality speech coder operating at intermediate bit rates. The most successful and commonly used hybrid coders are time-domain Analysis-by-Synthesis coders. As the name implies, the encoder analyses the input speech by synthesising many different approximations to it. These coders split the input speech signal into frames (typically 20 ms long) and for each frame, parameters for a synthesis filter are derived. The parameters are chosen by finding the excitation signal to this filter that minimizes a weighted error between the input speech and the reconstructed speech. The synthesis filter parameters and the excitation are transmitted to the decoder, which passes the excitation signal through the filter to give the reconstructed speech. The hybrid coding technique is used in the Codebook Excited Linear Prediction (CELP) coder. There are, for example, CELP coders operating at 4.75, 6.7 and 12.2 kbps.

3.2 Algorithmic Methods

3.2.1 Quantization

Quantization is the conversion of an analogue signal to a digital representation of the signal. Information is lost during quantization, since the actual amplitude values cannot be retained. The error introduced by the quantizer is called the quantization noise. Amplitude quantization is important in communication systems since it determines to a great extent the overall distortion and the bit rate needed. Several quantization techniques are commonly used, such as uniform quantization, logarithmic quantization, non-uniform quantization and vector quantization. In vector quantizers the data is quantized N samples at a time. Each block of samples is treated as an N-dimensional vector and is quantized to predetermined points in the N-dimensional space. The predetermined points are usually stored in a table called a codebook. Vector quantization is more sensitive to transmission errors and usually involves a greater computational complexity than scalar quantization, but gives better performance.

3.2.2 Linear Prediction

Linear prediction analysis of speech is historically one of the most important speech analysis techniques. In digital speech signals adjacent samples are often highly correlated. The purpose of linear prediction is to remove this redundant information from the speech signal and thereby reduce the number of bits needed to represent the signal. This is performed with a Linear Prediction (LP) filter that predicts the current speech sample from N earlier samples, where N is typically around ten. The linear prediction filter can be viewed as a model for the tongue and vocal tract in the human speech organs; it is then usually called the synthesis filter. The lungs and vocal cords are simulated with a periodic pulse train for voiced speech, while noise is used for unvoiced speech. The speech is fed through the inverted prediction filter and the result is called the residual. This is the optimal input to the synthesis filter to get the original speech signal as output. In Appendix F, Fig. 24 and 25 show the speech signal, residual, synthesised residual and synthesised speech for voiced and unvoiced speech.
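As a small illustration of the analysis just described (a sketch, not code from the report), the following MATLAB lines estimate a 10th-order LP filter for one speech frame s and compute the residual by inverse filtering:

  % LP analysis sketch; s is assumed to hold one frame of 8 kHz speech.
  p = 10;                    % prediction order
  a = lpc(s, p);             % coefficients of A(z) (autocorrelation method)
  res  = filter(a, 1, s);    % inverse filtering gives the residual
  shat = filter(1, a, res);  % feeding it back through 1/A(z) recovers the frame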

3.2.3 Line Spectrum Frequencies

Quantization of the LP coefficients is problematic, since small changes in the coefficient values may result in large changes in the spectrum and in unstable LP filters. Quantization is therefore usually performed on transformations of the LP coefficients. The most commonly used representation is the Line Spectrum Frequencies (LSF). To obtain the LSFs, two polynomials are defined:

F_1(z) = A(z) + z^{-(N+1)} A(z^{-1})   (1)

F_2(z) = A(z) - z^{-(N+1)} A(z^{-1})   (2)

where N is the number of LPC coefficients. When these two polynomials are added together they result in 2A(z). The roots of the polynomials are situated on the unit circle, and the LSF coefficients are obtained as the angle of each root. The LSF representation is a frequency-domain representation, and it has a number of properties, such as bounded range and sequential ordering, which make it desirable for quantization of LP coefficients.

3.2.4 Pitch Prediction

The speech signal is periodic during voiced speech segments due to the vibrations of the vocal cords. Short-term linear prediction is usually based on correlations over intervals shorter than 2 ms. The pitch period, however, has a typical range from 2 to 20 ms. Short-term linear prediction can therefore not account for the pitch. To be able to model the long-term periodicity of the speech signal a separate pitch prediction has to be made. It can be applied either to the input speech signal, before the linear prediction, or after the LP filter to the residual signal.
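A minimal MATLAB sketch of this transformation, following eqs (1) and (2) (the Signal Processing Toolbox function poly2lsf performs the same conversion):

  % LPC-to-LSF sketch; a holds the coefficients of A(z).
  a  = [1 -1.2 0.8 -0.3];              % example A(z) (assumed)
  F1 = [a 0] + [0 fliplr(a)];          % A(z) + z^-(N+1) A(z^-1)
  F2 = [a 0] - [0 fliplr(a)];          % A(z) - z^-(N+1) A(z^-1)
  ang = angle([roots(F1); roots(F2)]); % all roots lie on the unit circle
  lsf = sort(ang(ang > 1e-6 & ang < pi - 1e-6));  % N angles in (0, pi)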

3.3 Performance

To be able to evaluate and compare different speech coders, some sort of measurement must be adopted. First we need to know what properties speech coders are evaluated against, and then we need some method that can provide a value of how well the algorithm codes the speech. Speech coders are often developed with a particular application in mind, and the properties can therefore be optimised for that application. Speech coding algorithms are often evaluated based on the properties listed below.

- Bit rate. The bit rate specifies the number of bits needed to represent one second of speech. The primary goal of speech coding is to reduce the bit rate. Coders intended for use in the general telephone networks today have bit rates of 16 kbit/s or higher. The bit rate can be either fixed or variable. In variable bit rate speech coders, different numbers of bits are used depending on the characteristics of the speech signal. For example, a signal segment which contains silence does not require as many bits as a signal segment containing speech. Higher bit rates are suitable for environments where high speech quality is requested. Low bit rate vocoders are usually used by the military and in satellite links, where intelligibility and low power are more important than quality. The relationship high bit rate -> high quality is not necessarily true, but higher bit rate vocoders usually produce higher speech quality.

- Quality of reconstructed speech. A good objective criterion for speech quality does not exist for low bit rate codecs, so the decision has to be made subjectively. When testing whether the coder meets the requirements on speech quality, extensive subjective tests with human listeners have to be made. The quality of reconstructed speech can be measured with simple methods like SNR and SEGSNR, or with more complex methods that try to simulate the human ear. The most important measure is still listening tests performed on humans.

- Complexity of the algorithm. Like all computer-based algorithms, a high complexity normally introduces implementation errors and delays. The hardware and software tend to be expensive when time and memory demands are high, and the algorithms tend to be unstable.

- Robustness to channel errors and acoustic interference. If the coding algorithm has some sort of error protection, one can transfer speech over "bad" channels. The error protection introduces extra bits for every speech frame transferred. It is therefore important not to exaggerate the protection.

- Time delay introduced. The delay is of importance mainly for two-way communications. The delay should not affect the dynamics of the conversation. Codec delay can be divided into four parts: algorithmic delay, computational delay, multiplexing delay, and transmission delay. A rough estimate of the overall one-way delay introduced by the speech coder can be made based only on the frame size. The algorithmic delay is about one frame, and the computational delay is also about one frame. One can often assume that the multiplexing and transmission delay add up to one frame length if the channel is not shared. Consequently, without look-ahead, most speech coders give at least three frames of one-way delay. A frame is typically between 10 and 20 ms, which gives a total delay between 30 and 60 ms.

One cannot make any general ranking of the above statements. The area of use is very important and must be mapped to the statements in order to make a fair evaluation.

3.3.1 Objective Evaluation

Objective evaluation methods are often sensitive to both gain and delay variations. The methods are well defined and based on statistics and mathematics. It is very easy to compare two different solutions with a calculated quality measure. Common methods used in the evaluation of speech coders are SNR, SEGSNR, the articulation index, the log spectral distance and the Euclidean distance. The signal-to-noise ratio (SNR) is one of the most common measures for objective evaluation of the performance of compression algorithms. The SNR is calculated according to:

SNR = 10 \log_{10} \frac{\sum_{n=0}^{M-1} s^2(n)}{\sum_{n=0}^{M-1} (s(n) - \tilde{s}(n))^2}   (3)

The SNR tends to hide temporal reconstruction noise for low-level signals due to its long-term measuring. To avoid this, a segmental SNR can be used. The segmental SNR (SEGSNR) is constructed to expose weak signal performance. This evaluation method uses the common SNR function evaluated for each N-point segment of the speech. The SEGSNR is calculated according to:

SEGSNR = \frac{1}{L} \sum_{i=0}^{L-1} 10 \log_{10} \frac{\sum_{n=0}^{N-1} s^2(iN+n)}{\sum_{n=0}^{N-1} (s(iN+n) - \tilde{s}(iN+n))^2}   (4)

This ensures that SEGSNR penalizes coders whose performance varies over time. The disadvantage of this kind of objective quality measurement is the lack of knowledge about the quality of reproduced speech from a human perspective. The human ear does not have to agree with the objective analysis, and seldom does. For this purpose a second class of evaluation methods is defined: subjective evaluation.
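Equation (4) translates directly into a few lines of MATLAB; the sketch below is our illustration (not code from the report), with s the original signal, shat the reconstructed signal and N the segment length:

  function snrseg = segsnr(s, shat, N)
  % Segmental SNR sketch following eq. (4): average the per-segment SNR
  % over all complete N-point segments.
  L = floor(min(length(s), length(shat)) / N);
  vals = zeros(L, 1);
  for i = 0:L-1
      seg = s(i*N+1:(i+1)*N);
      err = seg - shat(i*N+1:(i+1)*N);
      vals(i+1) = 10*log10(sum(seg.^2) / (sum(err.^2) + eps));
  end
  snrseg = mean(vals);
  end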

3.3.2 Subjective Evaluation

Three subjective measures often used are the diagnostic rhyme test (DRT), the diagnostic acceptability measure (DAM), and the mean opinion score (MOS), all based on human rating. The DRT measures intelligibility, the DAM provides a characterisation of reconstructed speech in terms of a broad range of distortions, and the MOS attempts to combine all aspects of quality in a single number. The MOS is the most commonly used measure for subjective quality of reconstructed speech. For the purpose of making a quality test which accounts for the perceptual properties of the ear, subjective evaluation methods are used. All subjective tests are based on rankings from a number of test subjects and are therefore, in general, more true than any objective test result. The major drawback of subjective evaluation methods is that they take a lot of time to set up and to perform. The simplest form of listening test is the AB test, which evaluates the ranking of the tested segments. For a set of N segments, N(N-1) comparisons between two segments have to be made. Two segments (A and B) are played to a listener, who selects which segment is the better. The main point of the AB test is that the listener does not know the order of the A and B segments, since all tests are made in random order. As the test includes all combinations of two segments, the listener will vote on both the A-B and the B-A order of each combination. The drawback is that just a few segments cause a large number of combinations that have to be listened to; e.g. five segments require 20 (5 x 4) comparisons.
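As a sketch of the bookkeeping involved (illustrative only, not from the report), the following MATLAB lines generate the N(N-1) ordered pairs of an AB test in random presentation order:

  % Generate all ordered A-B pairs of N segments and shuffle them.
  N = 5;
  [a, b] = meshgrid(1:N, 1:N);
  pairs = [a(:) b(:)];
  pairs = pairs(pairs(:,1) ~= pairs(:,2), :);  % N*(N-1) = 20 ordered pairs
  pairs = pairs(randperm(size(pairs,1)), :);   % random presentation order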

4 The MELP Speech Codec

The MELP speech codec operates at 2.4 kbit/s and is therefore classified as a very low bit rate speech codec. It is an extension of the ordinary Linear Prediction Coder (LPC) model. Both these coders are fully parametric models of the human speech organs (lungs, vocal cords and vocal tract), which makes the low bit rate possible. The extensions to the LPC are used to compensate for some of the problems in the ordinary LPC. The two largest problems are the buzzy speech quality and the presence of short isolated tones in the synthesised speech. McCree and Barnwell III [1] give a very detailed description of how the MELP coder works. The following subsections are a short summary of the information in that report. It starts by describing the LPC model, and then the extensions used in the MELP codec are described one by one.

4.1 LPC-model

4.1.1 Linear prediction

The basis is the source-filter model, where the filter is constrained to be an all-pole linear filter. This amounts to performing a linear prediction of the next sample as a weighted sum of past samples:

\tilde{s}_n = \sum_{i=1}^{p} a_i s_{n-i}, \qquad H(z) = \frac{1}{1 - \sum_{i=1}^{p} a_i z^{-i}}   (5)

Given N samples of speech, we would like to compute estimates of a_i that result in the best fit. One reasonable way to define "best fit" is in terms of the mean squared error. These can also be regarded as the "most probable" parameters, if it is assumed that the distribution of errors is Gaussian and that a priori there are no restrictions on the values of a_i. The error e(n) is defined as:

e(n) = s(n) - \tilde{s}_n   (6)

The summed squared error, E, over a finite window of length N is:

E = \sum_{n=0}^{N-1} e(n)^2   (7)

The minimum of E occurs when its derivative with respect to each of the parameters a_i is zero. As can be seen from equation (7), E is quadratic in a_i and therefore there is a single solution. Very large positive or negative values of e(n) lead to poor prediction, and hence the solution to equation (7) is a minimum.

4.1.2 Encoding

When coding speech with LPC, the speech is first divided into frames containing between 20 and 30 ms of sampled speech data. The compromise in this case is that a smaller frame size would be preferred, because the chances are higher that the speech signal can be assumed to be stationary. On the other hand, a smaller frame size would result in a higher bit rate. The frames are then coded one by one, by first searching for the strongest correlation for frequencies between 50 and 200 Hz. This frequency is called the pitch. It is used to determine if the frame is voiced or unvoiced. If the frame is voiced the normalised correlation between x(k) and x(k - pitch) will be close to 1.0. If this is not the case the frame is unvoiced. Now the samples in the frame can be used to estimate the filter coefficients for the synthesis filter. Finally the energy level for the frame is calculated, to be able to reconstruct the speech with the correct signal level. The information on voiced/unvoiced, synthesis filter coefficients, pitch and energy is then transmitted to the decoder. Before transmission these values have to be quantized, and this imposes limits on the achievable speech quality. An example of the bit allocation for an LPC encoder can be seen in Table 1, from [6].

Table 1: Bit allocation for LPC-10.

Parameter      Voiced  Unvoiced
Pitch/Voicing  7       7
Gain           5       5
Sync           1       1
K(1)           5       5
K(2)           5       5
K(3)           5       5
K(4)           5       5
K(5)           4       -
K(6)           4       -
K(7)           4       -
K(8)           4       -
K(9)           3       -
K(10)          2       -
Total          54      33

4.1.3 Decoding

In the decoder the synthesised speech is built one pitch period at a time. This allows the parameters for the synthesis filter, the pitch period and the gain to be linearly interpolated through the frame. This is done to avoid the spikes that would arise when the filter parameters are changed. If the frame is unvoiced the decoder uses white noise as input to the synthesis filter; otherwise it uses a periodic pulse train with the same frequency as the pitch. The output from the filter is then amplified to get the same energy level as specified by the coder. The synthesised speech is now finished.

4.2 Bandpass Filters and Mixed Excitation

The synthesised speech from LPC has a strongly buzzy quality. The reason for this is that the residual is replaced with a pulse train that contains higher frequencies than the original residual; this causes the synthesised speech to contain overtones that were not present in the original speech. To avoid this, each frame of speech is first bandpass filtered into a number of frequency bands (4-10), and for each band a voicing strength is calculated. The voicing strength can be calculated in two ways. In the first, the correlation of the bandpass filtered signal is calculated around the pitch lag. At high frequencies this method is sensitive to variations in the pitch period. The second method calculates the correlation around the pitch lag for the envelope of the bandpass filtered signal. This makes it insensitive to pitch variations, since the peaks in the envelope are much wider. The voicing strength is selected as the larger of the two calculated values. The value can then be quantized to one bit using a threshold. This bit signals voiced or unvoiced speech. Compared to the original LPC, one extra bit is needed for each bandpass filter. In the decoder, the same number of bandpass filters are fed with a periodic pulse train or white noise, depending on whether they were voiced or unvoiced. The sum of the signals after the bandpass filters is then fed to the synthesis filter. In the examined MELP coder the analysis bandpass filters were built using 6th-order pole/zero Butterworth filters. In the decoder the bandpass filters were implemented as 32nd-order FIR filters. This makes it possible to use only two filters for the calculation of the synthesised residual: one for the pulse train and one for the noise. Each of these two filters is constructed as a weighted sum of the bandpass filters for the frequency bands, where the weighting factor is the voicing strength. The speech signal is also filtered with the inverse synthesis filter, thereby obtaining the residual. This is what would be the optimal pulse train for the decoder. The peakiness of the residual is then calculated, and depending on the value, the voicing strengths of some of the lowest frequency bands can be forced to 1.0.

4.3 Aperiodic Flag

The synthesised speech from an LPC can sometimes contain short isolated tones; this is especially true for female speakers. There are no known reasons for this behaviour, but the current theory is that the synthesis filter is almost unstable. To cure this, a third voice state, jittery voiced, is introduced in the encoder, and a flag called aperiodic pulses is used to transfer this information. If this flag is set, the decoder destroys the periodicity in the periodic pulse train by varying each pitch period with a uniformly distributed position jitter. This also relaxes the demands on the voiced versus unvoiced decision, since it is possible to call a frame voiced and aperiodic if it is somewhere between voiced and unvoiced. In the case of the MELP from the US DoD, the position had a rectangular distribution of +/- 25 % of the current pitch period.

4.4 Adaptive Spectral Enhancement

Two problems exist with the construction of the synthesis filter. First, the formant resonances are sometimes hard to create without making the filter unstable. Second, the formant bandwidths might vary within the frame, making it hard to create a matching filter. One solution would be to sharpen the filter, that is to move the poles closer to the unit circle, but this can make the filter unstable. A better solution is to insert an adaptive pole/zero filter and a simple FIR filter, where the poles are a scaled version of the poles in the synthesis filter. The zeros and the FIR filter are used to compensate for the lowpass effect of the pole filter. The combined effect of these filters in the time domain is the same as if the synthesis filter had been sharpened during the first half of the frame and weakened during the second half. This limits the effect of the almost unstable filter in the first half of the frame.

4.5 Pulse Dispersion Filter

The problem here is that the frequencies decay too fast between the formants, especially in the high frequency area. The solution is a pulse dispersion filter, created from the frequency response of a fixed triangle pulse but with its lowpass characteristics removed. To create the FIR filter coefficients, the discrete Fourier transform (DFT) is used to calculate the frequency response of the triangle pulse. The magnitude is then set to unity and the inverse DFT is used to get the final coefficients.

4.6 Fourier Magnitude

The residual is the optimal pulse train, and what one would really like to do is to transmit the residual to the decoder. This would take too many bits, so a compromise is to send the ten lowest Fourier magnitudes calculated on the residual. This makes it possible to use a pulse train with non-uniform magnitudes. To calculate these Fourier magnitudes, one pitch period is zero-padded to 512 samples and the FFT is calculated. The Fourier magnitudes are then quantized and transmitted to the decoder. In the decoder, when the pulse train is created, each pitch period is built from the inverse discrete Fourier transform of the interpolated Fourier magnitudes. Since only the magnitudes are transmitted, there is no phase information.
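The construction in section 4.5 can be sketched in a few MATLAB lines. This is an illustration under assumptions: a symmetric triangle pulse from triang is used as a stand-in, whereas the standard uses a specific fixed pulse shape for its 65-tap filter.

  % Pulse dispersion filter sketch: flatten the magnitude spectrum of a
  % triangle pulse while keeping its phase.
  L = 65;                               % filter length used by MELP
  t = triang(L);                        % stand-in triangle pulse (assumed shape)
  T = fft(t);                           % frequency response of the pulse
  h = real(ifft(T ./ (abs(T) + eps)));  % unity magnitude, phase preserved
  % dispersed = filter(h, 1, synthesised_speech);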

4.7 The MELP Codec Flowchart

Based on the report "MELP: The New Federal Standard at 2400 bps" [3], a flowchart (Fig. 7 and Fig. 8 in Appendix B) was drawn for the MELP vocoder. The main purpose of this flowchart is to give a graphical representation of the vocoder structure. Both the encoder and the decoder (often named analysis and synthesis) are covered in the text below.

4.7.1 The Encoder

In the encoder the natural speech is analysed to obtain a parametric model of it. To achieve a low bit rate, much of the original information must be eliminated. It is therefore very important that redundancy is removed and that the model parameters contain as much information per bit as possible. The analysis stage can be viewed as a number of phases, some of which must be performed in order while others need not.

1. Eliminate low frequency noise. The first step in the process is to filter out low frequency noise with a 4th-order Chebyshev highpass filter with a 60 Hz cutoff frequency and a stopband rejection of 30 dB, see Fig. 9 in Appendix C.

2. Pitch estimate. The output of the Chebyshev filter (referred to as the input speech) is filtered using a 6th-order Butterworth filter with cutoff frequency 1 kHz, see Fig. 10 in Appendix C. The result of this operation is used to perform an initial pitch search for a pitch estimate. The pitch estimate is based on the pitch lag between 40 and 160 samples that maximizes the normalized autocorrelation function.

3. Bandpass voicing analysis. The input speech is filtered by five 6th-order Butterworth bandpass filters with passbands 0-500, 500-1000, 1000-2000, 2000-3000 and 3000-4000 Hz, in order to split the signal into five frequency bands. The output from the lowest band (0-500 Hz) is used to make a fractional pitch analysis. A bandpass voicing strength analysis is made on the lowest band, based on the normalised correlation corresponding to the fractional pitch. If the bandpass voicing strength is less than 0.5, an aperiodic flag is set to one, otherwise to zero. A bandpass voicing strength analysis is then made for the higher frequency bands. The analysis is based on a normalised correlation corresponding to the previously calculated fractional pitch, the bandpass signal and the time envelope of the latter. In order to compensate for bias error, the time envelope correlation is decremented by 0.1 before any calculations are done.

4. Linear prediction analysis. A 10th-order linear prediction analysis (Levinson-Durbin recursion) is made on the input speech using a Hamming window (200 points, 25 ms) centered over the last sample of the current frame. The resulting linear prediction coefficients a_i are multiplied by the bandwidth expansion coefficient 0.994^i (15.3 Hz).

5. LPC residual signal. Using a linear prediction filter with the filter coefficients a_i, the input speech is filtered in order to obtain a residual signal.

6. Peakiness value. A peakiness value is calculated over 160 samples of the residual. If the peakiness exceeds 1.34, the lowest band voicing strength is forced to the value 1.0. If the value exceeds 1.6, the bandpass voicing strengths of the three lowest bands (recall: 0-500, 500-1000, 1000-2000 Hz) are forced to the value 1.0.

7. Final pitch estimate. A final pitch estimate is calculated using the residual signal. An integer pitch search is performed over lags 10 samples wider than the fractional pitch. A fractional pitch refinement is then made around the integer pitch lag. When the resulting normalised correlation is greater than or equal to 0.6, a pitch doubling procedure is performed.

8. Gain estimation. The next step is to estimate the gain. The input gain is measured twice per frame using a pitch-adaptive window length. The estimated gain is the RMS value of the input signal over the window.

9. Quantization of parameters. The LPC coefficients, pitch, gain and bandpass voicing are quantized. The Fourier magnitudes of the residual signal are also calculated and then quantized; a spectral peak-picking algorithm is used to find the harmonics in each frame. If a frame is unvoiced the bits are protected using Hamming codes, otherwise the bits are simply packed into the bitstream. The bit allocation can be seen in Table 2.

Table 2: Bit allocation for MELP.

Parameter          Voiced  Unvoiced
LSF                25      25
Fourier magnitudes 8       -
Gain               8       8
Pitch/Voicing      7       7
Bandpass voicing   4       -
Aperiodic flag     1       -
Error correction   -       13
Sync bit           1       1
Total              54      54
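The peakiness measure in step 6 is commonly defined as the ratio between the RMS value and the mean absolute value of the residual; a one-line MATLAB sketch under that assumption:

  % Peakiness sketch; r is assumed to hold 160 samples of the residual.
  peakiness = sqrt(mean(r.^2)) / (mean(abs(r)) + eps);
  % peakiness > 1.34 forces the lowest band voiced; > 1.6 the three lowest.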

4.7.2 The Decoder

The decoder is supposed to take the bitstream, unpack it and produce high quality synthesised speech. The pitch is decoded first, since it contains the mode (voiced or unvoiced) information.

1. Pitch decoding. The decoded pitch contains the information about the mode of the current frame. If the pitch code is all zeros, or has only one bit not equal to zero, the mode is forced to unvoiced and an error correction is made. If exactly two bits are not equal to zero in the pitch code, a frame erasure is indicated and a frame repeat function is invoked. Any other occurrence of bits equal to one indicates that the mode is voiced and the parameters can be decoded directly.

2. Noise and gain estimation. Next the decoder updates the noise estimator and an attenuation is applied. However, noise estimation and gain attenuation are disabled for repeated frames.

3. Interpolation of synthesis parameters. The LSFs, logarithmic speech gain, pitch, jitter, Fourier magnitudes, pulse and noise coefficients for the mixed excitation, and the spectral tilt[1] coefficient are interpolated linearly against their values for the previous frame. If there is a difference greater than 6 dB in gain, or if there is an offset with a high pitch frequency between two frames, the interpolation is not made.

4. Generate the mixed excitation. The excitation is generated as the sum of the filtered pulse and noise excitations. The noise is generated by a uniform random number generator and then normalised. The two excitations (noise and pulse) are then added together.

5. Adaptive spectral enhancement filter. The 10th-order pole-zero filter (with an additional 1st-order spectral tilt compensation) is applied to the mixed excitation. Its coefficients are calculated by bandwidth expansion of the interpolated LPC filter coefficients and adapted based on the SNR.

6. LPC and gain synthesis. The LPC filter coefficients are based on the interpolated LSF coefficients. A gain scaling factor, based on linearly interpolated values, is applied after the filter.

7. Pulse dispersion filter. A pulse dispersion filter is applied after the gain scaling. The filter is a 65th-order FIR filter derived from a spectrally flattened triangular pulse.

8. Buffering. Since the synthesiser produces a full pitch period of synthesised speech at a time, some buffering must finally be made.

[1] Used for the adaptive spectral enhancement filter.
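The mode logic of step 1 can be summarised in a short sketch (our illustration; pitch_bits is an assumed vector of 0/1 values holding the 7-bit pitch codeword):

  % Mode/erasure decision from the pitch codeword, following step 1.
  nz = sum(pitch_bits);
  if nz <= 1
      mode = 'unvoiced';    % all-zero or single-bit codes: force unvoiced
  elseif nz == 2
      mode = 'erasure';     % frame erasure: trigger the frame repeat
  else
      mode = 'voiced';      % otherwise decode the parameters directly
  end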

5 Investigation of the MELP Speech Codec

First the code was examined to find where and how different functions were implemented. Some initial tests were made by turning off different functions in the coder. This was also a way to get familiar with the working environment at Erisoft. It became clear that it was easy to make errors and that it was a very slow way to work, since the source code had to be changed and recompiled for each test. This is a major drawback if one wants to repeat the same tests. The solution was to use command line switches to turn off different functions. To be able to use command line switches, the C source code was changed to compile under C++. The original command line parsing could then be replaced by the Argum class[2] from BC-Lib[3] and flags in the code. After this change the different tests can be executed by running the coder with different command line switches. At the same time the possibility to generate only the coded bitstream was removed. The modified program always runs analysis followed by synthesis and requires both an input and an output file. These files are both in the RAW format. The following subsections each describe the changes made to the specific function and the reason for making them.

5.1 Highpass Filter

The MELP codec uses a 4th-order highpass filter, see Fig. 9 in Appendix C, to remove any dc component in the input signal. The filter is a Chebyshev type II filter with a cutoff frequency of 60 Hz and a stopband attenuation of 30 dB. In changing the code, the boolean variable uhp was introduced to make it possible to turn the highpass filter off. This is done by specifying the command line switch -nohp.

5.2 Adaptive Spectral Enhancement

The MELP codec specifies the use of adaptive spectral enhancement, which can be turned off by using the command line switch -noase. This sets the boolean variable uase to false.

5.3 Pulse Dispersion Filter

The MELP codec specifies the use of a pulse dispersion filter, which can be turned off by using the command line switch -nopdf. This sets the boolean variable updf to false.

5.4 Bandpass Filters

In the original MELP codec there are 5 bandpass filters. The cutoff frequencies are 0-1/8, 1/8-1/4, 1/4-1/2, 1/2-3/4 and 3/4-1 in normalised frequency.

[2] A class that handles command line parsing and command line switch extraction.
[3] Code library used at Erisoft for development of speech and channel coders. Built in C++ with extensions for floating-point matrices and vectors.

Figure 6: Original bandpass filters (frequency response of the encoder bandpass filters bp1-bp5).

The analysis filters are 6th-order Butterworth pole/zero filters and the synthesis filters are 32nd-order FIR filters. Fig. 6 shows the frequency response of the original bandpass filters. The first task was to design the same filters in MATLAB, to see if it was possible. For the analysis filters, the MATLAB function butter generated almost the same pole/zero locations as the original. The synthesis filters were a little more tricky. The description specifies a 32nd-order FIR filter windowed with a Hamming window. This did not work, but by trial and error it was found that the MATLAB function fir1, windowed with the square root of the MATLAB window function hamming, gave an almost perfect match. The MATLAB function filterset was constructed to create the necessary filter parameters for both analysis and synthesis from an input vector of cutoff frequencies. In Appendix D the result from MELPcomp.m can be viewed. The figures show the comparison between the original filters and the filters constructed with MATLAB. They are almost identical, so it was possible to create the necessary filters in MATLAB. To test how the bandpass filters affect the quality, a set of different bandpass filter banks was created with the MATLAB function firs. This MATLAB routine generates the C source files filter.c and filter.h that contain all the filter parameters for the different filter sets. The MELP source code was then changed to include the integer nbpf that specifies which set of filters should be used. This integer can be set by the command line switch -newbpf=n, where n is a number in the range [0...7]. If the switch is not present, nbpf is set to -1 and the original filters are used. The implemented filter sets are presented in Table 3.

Table 3: Implemented FIR filter sets.

Set  Cutoff frequencies (normalised)
0    1/2
1    1/4, 1/2
2    1/8, 1/4, 1/2
3    1/8, 1/4, 1/2, 3/4
4    1/16, 1/8, 1/4, 1/2
5    1/16, 1/8, 1/4, 1/2, 3/4
6    1/8, 1/4, 3/8, 1/2
7    1/8, 1/4, 3/8, 1/2, 3/4
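For one band, the recipe found above looks as follows in MATLAB (a sketch; the band edges shown are those of the 1/8-1/4 band):

  % Recreate one analysis/synthesis filter pair.
  wn = [1/8 1/4];                                   % normalised band edges
  [b, a] = butter(3, wn);                           % 6th-order pole/zero analysis filter
  h = fir1(32, wn, 'bandpass', sqrt(hamming(33)));  % 32nd-order FIR synthesis filter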

5.5 Synthesis Pulse Train

In the original MELP codec the pulse train is generated by the inverse DFT of the Fourier magnitudes. One idea was to make the perfect pulse train by using the original residual. After some tests, the integer tpt was introduced to make it possible to select between several different types of pulse trains. If the command line switch -fpt=n is not present on the command line, the original MELP pulse train is used. By including -fpt=0 on the command line, the program will use a simple form of pulse train that consists only of a 1 at the beginning of each pitch period. By including -fpt=1, the residual from the coder is used as the pulse train in the decoder. This type of codec does not use the mixed excitation.

5.6 Generation of MATLAB Data

The MELP coder was also modified to be able to generate MATLAB data for plots and analysis. The command line switches in Table 4 were defined.

Table 4: Command line switches for the MATLAB data generator.

Switch         Action
-P             Turn on all printing
-sp            Print speech
-rp            Print residual
-pp            Print pulse train
-ep            Print LPC excitation
-sp            Print synthesised speech
-nfp=n         Specify number of frames to print (default 45)
-mfile=<name>  Specify filename for MATLAB data

The MATLAB data is printed frame by frame to the data file in binary format. To read the data into MATLAB, the following type of program has to be used (the frame length is 180 samples):

  FRAME = 180;
  fid = fopen('<data file>','rb');
  clear X;
  [X, c] = fread(fid, [FRAME, inf], 'float');
  fclose(fid);
  col = c/FRAME;
  speech   = reshape(X(:,1:5:col), col/5*FRAME, 1);
  residual = reshape(X(:,2:5:col), col/5*FRAME, 1);
  pulse    = reshape(X(:,3:5:col), col/5*FRAME, 1);
  exc      = reshape(X(:,4:5:col), col/5*FRAME, 1);
  syn      = reshape(X(:,5:5:col), col/5*FRAME, 1);

MATLAB now contains the vector variables speech, residual, pulse, exc and syn. These can then be used as usual in MATLAB.

5.7 Quantization

Initial tests showed a large difference between the original speech and the synthesised speech, while the difference between two versions of the synthesised speech[4] was very small. One idea was that the quantization imposed a limit on the performance of the coder. To make an optimal MELP coder, all quantizations were therefore removed. The results from this optimal coder could then be used as a reference for deciding the importance of the other functions. The parameters that are quantized are: bandpass voicing strength, jitter, gain, LPC coefficients, pitch, and Fourier magnitudes. The following subsections describe the quantization of these parameters.

5.7.1 Bandpass Voicing Strength

In the original MELP codec the calculated bandpass voicing strength for each passband is quantized to one of two levels, 0 or 1.0, that is, voiced or unvoiced. To make it possible to disable this quantization, the boolean flag bq was introduced in the code. This flag is normally set to true, but passing the command line switch -nobq sets it to false, which disables the quantization of the bandpass voicing strength coefficients. Tests showed that the voicing strength could be calculated to values outside the range 0 to 1.0, and it was therefore not possible to directly calculate the noise strength. Since the total energy should be 1, it seemed to be a good guess that the noise strength should be calculated as

N_s = 1 - V_s   (8)

This is only possible if the voicing strength coefficient is limited to the range 0 to 1.0. The pulse train excitation filter then has to be calculated as the sum of the weighted bandpass filters,

pbpF(k) = \sum_{i=1}^{N} V_{s,i} \cdot BPF_i(k)   (9)

and the noise excitation filter is calculated as

nbpF(k) = \sum_{i=1}^{N} N_{s,i} \cdot BPF_i(k)   (10)

[4] i.e. two different quantizations turned off.
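In MATLAB, eqs (8)-(10) amount to a weighted sum of the filter coefficient vectors. The sketch below assumes BPF is an N-by-33 matrix with one 32nd-order FIR bandpass filter per row and Vs an N-vector of unquantized voicing strengths limited to 0-1:

  % Mixed excitation shaping filters from unquantized voicing strengths.
  Ns   = 1 - Vs;           % noise strengths, eq. (8)
  pbpF = Vs(:)' * BPF;     % pulse excitation filter coefficients, eq. (9)
  nbpF = Ns(:)' * BPF;     % noise excitation filter coefficients, eq. (10)
  % excitation = filter(pbpF, 1, pulse_train) + filter(nbpF, 1, noise);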


More information

Simulation of Conjugate Structure Algebraic Code Excited Linear Prediction Speech Coder

Simulation of Conjugate Structure Algebraic Code Excited Linear Prediction Speech Coder COMPUSOFT, An international journal of advanced computer technology, 3 (3), March-204 (Volume-III, Issue-III) ISSN:2320-0790 Simulation of Conjugate Structure Algebraic Code Excited Linear Prediction Speech

More information

EE 225D LECTURE ON MEDIUM AND HIGH RATE CODING. University of California Berkeley

EE 225D LECTURE ON MEDIUM AND HIGH RATE CODING. University of California Berkeley University of California Berkeley College of Engineering Department of Electrical Engineering and Computer Sciences Professors : N.Morgan / B.Gold EE225D Spring,1999 Medium & High Rate Coding Lecture 26

More information

Chapter 4 SPEECH ENHANCEMENT

Chapter 4 SPEECH ENHANCEMENT 44 Chapter 4 SPEECH ENHANCEMENT 4.1 INTRODUCTION: Enhancement is defined as improvement in the value or Quality of something. Speech enhancement is defined as the improvement in intelligibility and/or

More information

The quality of the transmission signal The characteristics of the transmission medium. Some type of transmission medium is required for transmission:

The quality of the transmission signal The characteristics of the transmission medium. Some type of transmission medium is required for transmission: Data Transmission The successful transmission of data depends upon two factors: The quality of the transmission signal The characteristics of the transmission medium Some type of transmission medium is

More information

Audio Signal Compression using DCT and LPC Techniques

Audio Signal Compression using DCT and LPC Techniques Audio Signal Compression using DCT and LPC Techniques P. Sandhya Rani#1, D.Nanaji#2, V.Ramesh#3,K.V.S. Kiran#4 #Student, Department of ECE, Lendi Institute Of Engineering And Technology, Vizianagaram,

More information

COMPRESSIVE SAMPLING OF SPEECH SIGNALS. Mona Hussein Ramadan. BS, Sebha University, Submitted to the Graduate Faculty of

COMPRESSIVE SAMPLING OF SPEECH SIGNALS. Mona Hussein Ramadan. BS, Sebha University, Submitted to the Graduate Faculty of COMPRESSIVE SAMPLING OF SPEECH SIGNALS by Mona Hussein Ramadan BS, Sebha University, 25 Submitted to the Graduate Faculty of Swanson School of Engineering in partial fulfillment of the requirements for

More information

Wideband Speech Coding & Its Application

Wideband Speech Coding & Its Application Wideband Speech Coding & Its Application Apeksha B. landge. M.E. [student] Aditya Engineering College Beed Prof. Amir Lodhi. Guide & HOD, Aditya Engineering College Beed ABSTRACT: Increasing the bandwidth

More information

Signal Processing for Speech Applications - Part 2-1. Signal Processing For Speech Applications - Part 2

Signal Processing for Speech Applications - Part 2-1. Signal Processing For Speech Applications - Part 2 Signal Processing for Speech Applications - Part 2-1 Signal Processing For Speech Applications - Part 2 May 14, 2013 Signal Processing for Speech Applications - Part 2-2 References Huang et al., Chapter

More information

Linguistic Phonetics. Spectral Analysis

Linguistic Phonetics. Spectral Analysis 24.963 Linguistic Phonetics Spectral Analysis 4 4 Frequency (Hz) 1 Reading for next week: Liljencrants & Lindblom 1972. Assignment: Lip-rounding assignment, due 1/15. 2 Spectral analysis techniques There

More information

Robust Linear Prediction Analysis for Low Bit-Rate Speech Coding

Robust Linear Prediction Analysis for Low Bit-Rate Speech Coding Robust Linear Prediction Analysis for Low Bit-Rate Speech Coding Nanda Prasetiyo Koestoer B. Eng (Hon) (1998) School of Microelectronic Engineering Faculty of Engineering and Information Technology Griffith

More information

APPLICATIONS OF DSP OBJECTIVES

APPLICATIONS OF DSP OBJECTIVES APPLICATIONS OF DSP OBJECTIVES This lecture will discuss the following: Introduce analog and digital waveform coding Introduce Pulse Coded Modulation Consider speech-coding principles Introduce the channel

More information

EC 2301 Digital communication Question bank

EC 2301 Digital communication Question bank EC 2301 Digital communication Question bank UNIT I Digital communication system 2 marks 1.Draw block diagram of digital communication system. Information source and input transducer formatter Source encoder

More information

Quantification of glottal and voiced speech harmonicsto-noise ratios using cepstral-based estimation

Quantification of glottal and voiced speech harmonicsto-noise ratios using cepstral-based estimation Quantification of glottal and voiced speech harmonicsto-noise ratios using cepstral-based estimation Peter J. Murphy and Olatunji O. Akande, Department of Electronic and Computer Engineering University

More information

University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005

University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005 University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005 Lecture 5 Slides Jan 26 th, 2005 Outline of Today s Lecture Announcements Filter-bank analysis

More information

Pitch Period of Speech Signals Preface, Determination and Transformation

Pitch Period of Speech Signals Preface, Determination and Transformation Pitch Period of Speech Signals Preface, Determination and Transformation Mohammad Hossein Saeidinezhad 1, Bahareh Karamsichani 2, Ehsan Movahedi 3 1 Islamic Azad university, Najafabad Branch, Saidinezhad@yahoo.com

More information

Signal Characteristics

Signal Characteristics Data Transmission The successful transmission of data depends upon two factors:» The quality of the transmission signal» The characteristics of the transmission medium Some type of transmission medium

More information

(i) Understanding of the characteristics of linear-phase finite impulse response (FIR) filters

(i) Understanding of the characteristics of linear-phase finite impulse response (FIR) filters FIR Filter Design Chapter Intended Learning Outcomes: (i) Understanding of the characteristics of linear-phase finite impulse response (FIR) filters (ii) Ability to design linear-phase FIR filters according

More information

NOISE SHAPING IN AN ITU-T G.711-INTEROPERABLE EMBEDDED CODEC

NOISE SHAPING IN AN ITU-T G.711-INTEROPERABLE EMBEDDED CODEC NOISE SHAPING IN AN ITU-T G.711-INTEROPERABLE EMBEDDED CODEC Jimmy Lapierre 1, Roch Lefebvre 1, Bruno Bessette 1, Vladimir Malenovsky 1, Redwan Salami 2 1 Université de Sherbrooke, Sherbrooke (Québec),

More information

Speech Enhancement using Wiener filtering

Speech Enhancement using Wiener filtering Speech Enhancement using Wiener filtering S. Chirtmay and M. Tahernezhadi Department of Electrical Engineering Northern Illinois University DeKalb, IL 60115 ABSTRACT The problem of reducing the disturbing

More information

Design of FIR Filters

Design of FIR Filters Design of FIR Filters Elena Punskaya www-sigproc.eng.cam.ac.uk/~op205 Some material adapted from courses by Prof. Simon Godsill, Dr. Arnaud Doucet, Dr. Malcolm Macleod and Prof. Peter Rayner 1 FIR as a

More information

(i) Understanding of the characteristics of linear-phase finite impulse response (FIR) filters

(i) Understanding of the characteristics of linear-phase finite impulse response (FIR) filters FIR Filter Design Chapter Intended Learning Outcomes: (i) Understanding of the characteristics of linear-phase finite impulse response (FIR) filters (ii) Ability to design linear-phase FIR filters according

More information

A Closed-loop Multimode Variable Bit Rate Characteristic Waveform Interpolation Coder

A Closed-loop Multimode Variable Bit Rate Characteristic Waveform Interpolation Coder A Closed-loop Multimode Variable Bit Rate Characteristic Waveform Interpolation Coder Jing Wang, Jingg Kuang, and Shenghui Zhao Research Center of Digital Communication Technology,Department of Electronic

More information

CSCD 433 Network Programming Fall Lecture 5 Physical Layer Continued

CSCD 433 Network Programming Fall Lecture 5 Physical Layer Continued CSCD 433 Network Programming Fall 2016 Lecture 5 Physical Layer Continued 1 Topics Definitions Analog Transmission of Digital Data Digital Transmission of Analog Data Multiplexing 2 Different Types of

More information

Auditory modelling for speech processing in the perceptual domain

Auditory modelling for speech processing in the perceptual domain ANZIAM J. 45 (E) ppc964 C980, 2004 C964 Auditory modelling for speech processing in the perceptual domain L. Lin E. Ambikairajah W. H. Holmes (Received 8 August 2003; revised 28 January 2004) Abstract

More information

Subtractive Synthesis & Formant Synthesis

Subtractive Synthesis & Formant Synthesis Subtractive Synthesis & Formant Synthesis Prof Eduardo R Miranda Varèse-Gastprofessor eduardo.miranda@btinternet.com Electronic Music Studio TU Berlin Institute of Communications Research http://www.kgw.tu-berlin.de/

More information

Structure of Speech. Physical acoustics Time-domain representation Frequency domain representation Sound shaping

Structure of Speech. Physical acoustics Time-domain representation Frequency domain representation Sound shaping Structure of Speech Physical acoustics Time-domain representation Frequency domain representation Sound shaping Speech acoustics Source-Filter Theory Speech Source characteristics Speech Filter characteristics

More information

International Journal of Modern Trends in Engineering and Research e-issn No.: , Date: 2-4 July, 2015

International Journal of Modern Trends in Engineering and Research   e-issn No.: , Date: 2-4 July, 2015 International Journal of Modern Trends in Engineering and Research www.ijmter.com e-issn No.:2349-9745, Date: 2-4 July, 2015 Analysis of Speech Signal Using Graphic User Interface Solly Joy 1, Savitha

More information

Acoustics, signals & systems for audiology. Week 4. Signals through Systems

Acoustics, signals & systems for audiology. Week 4. Signals through Systems Acoustics, signals & systems for audiology Week 4 Signals through Systems Crucial ideas Any signal can be constructed as a sum of sine waves In a linear time-invariant (LTI) system, the response to a sinusoid

More information

Perception of pitch. Importance of pitch: 2. mother hemp horse. scold. Definitions. Why is pitch important? AUDL4007: 11 Feb A. Faulkner.

Perception of pitch. Importance of pitch: 2. mother hemp horse. scold. Definitions. Why is pitch important? AUDL4007: 11 Feb A. Faulkner. Perception of pitch AUDL4007: 11 Feb 2010. A. Faulkner. See Moore, BCJ Introduction to the Psychology of Hearing, Chapter 5. Or Plack CJ The Sense of Hearing Lawrence Erlbaum, 2005 Chapter 7 1 Definitions

More information

DIGITAL FILTERS. !! Finite Impulse Response (FIR) !! Infinite Impulse Response (IIR) !! Background. !! Matlab functions AGC DSP AGC DSP

DIGITAL FILTERS. !! Finite Impulse Response (FIR) !! Infinite Impulse Response (IIR) !! Background. !! Matlab functions AGC DSP AGC DSP DIGITAL FILTERS!! Finite Impulse Response (FIR)!! Infinite Impulse Response (IIR)!! Background!! Matlab functions 1!! Only the magnitude approximation problem!! Four basic types of ideal filters with magnitude

More information

SILK Speech Codec. TDP 10/11 Xavier Anguera I Ciro Gracia

SILK Speech Codec. TDP 10/11 Xavier Anguera I Ciro Gracia SILK Speech Codec TDP 10/11 Xavier Anguera I Ciro Gracia SILK Codec Audio codec desenvolupat per Skype (Febrer 2009) Previament usaven el codec SVOPC (Sinusoidal Voice Over Packet Coder): LPC analysis.

More information

Universal Vocoder Using Variable Data Rate Vocoding

Universal Vocoder Using Variable Data Rate Vocoding Naval Research Laboratory Washington, DC 20375-5320 NRL/FR/5555--13-10,239 Universal Vocoder Using Variable Data Rate Vocoding David A. Heide Aaron E. Cohen Yvette T. Lee Thomas M. Moran Transmission Technology

More information

Time division multiplexing The block diagram for TDM is illustrated as shown in the figure

Time division multiplexing The block diagram for TDM is illustrated as shown in the figure CHAPTER 2 Syllabus: 1) Pulse amplitude modulation 2) TDM 3) Wave form coding techniques 4) PCM 5) Quantization noise and SNR 6) Robust quantization Pulse amplitude modulation In pulse amplitude modulation,

More information

Signal segmentation and waveform characterization. Biosignal processing, S Autumn 2012

Signal segmentation and waveform characterization. Biosignal processing, S Autumn 2012 Signal segmentation and waveform characterization Biosignal processing, 5173S Autumn 01 Short-time analysis of signals Signal statistics may vary in time: nonstationary how to compute signal characterizations?

More information

General outline of HF digital radiotelephone systems

General outline of HF digital radiotelephone systems Rec. ITU-R F.111-1 1 RECOMMENDATION ITU-R F.111-1* DIGITIZED SPEECH TRANSMISSIONS FOR SYSTEMS OPERATING BELOW ABOUT 30 MHz (Question ITU-R 164/9) Rec. ITU-R F.111-1 (1994-1995) The ITU Radiocommunication

More information

EE482: Digital Signal Processing Applications

EE482: Digital Signal Processing Applications Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu EE482: Digital Signal Processing Applications Spring 2014 TTh 14:30-15:45 CBC C222 Lecture 14 Quiz 04 Review 14/04/07 http://www.ee.unlv.edu/~b1morris/ee482/

More information

COMP 546, Winter 2017 lecture 20 - sound 2

COMP 546, Winter 2017 lecture 20 - sound 2 Today we will examine two types of sounds that are of great interest: music and speech. We will see how a frequency domain analysis is fundamental to both. Musical sounds Let s begin by briefly considering

More information

Reading: Johnson Ch , Ch.5.5 (today); Liljencrants & Lindblom; Stevens (Tues) reminder: no class on Thursday.

Reading: Johnson Ch , Ch.5.5 (today); Liljencrants & Lindblom; Stevens (Tues) reminder: no class on Thursday. L105/205 Phonetics Scarborough Handout 7 10/18/05 Reading: Johnson Ch.2.3.3-2.3.6, Ch.5.5 (today); Liljencrants & Lindblom; Stevens (Tues) reminder: no class on Thursday Spectral Analysis 1. There are

More information

Signal Processing Toolbox

Signal Processing Toolbox Signal Processing Toolbox Perform signal processing, analysis, and algorithm development Signal Processing Toolbox provides industry-standard algorithms for analog and digital signal processing (DSP).

More information

Data Communication. Chapter 3 Data Transmission

Data Communication. Chapter 3 Data Transmission Data Communication Chapter 3 Data Transmission ١ Terminology (1) Transmitter Receiver Medium Guided medium e.g. twisted pair, coaxial cable, optical fiber Unguided medium e.g. air, water, vacuum ٢ Terminology

More information

Terminology (1) Chapter 3. Terminology (3) Terminology (2) Transmitter Receiver Medium. Data Transmission. Direct link. Point-to-point.

Terminology (1) Chapter 3. Terminology (3) Terminology (2) Transmitter Receiver Medium. Data Transmission. Direct link. Point-to-point. Terminology (1) Chapter 3 Data Transmission Transmitter Receiver Medium Guided medium e.g. twisted pair, optical fiber Unguided medium e.g. air, water, vacuum Spring 2012 03-1 Spring 2012 03-2 Terminology

More information

SPEECH ENHANCEMENT USING A ROBUST KALMAN FILTER POST-PROCESSOR IN THE MODULATION DOMAIN. Yu Wang and Mike Brookes

SPEECH ENHANCEMENT USING A ROBUST KALMAN FILTER POST-PROCESSOR IN THE MODULATION DOMAIN. Yu Wang and Mike Brookes SPEECH ENHANCEMENT USING A ROBUST KALMAN FILTER POST-PROCESSOR IN THE MODULATION DOMAIN Yu Wang and Mike Brookes Department of Electrical and Electronic Engineering, Exhibition Road, Imperial College London,

More information

Accurate Delay Measurement of Coded Speech Signals with Subsample Resolution

Accurate Delay Measurement of Coded Speech Signals with Subsample Resolution PAGE 433 Accurate Delay Measurement of Coded Speech Signals with Subsample Resolution Wenliang Lu, D. Sen, and Shuai Wang School of Electrical Engineering & Telecommunications University of New South Wales,

More information

Speech Signal Analysis

Speech Signal Analysis Speech Signal Analysis Hiroshi Shimodaira and Steve Renals Automatic Speech Recognition ASR Lectures 2&3 14,18 January 216 ASR Lectures 2&3 Speech Signal Analysis 1 Overview Speech Signal Analysis for

More information

Pattern Recognition. Part 6: Bandwidth Extension. Gerhard Schmidt

Pattern Recognition. Part 6: Bandwidth Extension. Gerhard Schmidt Pattern Recognition Part 6: Gerhard Schmidt Christian-Albrechts-Universität zu Kiel Faculty of Engineering Institute of Electrical and Information Engineering Digital Signal Processing and System Theory

More information

Synthesis Algorithms and Validation

Synthesis Algorithms and Validation Chapter 5 Synthesis Algorithms and Validation An essential step in the study of pathological voices is re-synthesis; clear and immediate evidence of the success and accuracy of modeling efforts is provided

More information

Lecture Fundamentals of Data and signals

Lecture Fundamentals of Data and signals IT-5301-3 Data Communications and Computer Networks Lecture 05-07 Fundamentals of Data and signals Lecture 05 - Roadmap Analog and Digital Data Analog Signals, Digital Signals Periodic and Aperiodic Signals

More information

Chapter 2: Signal Representation

Chapter 2: Signal Representation Chapter 2: Signal Representation Aveek Dutta Assistant Professor Department of Electrical and Computer Engineering University at Albany Spring 2018 Images and equations adopted from: Digital Communications

More information

Experiment 2 Effects of Filtering

Experiment 2 Effects of Filtering Experiment 2 Effects of Filtering INTRODUCTION This experiment demonstrates the relationship between the time and frequency domains. A basic rule of thumb is that the wider the bandwidth allowed for the

More information

Laboratory Assignment 4. Fourier Sound Synthesis

Laboratory Assignment 4. Fourier Sound Synthesis Laboratory Assignment 4 Fourier Sound Synthesis PURPOSE This lab investigates how to use a computer to evaluate the Fourier series for periodic signals and to synthesize audio signals from Fourier series

More information

AUDL GS08/GAV1 Auditory Perception. Envelope and temporal fine structure (TFS)

AUDL GS08/GAV1 Auditory Perception. Envelope and temporal fine structure (TFS) AUDL GS08/GAV1 Auditory Perception Envelope and temporal fine structure (TFS) Envelope and TFS arise from a method of decomposing waveforms The classic decomposition of waveforms Spectral analysis... Decomposes

More information

Telecommunication Electronics

Telecommunication Electronics Politecnico di Torino ICT School Telecommunication Electronics C5 - Special A/D converters» Logarithmic conversion» Approximation, A and µ laws» Differential converters» Oversampling, noise shaping Logarithmic

More information

Effects of Reverberation on Pitch, Onset/Offset, and Binaural Cues

Effects of Reverberation on Pitch, Onset/Offset, and Binaural Cues Effects of Reverberation on Pitch, Onset/Offset, and Binaural Cues DeLiang Wang Perception & Neurodynamics Lab The Ohio State University Outline of presentation Introduction Human performance Reverberation

More information

Adaptive Filters Linear Prediction

Adaptive Filters Linear Prediction Adaptive Filters Gerhard Schmidt Christian-Albrechts-Universität zu Kiel Faculty of Engineering Institute of Electrical and Information Engineering Digital Signal Processing and System Theory Slide 1 Contents

More information

Analog and Telecommunication Electronics

Analog and Telecommunication Electronics Politecnico di Torino - ICT School Analog and Telecommunication Electronics D5 - Special A/D converters» Differential converters» Oversampling, noise shaping» Logarithmic conversion» Approximation, A and

More information

Chapter 3. Data Transmission

Chapter 3. Data Transmission Chapter 3 Data Transmission Reading Materials Data and Computer Communications, William Stallings Terminology (1) Transmitter Receiver Medium Guided medium (e.g. twisted pair, optical fiber) Unguided medium

More information

Outline / Wireless Networks and Applications Lecture 3: Physical Layer Signals, Modulation, Multiplexing. Cartoon View 1 A Wave of Energy

Outline / Wireless Networks and Applications Lecture 3: Physical Layer Signals, Modulation, Multiplexing. Cartoon View 1 A Wave of Energy Outline 18-452/18-750 Wireless Networks and Applications Lecture 3: Physical Layer Signals, Modulation, Multiplexing Peter Steenkiste Carnegie Mellon University Spring Semester 2017 http://www.cs.cmu.edu/~prs/wirelesss17/

More information

Fundamentals of Digital Communication

Fundamentals of Digital Communication Fundamentals of Digital Communication Network Infrastructures A.A. 2017/18 Digital communication system Analog Digital Input Signal Analog/ Digital Low Pass Filter Sampler Quantizer Source Encoder Channel

More information

EE 264 DSP Project Report

EE 264 DSP Project Report Stanford University Winter Quarter 2015 Vincent Deo EE 264 DSP Project Report Audio Compressor and De-Esser Design and Implementation on the DSP Shield Introduction Gain Manipulation - Compressors - Gates

More information

Chapter 3 Data Transmission COSC 3213 Summer 2003

Chapter 3 Data Transmission COSC 3213 Summer 2003 Chapter 3 Data Transmission COSC 3213 Summer 2003 Courtesy of Prof. Amir Asif Definitions 1. Recall that the lowest layer in OSI is the physical layer. The physical layer deals with the transfer of raw

More information

Digital Signal Processing

Digital Signal Processing Digital Signal Processing Fourth Edition John G. Proakis Department of Electrical and Computer Engineering Northeastern University Boston, Massachusetts Dimitris G. Manolakis MIT Lincoln Laboratory Lexington,

More information

Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter

Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter Reduction of Musical Residual Noise Using Harmonic- Adapted-Median Filter Ching-Ta Lu, Kun-Fu Tseng 2, Chih-Tsung Chen 2 Department of Information Communication, Asia University, Taichung, Taiwan, ROC

More information

Nonuniform multi level crossing for signal reconstruction

Nonuniform multi level crossing for signal reconstruction 6 Nonuniform multi level crossing for signal reconstruction 6.1 Introduction In recent years, there has been considerable interest in level crossing algorithms for sampling continuous time signals. Driven

More information

Terminology (1) Chapter 3. Terminology (3) Terminology (2) Transmitter Receiver Medium. Data Transmission. Simplex. Direct link.

Terminology (1) Chapter 3. Terminology (3) Terminology (2) Transmitter Receiver Medium. Data Transmission. Simplex. Direct link. Chapter 3 Data Transmission Terminology (1) Transmitter Receiver Medium Guided medium e.g. twisted pair, optical fiber Unguided medium e.g. air, water, vacuum Corneliu Zaharia 2 Corneliu Zaharia Terminology

More information

System analysis and signal processing

System analysis and signal processing System analysis and signal processing with emphasis on the use of MATLAB PHILIP DENBIGH University of Sussex ADDISON-WESLEY Harlow, England Reading, Massachusetts Menlow Park, California New York Don Mills,

More information

FFT 1 /n octave analysis wavelet

FFT 1 /n octave analysis wavelet 06/16 For most acoustic examinations, a simple sound level analysis is insufficient, as not only the overall sound pressure level, but also the frequency-dependent distribution of the level has a significant

More information