Speech Coding in the Frequency Domain
1 Speech Coding in the Frequency Domain. Speech Processing: Advanced Topics. Tom Bäckström, Aalto University, October 2015
2 Introduction The speech production model can be used to encode speech signals efficiently. Real-life signals, however, often contain sounds other than single-source speech: background noise in real-life environments, multiple speakers, music and mixed speech/music content in entertainment broadcasts, and singing voices. We therefore need a generic coding mode for non-speech signals. Audio codecs are based on frequency-domain coding, which makes it a good choice for generic-mode coding.
3 Introduction TCX An early approach (in AMR-WB+) to combining CELP with frequency-domain coding was called transform coded excitation (TCX). Model the spectral envelope with linear prediction as in CELP. Take the discrete Fourier transform of the predictor residual and quantize it. From a statistical point of view, this is a valid approach: the excitation is an uncorrelated signal, whereby we can transform it directly and do not need overlap between frames in the same sense as classical audio codecs. The original residual and the transform-domain signal carry the same amount of information: critical sampling and perfect reconstruction.
4 Introduction TCX The main issue with the original approach is that applying a proper objective function is difficult. Recall that the objective function is of the form $\|H(x - \hat{x})\|^2$. If we now apply a transform $F$ on $x$, such that $y = Fx$, the objective function becomes $\|H F^{-1}(y - \hat{y})\|^2$. The matrix $H F^{-1}$ is non-trivial, whereby all samples of $y$ are coupled with each other. Direct quantization is therefore inefficient (it does not give the best SNR), and optimal quantization would require an exhaustive search.
5 Introduction Modern TCX variants To solve this problem, modern codecs (USAC and EVS) use MDCT as a time-frequency transform. MDCT is a lapped transform, that is, it is based on overlap-add, but has critical sampling. The envelope model is still used to model the shape of the spectrum. The spectral coefficients are encoded with entropy coding. A perceptual model can be applied in the frequency domain, whereby we do not get the problems with the objective function.
6 Introduction Outlook The rest of this lecture is a brief introduction to frequency-domain coding. Our objective is to obtain a method with the following properties: Transitions between windows are perceptually smooth. Critical sampling (the frequency-domain representation has as many samples as the input signal). A physically well-motivated transform, which allows efficient processing and gives low leakage between frequency components.
7 Introduction Time-frequency domain A time-frequency representation represents the signal as frequency bands which evolve over time.
8 Windowing and overlap-add A basic component of most speech and audio methods is segmentation of the signal into windows. Processing in fixed-length blocks allows implementation of computationally efficient methods. When the signal can be assumed stationary within a segment, it can be modelled with a single stationary model, whereby the output quality of simple methods is high. Audio processing methods generally use overlapping windows to obtain a fade-in/fade-out functionality. Non-overlapping methods would suffer from discontinuities, which are perceptually annoying. Overlapping windows which go smoothly to zero make sure that the signal is continuous also after processing, which is perceptually good. Overlapping windowing is a perceptual tool!
9 Windowing and overlap-add [Figure: (a) overlapping half-sine windows and (c) overlapping Kaiser-Bessel-derived windows for frames x_k and x_{k+1}; magnitude versus time.]
10 Windowing and overlap-add For example, half-sine windows are defined as $\omega_k = \sin\left(\frac{(k + 0.5)\pi}{2N}\right)$ for $0 \le k < 2N$, and $\omega_k = 0$ otherwise. A window of the input signal $\sigma_k$ is then $\xi_k = \omega_k \sigma_k$. Subsequent windows are obtained by shifting $\omega_k$ in time.
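The window definition above can be tried out with a few lines of Python. This is only a minimal sketch: the function names, the test signal `sigma` and the hop of N samples between windows are assumptions made for illustration, not part of the lecture.

```python
import math

def half_sine_window(N):
    """Half-sine window of length 2N: w[k] = sin((k + 0.5) * pi / (2N))."""
    return [math.sin((k + 0.5) * math.pi / (2 * N)) for k in range(2 * N)]

def apply_window(window, signal, start):
    """Windowed segment xi[k] = w[k] * sigma[start + k]."""
    return [w * signal[start + k] for k, w in enumerate(window)]

N = 4
w = half_sine_window(N)
sigma = [float(n) for n in range(16)]   # hypothetical input signal
xi_0 = apply_window(w, sigma, 0)        # first window
xi_1 = apply_window(w, sigma, N)        # next window, shifted by N samples (assumed hop)

# The squared window and its copy shifted by N samples add up to one.
power_sum = [w[k] ** 2 + w[k + N] ** 2 for k in range(N)]
```

The last line anticipates why this window is convenient: shifted by N samples, the squared windows are complementary.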
11 Windowing and overlap-add [Figure: magnitude versus time, and magnitude (dB) versus frequency bin.]
12 Windowing and overlap-add Reconstruction After processing, the windowing function is applied once more, giving $\omega_k \hat{\xi}_k$ for the processed signal $\hat{\xi}_k$. This makes sure that the signal goes to zero at the borders also after processing. The windows of the signal are then added together. If $\sigma_{L,k}$ and $\sigma_{R,k}$ are the left part of the current window and the right part of the previous window, then for perfect reconstruction we should have $\sigma_k = \sigma_{L,k} + \sigma_{R,k} = \omega_{L,k}\xi_{L,k} + \omega_{R,k}\xi_{R,k} = \omega_{L,k}^2\sigma_k + \omega_{R,k}^2\sigma_k = (\omega_{L,k}^2 + \omega_{R,k}^2)\sigma_k$. If $\omega_{L,k}^2 + \omega_{R,k}^2 = 1$, then the reconstructed signal is equal to the original signal.
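The reconstruction argument can be checked numerically. The sketch below assumes a half-sine window, a hop of N samples and no processing between the analysis and synthesis windows; the test signal and all names are hypothetical.

```python
import math

def half_sine_window(N):
    """Half-sine window of length 2N."""
    return [math.sin((k + 0.5) * math.pi / (2 * N)) for k in range(2 * N)]

def overlap_add(signal, N):
    """Window each frame twice (analysis and synthesis) and add the frames together."""
    w = half_sine_window(N)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * N + 1, N):
        frame = [w[k] * signal[start + k] for k in range(2 * N)]   # analysis window
        frame = [w[k] * frame[k] for k in range(2 * N)]            # synthesis window
        for k in range(2 * N):
            out[start + k] += frame[k]
    return out

N = 8
sigma = [math.sin(0.3 * n) + 0.1 * n for n in range(6 * N)]   # hypothetical test signal
rec = overlap_add(sigma, N)

# The first and last N samples are covered by one window only, so compare the interior.
max_err = max(abs(rec[n] - sigma[n]) for n in range(N, len(sigma) - N))
```

In the interior every sample is covered by exactly two windows whose squares add to one, so `max_err` stays at the level of floating-point rounding.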
13 Windowing and overlap-add [Figure: (b) overlapping squared half-sine windows and (d) overlapping squared Kaiser-Bessel-derived windows for frames x_k and x_{k+1}; squared magnitude versus time.]
14 Windowing and overlap-add [Figure: (a) the signal x; (b) and (c) its right and left windowed parts, W_R^2 x and W_L^2 x.]
15 Projections and time-domain aliasing cancellation If we have full overlap between windows, then for every window we get N new samples of data but every window contains 2N samples. The objective of coding is to reduce redundancy, but we have just doubled the amount of data! Overlapping windowing does not provide critical sampling. We need some method to return to critical sampling. MDCT is based on a projection known as time-domain aliasing cancellation (TDAC), which gives a representation with critical sampling.
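16 Projections and time-domain aliasing cancellation The projection used in MDCT is based on splitting the signal into a symmetric and an antisymmetric part. Let $x_{R,k} = P_R x_k = \frac{1}{\sqrt{2}}\begin{bmatrix} J & I \end{bmatrix} x_k$ and $x_{L,k} = P_L x_k = \frac{1}{\sqrt{2}}\begin{bmatrix} I & -J \end{bmatrix} x_k$, where $I$ is the identity matrix and $J$ is the reversal (exchange) matrix. Then $P_L^T x_{L,k} + P_R^T x_{R,k} = (P_L^T P_L + P_R^T P_R)x_k = x_k$, since $P_L^T P_L + P_R^T P_R = I$. Moreover, $P_R^T P_R x_k$ is symmetric and $P_L^T P_L x_k$ is antisymmetric.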
17 Projections and time-domain aliasing cancellation [Figure: (a) the signal x, (b) its symmetric part P_R^H P_R x, (c) its antisymmetric part P_L^H P_L x.]
18 Projections and time-domain aliasing cancellation If x k is of length N, then the projected signals x R,k = P R x k and x L,k = P L x k are of length N/2. Each part holds exactly half the signal. Critical sampling! We can use these projections to remove redundancy at the overlap.
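The projections can be illustrated without building any matrices. The sketch below assumes the normalisation $P_L = [I, -J]/\sqrt{2}$ and $P_R = [J, I]/\sqrt{2}$, with $J$ reversing the order of samples; this is one choice that satisfies $P_L^T P_L + P_R^T P_R = I$, and the function names are hypothetical.

```python
import math

S = 1.0 / math.sqrt(2.0)   # assumed normalisation of the projections

def project(x):
    """Return (x_L, x_R) = (P_L x, P_R x) for an even-length x; each has length N/2."""
    h = len(x) // 2
    a, b = x[:h], x[h:]
    x_L = [S * (a[i] - b[h - 1 - i]) for i in range(h)]   # [I, -J] x / sqrt(2)
    x_R = [S * (a[h - 1 - i] + b[i]) for i in range(h)]   # [J,  I] x / sqrt(2)
    return x_L, x_R

def expand_L(x_L):
    """P_L^T x_L: an antisymmetric signal of length N."""
    return [S * v for v in x_L] + [-S * v for v in reversed(x_L)]

def expand_R(x_R):
    """P_R^T x_R: a symmetric signal of length N."""
    return [S * v for v in reversed(x_R)] + [S * v for v in x_R]

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # hypothetical segment of length N = 6
x_L, x_R = project(x)
part_L, part_R = expand_L(x_L), expand_R(x_R)
reconstructed = [l + r for l, r in zip(part_L, part_R)]
```

Each projected half has N/2 samples, and adding the antisymmetric and symmetric parts returns the original segment.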
19 Combination of projection and windowing Perfect reconstruction works as long as the Princen-Bradley condition $P_L^T P_L + P_R^T P_R = I$ holds. If we combine the projection into symmetric and antisymmetric parts with windowing as $\hat{P}_L = P_L W_L$ and $\hat{P}_R = P_R W_R$, where $W_L$ and $W_R$ are the windowing functions, then Princen-Bradley still holds. This is a special case: windowing cannot be combined with all projections, but for the symmetric and antisymmetric parts it does work. We can use $\hat{P}_L$ and $\hat{P}_R$ as a projection into two parts, which gives smooth transitions between windows and critical sampling.
20 Projections and time-domain aliasing cancellation [Figure: (a) the signal x, (b) W_R^H P_R^H P_R W_R x, (c) W_L^H P_L^H P_L W_L x.]
21 Time-frequency transforms We have above obtained a critically sampled representation $x$ of the signal such that transitions between windows are smooth. Next we want to transform the representation to the frequency domain, $y = Dx$, with a transform $D$. For a real-valued spectral representation we could use the ordinary discrete cosine transform of type II, that is, DCT-II. A benefit of real-valued transforms is that they are simpler to process than complex-valued ones. The signal is then represented as a weighted sum of basis functions $y_k$. The reconstruction of an individual basis function is $x_k = \begin{bmatrix} P_L \\ P_R \end{bmatrix}^T D^{-1} y_k$.
22 Time-frequency transforms DCT-II [Figure: (a) basis functions of DCT-II and (b) their reconstructions.]
23 Time-frequency transforms DCT-II The reconstructed basis functions have strange shapes. Some have discontinuities and some have odd corners. Clearly the combination of DCT-II and TDAC does not work too well. The extensions/reconstructions of the basis functions do not correspond to physical frequency elements. Let's try DCT-IV instead!
24 Time-frequency transforms DCT-IV [Figure: basis functions of DCT-IV and their (windowed) reconstructions, panels (a) to (c).]
25 Time-frequency transforms MDCT The extended/reconstructed basis functions have the following properties: The left and right parts are symmetric and antisymmetric. Each extended basis function corresponds perfectly to a sinusoid. The extension is actually a discrete cosine transform, which we will call the modified discrete cosine transform (MDCT): $X_k = \sum_{n=0}^{2N-1} x_n \cos\left[\frac{\pi}{N}\left(n + \frac{1}{2} + \frac{N}{2}\right)\left(k + \frac{1}{2}\right)\right]$. Note! The MDCT takes $2N$ samples as input and gives $N$ frequencies as output, but since each window has $N$ samples in common with the previous window, for every $N$ new samples we get $N$ frequencies: critical sampling.
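The MDCT formula can be turned into a direct (slow, O(N^2)) implementation to see critical sampling and aliasing cancellation at work. This is a sketch under stated assumptions: a half-sine window is applied before the MDCT and after the inverse, the inverse uses the scaling 2/N that matches this windowing, and the test signal and all names are hypothetical. Real codecs use fast algorithms instead.

```python
import math

def sine_window(N):
    """Half-sine window of length 2N."""
    return [math.sin((n + 0.5) * math.pi / (2 * N)) for n in range(2 * N)]

def mdct(frame):
    """2N (windowed) input samples -> N coefficients."""
    N = len(frame) // 2
    n0 = 0.5 + N / 2
    return [sum(frame[n] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]

def imdct(X):
    """N coefficients -> 2N time-aliased samples (scaling 2/N assumed for the windowed case)."""
    N = len(X)
    n0 = 0.5 + N / 2
    return [2.0 / N * sum(X[k] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                          for k in range(N))
            for n in range(2 * N)]

def analysis_synthesis(signal, N):
    """Window, MDCT, inverse MDCT, window again, overlap-add with hop N."""
    w = sine_window(N)
    out = [0.0] * len(signal)
    n_coeffs = 0
    for start in range(0, len(signal) - 2 * N + 1, N):
        X = mdct([w[n] * signal[start + n] for n in range(2 * N)])
        n_coeffs += len(X)                       # N coefficients per N new samples
        y = imdct(X)
        for n in range(2 * N):
            out[start + n] += w[n] * y[n]        # aliasing cancels in the overlap
    return out, n_coeffs

N = 4
signal = [math.sin(0.7 * n) + 0.05 * n for n in range(6 * N)]   # hypothetical input
reconstructed, n_coeffs = analysis_synthesis(signal, N)
max_err = max(abs(reconstructed[n] - signal[n]) for n in range(N, len(signal) - N))
```

Away from the first and last N samples, which are covered by a single window, the overlap-added output matches the input up to rounding even though each frame only produced N coefficients.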
26 Time-frequency transforms MDCT [Figure: magnitude versus time, and magnitude (dB) versus frequency bin.]
27 Time-frequency transforms MDCT The MDCT thus has all the desired properties: smooth transitions, critical sampling and well-defined frequency components. The MDCT is the most commonly used time-frequency transform in audio coding. It is used in AAC, USAC, EVS etc. Prof. Edler was a central developer of MDCT. The only notable drawback with MDCT is that it is a real-valued transform. If a signal perfectly aligns with a basis function in one frame, then it will be perfectly orthogonal to the basis function in the next frame, so the corresponding coefficient is zero. Amplitudes of MDCT components therefore have much larger variance than the original signal. Physical interpretation of amplitudes is difficult/inefficient.
28 Time-frequency transforms We now have a frequency representation of a window of the signal. The next step is to quantize and code the signal. To obtain perceptually uniform quantization noise, we scale the spectral components $X_k$ with the perceptual envelope $W_k$. We thus quantize $X_k / W_k$. The quantized signal is then multiplied with $W_k$ to return to the original domain.
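As a rough illustration of the weighting step, the sketch below quantizes $X_k / W_k$ with a uniform quantizer and scales the result back with $W_k$. The step size, the example values and the function name are assumptions for illustration only.

```python
def quantize_weighted(X, W, step=1.0):
    """Quantize X_k / W_k with a uniform quantizer, then return to the original domain."""
    q = [round(x / w / step) for x, w in zip(X, W)]   # integer levels to be entropy coded
    X_hat = [qi * step * w for qi, w in zip(q, W)]    # multiply with W_k again
    return q, X_hat

X = [10.3, -4.2, 0.7, 0.03]   # hypothetical spectral components X_k
W = [4.0, 2.0, 1.0, 0.1]      # hypothetical perceptual envelope W_k
q, X_hat = quantize_weighted(X, W)
errors = [abs(x - xh) for x, xh in zip(X, X_hat)]
```

The quantization error of each component is bounded by half a step times $W_k$, so the noise follows the shape of the perceptual envelope.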
29 Time-frequency transforms MDCT [Figure: three panels (a) to (c) of magnitude (dB) versus frequency bin, showing X_k, W_k, X_k/W_k, X'_k/W_k and X'_k.]
30 Basics of entropy coding We now have a quantized spectrum and our objective is to transmit that spectrum with the lowest number of bits. This is known as lossless coding: compression such that the original (quantized) spectrum can be exactly recovered. Each quantization level can be interpreted as a symbol, so we have an alphabet of symbols. We need unique identifiers for each symbol in terms of bit-strings.
31 Basics of entropy coding Consider a three-symbol alphabet with symbols a, b and c. We can assign them unique two-bit binary strings, for example 00, 01 and 11. The symbols can then be transmitted with 2 bits/symbol. However, this is already inefficient because in theory, for three symbols we would need only $\log_2 3 \approx 1.585$ bits/symbol on average. When constrained to fixed-length strings of bits we cannot do better.
32 Basics of entropy coding If we know the occurrence probabilities of each symbol beforehand, then we can do better. Consider the following probabilities and bit-strings:
Symbol  Probability  Code  Bits
a       0.5          0     1
b       0.25         10    2
c       0.25         11    2
Clearly each symbol has a unique bit-string. From a bit-string 10110, we can easily decode bca. The average bit-rate is $0.5 \cdot 1 + 0.25 \cdot 2 + 0.25 \cdot 2 = 1.5$, that is, on average we use 1.5 bits/symbol. This is known as Huffman coding and it works optimally when the probabilities are powers of 0.5.
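A variable-length code of this kind can be tried out directly. The sketch assumes the probabilities 0.5, 0.25, 0.25 and the code words 0, 10, 11 for a, b, c; the table and function names are illustrative assumptions, and the code table is written out by hand rather than derived with the Huffman algorithm.

```python
import math

CODE = {"a": "0", "b": "10", "c": "11"}    # assumed prefix-free code words
PROB = {"a": 0.5, "b": 0.25, "c": 0.25}    # assumed symbol probabilities

def encode(symbols):
    """Concatenate the code words of the symbols."""
    return "".join(CODE[s] for s in symbols)

def decode(bits):
    """Read bits until they match a code word; works because no code word is a prefix of another."""
    inverse = {word: symbol for symbol, word in CODE.items()}
    out, word = [], ""
    for bit in bits:
        word += bit
        if word in inverse:
            out.append(inverse[word])
            word = ""
    if word:
        raise ValueError("bit-string ends in the middle of a code word")
    return "".join(out)

average_bits = sum(PROB[s] * len(CODE[s]) for s in CODE)   # expected bits per symbol
fixed_length_bound = math.log2(3)                          # ideal rate for 3 equiprobable symbols
```

With these tables the average rate is below both the 2 bits/symbol of the fixed-length code and the $\log_2 3$ bound for equiprobable symbols, because the symbols are not equiprobable.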
33 Arithmetic coding In the general case, when probabilities are arbitrary numbers, we can use arithmetic coding. Consider an alphabet of five symbols a, b, c, d and e with corresponding probabilities (table columns: Symbol, Probability, Interval). Here each symbol is mapped to a unique interval in [0, 1].
34 Arithmetic coding [Figure: the interval [0, 1] divided into sub-intervals for the symbols a, b, c, d and e.]
35 Arithmetic coding Suppose we then want to encode a sequence of symbols, for example adc. The first symbol, a, lands in the interval 0...0.41, which we call the remaining range. The intervals for the second symbol are then mapped into the remaining range, and the second symbol, d, selects a narrower interval within it. The intervals for the third symbol are in turn mapped into this new remaining range, and the last symbol, c, selects the final interval. Transmitting the sequence of symbols adc is then equivalent to transmitting a binary code which uniquely identifies this final interval.
36 Arithmetic coding [Figure: the intervals for symbols a to e, mapped three times into successively narrower remaining ranges.]
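The narrowing of the remaining range can be sketched in a few lines. Only the interval 0...0.41 of symbol a comes from the text; the other probabilities below are hypothetical values chosen for illustration, as are the function names. Exact fractions are used to avoid rounding questions.

```python
from fractions import Fraction

# Hypothetical probabilities; only p(a) = 0.41 is taken from the text.
PROBS = [("a", Fraction(41, 100)), ("b", Fraction(22, 100)), ("c", Fraction(15, 100)),
         ("d", Fraction(12, 100)), ("e", Fraction(10, 100))]

def symbol_intervals(probs):
    """Map each symbol to its interval [low, high) inside [0, 1]."""
    intervals, low = {}, Fraction(0)
    for symbol, p in probs:
        intervals[symbol] = (low, low + p)
        low += p
    return intervals

def narrow(sequence, probs=PROBS):
    """Return the remaining range after mapping each symbol's interval into the current range."""
    intervals = symbol_intervals(probs)
    low, high = Fraction(0), Fraction(1)
    for symbol in sequence:
        width = high - low
        s_low, s_high = intervals[symbol]
        low, high = low + width * s_low, low + width * s_high
    return low, high

first = narrow("a")      # remaining range after the first symbol
final = narrow("adc")    # final interval for the whole sequence
```

The width of the final interval equals the product of the probabilities of the encoded symbols, which is why probable sequences end up with wide intervals and short codes.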
37 Arithmetic coding The average bit-consumption is then $b = -\sum_{s \in \text{Symbols}} p(s) \log_2 p(s)$. In the above example the bit-consumption is then 2.12 bits/symbol.
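The average bit-consumption is the entropy of the symbol distribution and is easy to evaluate. In the sketch below the five probabilities are hypothetical (only 0.41 for symbol a is taken from the lecture), so the result need not coincide with the figure quoted for the lecture's own table.

```python
import math

def entropy_bits(probabilities):
    """Average bit-consumption b = -sum p(s) log2 p(s), in bits per symbol."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

hypothetical_probs = [0.41, 0.22, 0.15, 0.12, 0.10]   # assumed values for symbols a to e
b = entropy_bits(hypothetical_probs)
b_uniform = entropy_bits([0.2] * 5)                   # five equiprobable symbols
```

A skewed distribution always needs fewer bits per symbol than the uniform one over the same alphabet, which is what entropy coding exploits.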
38 Arithmetic coding To encode an interval we use a binary representation of decimal numbers. Let 0 correspond to the interval 0...0.5 and 1 to 0.5...1. Each further bit then halves the current interval, with 0 selecting the lower half and 1 the upper half; for example, the string 10 corresponds to the interval 0.5...0.75 and 11 to 0.75...1, etc. When we have a bit-string whose range is inside the desired range, then we are finished.
39 Arithmetic coding Encoding of the range : Final bit-string is 111. The decoder then decodes the intervals from the bit-string and maps them to symbols.
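One possible way to find a bit-string for a given range is sketched below: increase the number of bits until some interval of width 2^-n fits inside the range. This is an illustrative search, not necessarily the procedure used in the lecture or in any codec, and the example range comes from hypothetical symbol probabilities.

```python
import math
from fractions import Fraction

def interval_of(bits):
    """Interval [v / 2^n, (v + 1) / 2^n] identified by a bit-string of length n."""
    n, v = len(bits), int(bits, 2)
    return Fraction(v, 2 ** n), Fraction(v + 1, 2 ** n)

def bits_for_interval(low, high):
    """Shortest bit-string whose interval lies inside [low, high] (requires low < high <= 1)."""
    n = 0
    while True:
        n += 1
        scale = 2 ** n
        v = math.ceil(low * scale)             # first grid point at or above low
        if Fraction(v + 1, scale) <= high:     # the whole binary interval fits
            return format(v, "0{}b".format(n))

# Hypothetical final range of a three-symbol sequence.
low, high = Fraction(350796, 1000000), Fraction(358176, 1000000)
code = bits_for_interval(low, high)
code_low, code_high = interval_of(code)
```

The decoder can apply `interval_of` to the received bits and then look up which symbol intervals contain the result, one symbol at a time.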
40 Arithmetic coding Arithmetic coding is thus a form of entropy coding which takes an alphabet (quantization levels) and the occurrence probabilities of its symbols, and encodes a sequence of symbols with the optimally low number of bits. In a practical system, we do not want to use a fixed alphabet-probability combination, but model the signal. We can then either use preceding coefficients (known as the context) to predict the probability model of the current coefficient (USAC and EVS), or model the probabilities with, say, a Laplacian distribution and deduce the variance of samples from the spectral envelope shape (EVS).
41 Integration with CELP Above we have presented principles of frequency-domain coding for speech codecs. For integration into a practical codec we need methods for switching between time- and frequency-domain coding. The windowing paradigm based on MDCT fits poorly with the filter-based windows of CELP. We must use engineering solutions (also known as hacks) to switch between the windowing concepts. The characteristic distortions of time- and frequency-domain codecs are very different. Switching in the middle of a phoneme can easily become audible, because the character of the artifacts changes, even if the absolute perceptual quality remains constant. Switching should be allowed only at phoneme borders (which requires advanced signal analysis).
42 Summary of frequency domain coding Frequency-domain coding is effective for stationary signals such as music, background noises, and stationary segments of speech. Transform coded excitation (TCX) is a family of frequency-domain coding methods which use linear prediction as a model of the spectral envelope. Modern frequency-domain codecs are based on MDCT, which provides smooth transitions between windows, critical sampling and a physically well-defined transform. The spectrum is weighted with a perceptual model to limit the perceptual effect of quantization noise. Frequency components are encoded with an entropy codec to reduce the bit-rate.
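The Laplacian alternative can be sketched as follows: given a variance for a coefficient (in a codec it would be deduced from the spectral envelope), compute the probability of each integer quantization level and the ideal number of bits an arithmetic coder would spend on it. The zero-mean Laplacian, the unit quantization step and all names are assumptions for illustration; this is not the scheme of any particular codec.

```python
import math

def laplace_cdf(x, b):
    """Cumulative distribution of a zero-mean Laplacian with scale b."""
    return 0.5 * math.exp(x / b) if x < 0 else 1.0 - 0.5 * math.exp(-x / b)

def level_probability(q, variance):
    """Probability that a sample falls into the quantization cell [q - 0.5, q + 0.5]."""
    b = math.sqrt(variance / 2.0)     # Laplacian: variance = 2 b^2
    return laplace_cdf(q + 0.5, b) - laplace_cdf(q - 0.5, b)

def ideal_bits(q, variance):
    """Ideal code length -log2 P(q) for level q under the assumed model."""
    return -math.log2(level_probability(q, variance))

variance = 4.0                        # hypothetical variance for one coefficient
bits_zero = ideal_bits(0, variance)
bits_three = ideal_bits(3, variance)
total = sum(level_probability(q, variance) for q in range(-200, 201))
```

Levels near zero are probable and therefore cheap, while large levels cost more bits; a larger variance flattens this difference.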
More informationReview of Lecture 2. Data and Signals - Theoretical Concepts. Review of Lecture 2. Review of Lecture 2. Review of Lecture 2. Review of Lecture 2
Data and Signals - Theoretical Concepts! What are the major functions of the network access layer? Reference: Chapter 3 - Stallings Chapter 3 - Forouzan Study Guide 3 1 2! What are the major functions
More informationChapter 2: Signal Representation
Chapter 2: Signal Representation Aveek Dutta Assistant Professor Department of Electrical and Computer Engineering University at Albany Spring 2018 Images and equations adopted from: Digital Communications
More informationMultimedia Systems Entropy Coding Mahdi Amiri February 2011 Sharif University of Technology
Course Presentation Multimedia Systems Entropy Coding Mahdi Amiri February 2011 Sharif University of Technology Data Compression Motivation Data storage and transmission cost money Use fewest number of
More informationSGN Audio and Speech Processing
Introduction 1 Course goals Introduction 2 SGN 14006 Audio and Speech Processing Lectures, Fall 2014 Anssi Klapuri Tampere University of Technology! Learn basics of audio signal processing Basic operations
More informationSolutions to Information Theory Exercise Problems 5 8
Solutions to Information Theory Exercise roblems 5 8 Exercise 5 a) n error-correcting 7/4) Hamming code combines four data bits b 3, b 5, b 6, b 7 with three error-correcting bits: b 1 = b 3 b 5 b 7, b
More informationNon-Uniform Speech/Audio Coding Exploiting Predictability of Temporal Evolution of Spectral Envelopes
Non-Uniform Speech/Audio Coding Exploiting Predictability of Temporal Evolution of Spectral Envelopes Petr Motlicek 12, Hynek Hermansky 123, Sriram Ganapathy 13, and Harinath Garudadri 4 1 IDIAP Research
More informationA Brief Introduction to Information Theory and Lossless Coding
A Brief Introduction to Information Theory and Lossless Coding 1 INTRODUCTION This document is intended as a guide to students studying 4C8 who have had no prior exposure to information theory. All of
More informationCSCD 433 Network Programming Fall Lecture 5 Physical Layer Continued
CSCD 433 Network Programming Fall 2016 Lecture 5 Physical Layer Continued 1 Topics Definitions Analog Transmission of Digital Data Digital Transmission of Analog Data Multiplexing 2 Different Types of
More informationEE390 Final Exam Fall Term 2002 Friday, December 13, 2002
Name Page 1 of 11 EE390 Final Exam Fall Term 2002 Friday, December 13, 2002 Notes 1. This is a 2 hour exam, starting at 9:00 am and ending at 11:00 am. The exam is worth a total of 50 marks, broken down
More informationSpeech Compression Using Wavelet Transform
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 19, Issue 3, Ver. VI (May - June 2017), PP 33-41 www.iosrjournals.org Speech Compression Using Wavelet Transform
More informationCT111 Introduction to Communication Systems Lecture 9: Digital Communications
CT111 Introduction to Communication Systems Lecture 9: Digital Communications Yash M. Vasavada Associate Professor, DA-IICT, Gandhinagar 31st January 2018 Yash M. Vasavada (DA-IICT) CT111: Intro to Comm.
More informationMODIFIED DCT BASED SPEECH ENHANCEMENT IN VEHICULAR ENVIRONMENTS
MODIFIED DCT BASED SPEECH ENHANCEMENT IN VEHICULAR ENVIRONMENTS 1 S.PRASANNA VENKATESH, 2 NITIN NARAYAN, 3 K.SAILESH BHARATHWAAJ, 4 M.P.ACTLIN JEEVA, 5 P.VIJAYALAKSHMI 1,2,3,4,5 SSN College of Engineering,
More informationDCSP-3: Minimal Length Coding. Jianfeng Feng
DCSP-3: Minimal Length Coding Jianfeng Feng Department of Computer Science Warwick Univ., UK Jianfeng.feng@warwick.ac.uk http://www.dcs.warwick.ac.uk/~feng/dcsp.html Automatic Image Caption (better than
More informationModule 8: Video Coding Basics Lecture 40: Need for video coding, Elements of information theory, Lossless coding. The Lecture Contains:
The Lecture Contains: The Need for Video Coding Elements of a Video Coding System Elements of Information Theory Symbol Encoding Run-Length Encoding Entropy Encoding file:///d /...Ganesh%20Rana)/MY%20COURSE_Ganesh%20Rana/Prof.%20Sumana%20Gupta/FINAL%20DVSP/lecture%2040/40_1.htm[12/31/2015
More informationOpen Access Improved Frame Error Concealment Algorithm Based on Transform- Domain Mobile Audio Codec
Send Orders for Reprints to reprints@benthamscience.ae The Open Electrical & Electronic Engineering Journal, 2014, 8, 527-535 527 Open Access Improved Frame Error Concealment Algorithm Based on Transform-
More information6.02 Practice Problems: Modulation & Demodulation
1 of 12 6.02 Practice Problems: Modulation & Demodulation Problem 1. Here's our "standard" modulation-demodulation system diagram: at the transmitter, signal x[n] is modulated by signal mod[n] and the
More informationLecture5: Lossless Compression Techniques
Fixed to fixed mapping: we encoded source symbols of fixed length into fixed length code sequences Fixed to variable mapping: we encoded source symbols of fixed length into variable length code sequences
More informationEND-OF-YEAR EXAMINATIONS ELEC321 Communication Systems (D2) Tuesday, 22 November 2005, 9:20 a.m. Three hours plus 10 minutes reading time.
END-OF-YEAR EXAMINATIONS 2005 Unit: Day and Time: Time Allowed: ELEC321 Communication Systems (D2) Tuesday, 22 November 2005, 9:20 a.m. Three hours plus 10 minutes reading time. Total Number of Questions:
More informationLecture 3 Review of Signals and Systems: Part 2. EE4900/EE6720 Digital Communications
EE4900/EE6720: Digital Communications 1 Lecture 3 Review of Signals and Systems: Part 2 Block Diagrams of Communication System Digital Communication System 2 Informatio n (sound, video, text, data, ) Transducer
More informationUniversity of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005
University of Washington Department of Electrical Engineering Computer Speech Processing EE516 Winter 2005 Lecture 5 Slides Jan 26 th, 2005 Outline of Today s Lecture Announcements Filter-bank analysis
More informationPulse Code Modulation (PCM)
Project Title: e-laboratories for Physics and Engineering Education Tempus Project: contract # 517102-TEMPUS-1-2011-1-SE-TEMPUS-JPCR 1. Experiment Category: Electrical Engineering >> Communications 2.
More informationFPGA implementation of DWT for Audio Watermarking Application
FPGA implementation of DWT for Audio Watermarking Application Naveen.S.Hampannavar 1, Sajeevan Joseph 2, C.B.Bidhul 3, Arunachalam V 4 1, 2, 3 M.Tech VLSI Students, 4 Assistant Professor Selection Grade
More informationUnderstanding Digital Signal Processing
Understanding Digital Signal Processing Richard G. Lyons PRENTICE HALL PTR PRENTICE HALL Professional Technical Reference Upper Saddle River, New Jersey 07458 www.photr,com Contents Preface xi 1 DISCRETE
More information8.3 Basic Parameters for Audio
8.3 Basic Parameters for Audio Analysis Physical audio signal: simple one-dimensional amplitude = loudness frequency = pitch Psycho-acoustic features: complex A real-life tone arises from a complex superposition
More informationCMPT 318: Lecture 4 Fundamentals of Digital Audio, Discrete-Time Signals
CMPT 318: Lecture 4 Fundamentals of Digital Audio, Discrete-Time Signals Tamara Smyth, tamaras@cs.sfu.ca School of Computing Science, Simon Fraser University January 16, 2006 1 Continuous vs. Discrete
More informationLecture 3 Concepts for the Data Communications and Computer Interconnection
Lecture 3 Concepts for the Data Communications and Computer Interconnection Aim: overview of existing methods and techniques Terms used: -Data entities conveying meaning (of information) -Signals data
More informationCHAPTER 4. PULSE MODULATION Part 2
CHAPTER 4 PULSE MODULATION Part 2 Pulse Modulation Analog pulse modulation: Sampling, i.e., information is transmitted only at discrete time instants. e.g. PAM, PPM and PDM Digital pulse modulation: Sampling
More informationSignal Processing Toolbox
Signal Processing Toolbox Perform signal processing, analysis, and algorithm development Signal Processing Toolbox provides industry-standard algorithms for analog and digital signal processing (DSP).
More informationTemplates and Image Pyramids
Templates and Image Pyramids 09/07/17 Computational Photography Derek Hoiem, University of Illinois Why does a lower resolution image still make sense to us? What do we lose? Image: http://www.flickr.com/photos/igorms/136916757/
More informationCHAPTER. delta-sigma modulators 1.0
CHAPTER 1 CHAPTER Conventional delta-sigma modulators 1.0 This Chapter presents the traditional first- and second-order DSM. The main sources for non-ideal operation are described together with some commonly
More information