Quarterly Progress and Status Report. Speech synthesizer control by smoothed step functions
Dept. for Speech, Music and Hearing
Quarterly Progress and Status Report

Speech synthesizer control by smoothed step functions
Liljencrants, J.
journal: STL-QPSR, volume: 10, number: 4, year: 1969
III. SPEECH SYNTHESIS

A. SPEECH SYNTHESIZER CONTROL BY SMOOTHED STEP FUNCTIONS

J. Liljencrants

It is an appealing notion that the movements of the speech organs may at some level in the control chain be initiated by step commands. Clearly, many muscular movements, not only in the speech apparatus, can be described as responses of an inertial system to a more or less complex set of step forces. The inertia is then not only mechanical but also due to neural propagation and other delays. The speech organ movements are eventually manifested in the appearance and movements of the formants in the speech wave. The experiment to be described here is a drastic shortcut across the whole set of nonlinear transformations from imaginary stepwise muscular excitations, over movements and area functions, to the speech wave. Thus the principle used here is to treat the formant parameters themselves as well-behaved step responses. Of course one may then not hope for more than a moderate approximation to natural speech, but the method is very well suited for a technical implementation of a synthesis-by-rule system.

The setup for the experiment consists of a CDC 1700 computer interfaced with the OVE II serial formant synthesizer and various equipment for operator control and monitoring. The initial work is to build up a library of typical formant frequency and excitation level values. For this purpose the operator works with a handle that can be moved over a plane surface. The handle has two sensors to convey its location to the computer, which plots a mark at the pertinent coordinates on a display oscilloscope. The plot on the oscilloscope shows, as a time-frequency diagram, the synthesis parameters in the stylized square wave form shown in Fig. III-A-1. A set of program control commands is displayed at the edge of the plot. By pointing at these with the handle the operator can insert, delete, or move data points, and select parameters to display.
The operator can now devise a pattern on the screen. If the special duration parameter 3R has been given two data points indicating the nominal beginning and end of the pattern, it may be stored as a library item in
Fig. III-A-1. Operator's plot showing vowel synthesis parameters in square wave form. The operator has just put the mark on the command word CHNP in order to initiate a change in a point. He has then selected a point on the F0 contour. A line goes from this point to the mark. Immediately above the mark the parameter name and the current location of the mark are displayed. Should the operator push the enable button for the handle, the point will be moved to the location of the mark.
the program. This is done from the computer keyboard, where it should also be given a name of one or two characters. Each of these library samples contains the following information:

a. The identifying name of the sample.
b. A pointer to the next sample in the library, also giving the length of the current sample. If this pointer is zero it indicates that the current sample is a dummy defining the end of the library.
c. A set of pointers, one for each parameter. Each of these pointers indicates the location within the sample of the value specifications for that parameter.
d. The nominal duration of the sample.
e. A table of variable size with the parameter values. Each entry here is one 16-bit machine word and corresponds to a data point in the plot. The first half of the word is the time relative to the beginning of the sample, and the second half is the frequency or level value.

The frequency codes stored are on a logarithmic basis with a 3% frequency increment to conform with the synthesizer hardware. The synthesis parameters and ranges used are given for reference in Table III-A-1. It should be noted that the smoothing operations discussed below are performed on this logarithmic frequency scale. The quantizing step of the time scale is 20 msec. A very important feature is that the time coordinate may define points both before and after the nominal time interval of the sample.

Giving commands from the computer keyboard, the operator can call down a library sample to the working area of the program, modify it using the handle, and reinsert it in the library, possibly with a different name. Within practical limits, set by core storage size and plot complexity, a library sample may contain an arbitrary number of data points. In many cases it is desirable to omit specification of certain parameters. If the operator after a special command types a sequence of library entry names, the corresponding information is assembled.
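The data-point word in item (e) can be sketched as follows. This is an illustrative reconstruction, not the original CDC 1700 machine code; the helper names and the even 8/8 split of the 16-bit word are assumptions based on the description above.

```python
def pack_point(time_units: int, value_code: int) -> int:
    """Pack one data point into a 16-bit word: time offset (in 20-msec
    quantizing steps) in the first half, frequency/level code in the second."""
    assert 0 <= time_units < 256 and 0 <= value_code < 256
    return (time_units << 8) | value_code

def unpack_point(word: int) -> tuple[int, int]:
    """Recover (time_units, value_code) from a packed 16-bit word."""
    return (word >> 8) & 0xFF, word & 0xFF
```

A round trip such as `unpack_point(pack_point(5, 42))` recovers `(5, 42)`, mirroring how one table word represents one data point in the plot.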
The relative time scale of the library patterns is then converted to absolute, making use of the duration values of the samples. In this process it often happens that data points in one sample come later than points in the following sample. When this is the case for a parameter, the square wave will make a twist backwards in time. The convention used in the following treatment of these cases is that the data points are used in sequence. When a time reversal occurs, the time value of the new sample is neglected, but its parameter value is taken. In case the folded-back portion contains more than one point, the intermediate points are skipped over.

The final step prior to controlling the synthesizer is to smooth the square wave pattern. This is done using second order lowpass filters, simulated in the computer using the z-transform technique. The filters have a small overshoot in their step response, with the complex frequency poles at (-0.781 ± j0.625)·f0, where f0 is the 3 dB cutoff frequency. The choice of second order filtering was arbitrary, but would apply to mechanical systems of the simple spring-mass-resistance type.

TABLE III-A-1. OVE II Control Parameters

Name        Range    Increment  Remarks
F0          Hz       3%         Fundamental
F1, F2, F3  Hz       3%         Vowel formants
A0          32 dB    0.5 dB     Vowel level
AC          28 dB    4 dB       Fricative level
AH          dB       4 dB       Aspirative level
AN          24 dB    8 dB       Nasal level
FN          Hz       12%        Nasal formant
FH          Hz       12%        F4 and part of KH
K0          Hz       3%         Fricative antiformant
K1, K2      Hz       3%         Fricative formants
B1, B2      100 Hz              Vowel formant bandwidths
B3, B4      200 Hz              Vowel formant bandwidths

For optional addenda to the circuits. From ref. (3).
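A minimal sketch of the smoothing filter described above: a second-order section with s-plane poles at (-0.781 ± j0.625)·2π·f0, mapped to the z-plane by z = exp(sT) and normalized to unit DC gain. The pole-mapping discretization and the 20-msec step are illustrative assumptions; the text does not give the original program's difference equation.

```python
import cmath

def make_smoother(f0_hz: float, dt: float = 0.02):
    """Second-order lowpass with s-plane poles (-0.781 ± j0.625)·2π·f0,
    discretized by pole mapping; DC gain normalized to 1."""
    p = cmath.exp((-0.781 + 0.625j) * 2 * cmath.pi * f0_hz * dt)
    a1, a2 = 2 * p.real, abs(p) ** 2
    g = 1.0 - a1 + a2                  # forces unit gain for a constant input
    y1 = y2 = 0.0
    def step(x: float) -> float:
        nonlocal y1, y2
        y = g * x + a1 * y1 - a2 * y2  # second-order difference equation
        y1, y2 = y, y1
        return y
    return step

# Step response: rises toward 1 with the small overshoot mentioned above.
smooth = make_smoother(f0_hz=5.0)
response = [smooth(1.0) for _ in range(200)]
```

With the damping ratio 0.781 implied by the pole positions, the step response overshoots by only about 2 percent, which matches the "small overshoot" the text describes.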
For each synthesis parameter the smoothing time constant is invariable, but it may differ substantially between the different parameters. Thus the excitation level parameters are given short time constants, the formant parameters longer, and the pitch parameter the longest. In the program these time constants may be arbitrarily adjusted in octave steps by the appropriate setting of a table of masks.

The output to the synthesizer is done at 10 msec intervals using an external interrupt clock for the timing of the program. As an alternative to this real time output, the control signals may be stored as a binary record on the disc storage. These data can then later be used to control a separate program that simulates the synthesizer and stores the computed speech output, also on the disc, from where it may ultimately be played back at the correct speed. An example of such a synthesis is shown in the spectrogram of Fig. III-A-3. When the operator initiates synthesis with a chain of characters indicating library items, only the character string is stored. The actual extraction of data and smoothing is done during the output using interlaced buffering. Thus relatively long coherent utterances may be synthesized without using an excessive amount of storage capacity. During these output operations the computer is busy between 30 and 40 percent of the time, while the remainder is spent idle waiting for interrupt signals.

A tentative library has been prepared for the speech sounds used in Swedish. Most of the data have been extracted by manual measurement and visual interpolation from a set of spectrograms. The speech material used was CV utterances with all Swedish consonants and the vowels [i:], [a:], [u:], all pronounced by a single, phonetically trained speaker. The library was set up on a phoneme basis. To minimize the number of characters in the input strings the phonetic codes have been restricted to one character when possible, otherwise two.
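The octave-step assignment of time constants might look like the following sketch. The base cutoff and the particular octave offsets are invented for illustration; the text gives only the ordering (excitation fastest, formants slower, pitch slowest), not the actual mask table.

```python
BASE_CUTOFF_HZ = 16.0   # illustrative base value, not from the original

# Octave steps below the base cutoff for each parameter (assumed values);
# a larger step means a lower cutoff, i.e. a longer time constant.
OCTAVE_STEP = {
    "A0": 0, "AC": 0, "AH": 0, "AN": 0,   # excitation levels: fastest
    "F1": 1, "F2": 1, "F3": 1,            # formants: slower
    "F0": 3,                              # pitch: slowest
}

def cutoff_hz(param: str) -> float:
    """Cutoff frequency after applying the parameter's octave step."""
    return BASE_CUTOFF_HZ / 2 ** OCTAVE_STEP[param]
```

Dividing by a power of two keeps all adjustments in exact octave steps, which is what a mask table naturally provides.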
The mixture of one and two character codes will necessitate a character delay in the input routine so that the correct choice can be made when the library names are searched. Obviously some care must be exercised in the selection of codes to avoid ambiguities. This does not seem to offer any special difficulties, and good mnemotechnic aid is given by common orthographic notations.
Fig. III-A-2. Top: Concatenated standard samples for a word. The vowel synthesis parameters and the fricative excitation level AC are shown. At some places, most clearly between S and Y, a formant specification of a sound temporarily overrules that of the next sound. This shows as a twist backwards in the square waves. Bottom: The same after smoothing, ready for output to the synthesizer. The smoothing time constants differ between parameters. The accent 2 pitch contour is derived from a single standard pattern called for with the character ".
Fig. III-A-3. Spectrogram of a short sentence. The synthesizer control parameters are smoothed step functions. Here the synthesizer was simulated on the computer.
The intonation contour is generated from special characters in the input. These characters denote library samples with zero nominal duration. The only other parameter specified here is F0, and the specification covers an interval of the order of 500 msec. Since none of the phoneme specifications contain any F0 information, the F0 pattern is superimposed without any interaction. At present it seems quite satisfactory to work with as few as two different word intonation patterns, one for each of accent 1 and accent 2 in Swedish. The sample corresponding to the first is a single rectangular pulse, and the other of two pulses of which the second is somewhat higher. Using these two alternatives the intonation patterns generated are fairly agreeable when the intonation markers are placed initially in the stressed words. However, when longer sentences are joined the result does become rather monotone. To relieve this a special sentence tone parameter was introduced. It is specified in only one library sample, where a single rectangular pulse is given. When used, this pulse is smoothed with a very long time constant, one second or more, and the result is added to the regular F0 contour.

In the synthesis work done with this principle so far, the input has been in the form of a conventional phonetic transcription, where the actual character set of course is different due to the computer typewriter limitation. Apart from the insertion of intonation stress markers, the following simple rules have been employed:

1. A stressed long vowel is made twice as long as normal by double typing.
2. A stressed short vowel is unchanged, but the consonant following it is made twice as long as normal. Should the consonant in question be a plosive, the first item in the double typing is substituted for a space, indicating a silent interval.
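The two typing rules can be sketched as a small function. The CV pairing, the plosive code set, and the function name are illustrative assumptions, not the original input convention in full.

```python
PLOSIVES = {"P", "T", "K", "B", "D", "G"}   # assumed code set for illustration

def type_stressed(vowel: str, consonant: str, long_vowel: bool) -> str:
    """Apply rules 1 and 2 above to a stressed vowel and following consonant."""
    if long_vowel:
        return vowel * 2 + consonant        # rule 1: double-type the vowel
    if consonant in PLOSIVES:
        return vowel + " " + consonant      # rule 2: space marks the silence
    return vowel + consonant * 2            # rule 2: double-type the consonant
```

For example, a stressed long vowel gives "AAT", while a stressed short vowel before a plosive gives "A T", the space indicating the silent interval before the plosion.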
Application of these rules might as an example give SN 'THEJ v TK'TOAO-AKGZHAVQ r/ 'DJUUBQSPQRSAyOONALLT for the words "synthetic talkers have a dubious personality". The phonetic codes used should be evident. The two character codes are underlined.

There are a number of obvious limitations with this scheme for mechanical synthesis. One may first come to think of the formant transition time constants, which in natural speech are rather far from invariant. Especially one might consider the labial plosives, where the transition rate is often very high. To some extent these matters can be taken care of by
the proper adjustment of the step timing. Another possibility to speed up transitions out from the smoothing filters is to feed them with high narrow pulses superimposed on the steps. It is however difficult to unite such operations with the demand for context-independent standard control patterns. Also, the present system does not allow for the complex nature of formant transitions due to compound mechanical movements frequently encountered in human speech (see Fant (1)). The /g/ and /k/ plosion loci are rather dependent on the following vowel. For this reason the experimental library originally contained front and back variants of these sounds. This however did not seem to give a significant improvement over the use of only the front variant. Perhaps the overall naturalness of the synthesis is too low to allow for this refinement.

Another undesirable consequence of the mechanical concatenation is that the syllable duration is unduly influenced by the number of phonemes per syllable. It has been proposed that, as an intermediate step between the concatenation and smoothing, all syllables should be stretched to an equal duration. The criterion of a syllable start might then be the onset of the voicing. After this operation the stressed syllables should be somewhat prolongated. This could conveniently be performed under control of the word intonation parameter. Initial experiments in this direction show promising results.

The synthesis control procedure outlined has a certain interest for computer vocal response applications. A system might be based on a synthesizer controlled by the outputs from hardware lowpass filters. Should the computer supply the information of the unsmoothed square waves, the data rate at this point may be estimated as 10 (phon./sec) × 6 (param. used) × (4 (param. select.) + 5 (data bits)) = 540 bits/sec. It also seems quite practicable to provide the library samples from a read-only memory with associated hardware.
In that case only the phonetic character string has to be supplied, and the data rate goes down to the order of 60 bits/sec. In both cases it is believed that even moderate computers could control several synthesizers in time sharing.

The control system described is of course not limited to the production of speechlike signals. It has also been successfully operated for the production of synthetic music.
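The two data-rate estimates work out as follows. The 6-bits-per-character breakdown for the second figure is an assumption that merely reproduces the order of magnitude given in the text.

```python
# Square-wave data supplied by the computer (first estimate above):
phones_per_sec = 10
params_used = 6
select_bits = 4          # parameter-select bits
data_bits = 5
rate = phones_per_sec * params_used * (select_bits + data_bits)
print(rate)              # 540 bits/sec

# Character-string input only (second estimate): roughly 6 bits per
# phonetic character at 10 phonemes/sec -- an assumed breakdown that
# reproduces the "order of 60 bits/sec" figure.
rate_rom = phones_per_sec * 6
print(rate_rom)          # 60 bits/sec
```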
Approximate program size (CDC 1700), written in machine code:

- Operator interface (keyboard, control handle, plots)
- Library service (modifications, listing)
- Communication with external programs (data transfers)
- Concatenation, smoothing, and synthesizer output
- Auxiliary tables
- Phoneme data library
- Buffer storage areas

Total: 7,850 words

References:

(1) Fant, G.: "Stops in CV-Syllables", STL-QPSR 4/1969 (this issue).
(2) Lindblom, B.: personal communication.
(3) Liljencrants, J.: "The OVE II Speech Synthesizer", IEEE Trans. on Audio and Electroacoustics, AU-16 (1968).
More informationA DEVICE FOR AUTOMATIC SPEECH RECOGNITION*
EVICE FOR UTOTIC SPEECH RECOGNITION* ats Blomberg and Kjell Elenius INTROUCTION In the following a device for automatic recognition of isolated words will be described. It was developed at The department
More informationVOICE QUALITY SYNTHESIS WITH THE BANDWIDTH ENHANCED SINUSOIDAL MODEL
VOICE QUALITY SYNTHESIS WITH THE BANDWIDTH ENHANCED SINUSOIDAL MODEL Narsimh Kamath Vishweshwara Rao Preeti Rao NIT Karnataka EE Dept, IIT-Bombay EE Dept, IIT-Bombay narsimh@gmail.com vishu@ee.iitb.ac.in
More informationNature of Noise source. soundsc (noise, 10000);
Noise Sources Voiceless aspiration can be produced with a noise source at the glottis. (also for voiceless sonorants, including vowels) Noise source that is filtered through VT cascade, so some resonance
More informationEpoch Extraction From Speech Signals K. Sri Rama Murty and B. Yegnanarayana, Senior Member, IEEE
1602 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 16, NO. 8, NOVEMBER 2008 Epoch Extraction From Speech Signals K. Sri Rama Murty and B. Yegnanarayana, Senior Member, IEEE Abstract
More informationSpeech Perception Speech Analysis Project. Record 3 tokens of each of the 15 vowels of American English in bvd or hvd context.
Speech Perception Map your vowel space. Record tokens of the 15 vowels of English. Using LPC and measurements on the waveform and spectrum, determine F0, F1, F2, F3, and F4 at 3 points in each token plus
More informationMultirate Signal Processing Lecture 7, Sampling Gerald Schuller, TU Ilmenau
Multirate Signal Processing Lecture 7, Sampling Gerald Schuller, TU Ilmenau (Also see: Lecture ADSP, Slides 06) In discrete, digital signal we use the normalized frequency, T = / f s =: it is without a
More informationENSEMBLE String Synthesizer
ENSEMBLE String Synthesizer by Max for Cats (+ Chorus Ensemble & Ensemble Phaser) Thank you for purchasing the Ensemble Max for Live String Synthesizer. Ensemble was inspired by the string machines from
More informationIIR Filter Design Chapter Intended Learning Outcomes: (i) Ability to design analog Butterworth filters
IIR Filter Design Chapter Intended Learning Outcomes: (i) Ability to design analog Butterworth filters (ii) Ability to design lowpass IIR filters according to predefined specifications based on analog
More informationLecture 5: Sinusoidal Modeling
ELEN E4896 MUSIC SIGNAL PROCESSING Lecture 5: Sinusoidal Modeling 1. Sinusoidal Modeling 2. Sinusoidal Analysis 3. Sinusoidal Synthesis & Modification 4. Noise Residual Dan Ellis Dept. Electrical Engineering,
More informationMel Spectrum Analysis of Speech Recognition using Single Microphone
International Journal of Engineering Research in Electronics and Communication Mel Spectrum Analysis of Speech Recognition using Single Microphone [1] Lakshmi S.A, [2] Cholavendan M [1] PG Scholar, Sree
More informationPsychology of Language
PSYCH 150 / LIN 155 UCI COGNITIVE SCIENCES syn lab Psychology of Language Prof. Jon Sprouse 01.10.13: The Mental Representation of Speech Sounds 1 A logical organization For clarity s sake, we ll organize
More information) #(2/./53 $!4! 42!.3-)33)/.!4! $!4! 3)'.!,,).' 2!4% ()'(%2 4(!. KBITS 53).' K(Z '2/50 "!.$ #)2#5)43
INTERNATIONAL TELECOMMUNICATION UNION )454 6 TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU $!4! #/--5.)#!4)/. /6%2 4(% 4%,%(/.%.%47/2+ 39.#(2/./53 $!4! 42!.3-)33)/.!4! $!4! 3)'.!,,).' 2!4% ()'(%2 4(!.
More informationSource-filter Analysis of Consonants: Nasals and Laterals
L105/205 Phonetics Scarborough Handout 11 Nov. 3, 2005 reading: Johnson Ch. 9 (today); Pickett Ch. 5 (Tues.) Source-filter Analysis of Consonants: Nasals and Laterals 1. Both nasals and laterals have voicing
More informationSignal Processing for Speech Applications - Part 2-1. Signal Processing For Speech Applications - Part 2
Signal Processing for Speech Applications - Part 2-1 Signal Processing For Speech Applications - Part 2 May 14, 2013 Signal Processing for Speech Applications - Part 2-2 References Huang et al., Chapter
More informationEE 225D LECTURE ON SYNTHETIC AUDIO. University of California Berkeley
University of California Berkeley College of Engineering Department of Electrical Engineering and Computer Sciences Professors : N.Morgan / B.Gold EE225D Synthetic Audio Spring,1999 Lecture 2 N.MORGAN
More informationApplication of The Wavelet Transform In The Processing of Musical Signals
EE678 WAVELETS APPLICATION ASSIGNMENT 1 Application of The Wavelet Transform In The Processing of Musical Signals Group Members: Anshul Saxena anshuls@ee.iitb.ac.in 01d07027 Sanjay Kumar skumar@ee.iitb.ac.in
More informationChanging the pitch of the oscillator. high pitch as low as possible, until. What can we do with low pitches?
The basic premise is that everything is happening between the power supply and the speaker A Changing the pitch of the oscillator lowest pitch 60 sec! as high as possible, then stay there high pitch as
More informationDetermination of instants of significant excitation in speech using Hilbert envelope and group delay function
Determination of instants of significant excitation in speech using Hilbert envelope and group delay function by K. Sreenivasa Rao, S. R. M. Prasanna, B.Yegnanarayana in IEEE Signal Processing Letters,
More informationReal-Time Digital Hardware Pitch Detector
2 IEEE TRANSACTIONS ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL. ASSP-24, NO. 1, FEBRUARY 1976 Real-Time Digital Hardware Pitch Detector JOHN J. DUBNOWSKI, RONALD W. SCHAFER, SENIOR MEMBER, IEEE,
More informationGeneral outline of HF digital radiotelephone systems
Rec. ITU-R F.111-1 1 RECOMMENDATION ITU-R F.111-1* DIGITIZED SPEECH TRANSMISSIONS FOR SYSTEMS OPERATING BELOW ABOUT 30 MHz (Question ITU-R 164/9) Rec. ITU-R F.111-1 (1994-1995) The ITU Radiocommunication
More informationMultiple Sound Sources Localization Using Energetic Analysis Method
VOL.3, NO.4, DECEMBER 1 Multiple Sound Sources Localization Using Energetic Analysis Method Hasan Khaddour, Jiří Schimmel Department of Telecommunications FEEC, Brno University of Technology Purkyňova
More informationLecture 3 Concepts for the Data Communications and Computer Interconnection
Lecture 3 Concepts for the Data Communications and Computer Interconnection Aim: overview of existing methods and techniques Terms used: -Data entities conveying meaning (of information) -Signals data
More informationTHREE-AXIS MORPHING WITH NONLINEAR WAVESHAPERS FREQUENCY +/- 8V SELECT FM/EXT IN AC 10VPP OSC A LINK FREQUENCY MODE SELECT OSC B CV +/- 8V MICRO SD
PISTON HONDA DUAL WAVETABLE OSCILLATOR THREE-AXIS MORPHING WITH NONLINEAR WAVESHAPERS FREQUENCY SYN C 0-5V MODE SELECT CV +/- 8V PRESET/EDIT 1V/OCT 0-8V CV +/- 8V FM/EXT IN AC 10VPP OSC A LINK FREQUENCY
More informationGLOTTAL EXCITATION EXTRACTION OF VOICED SPEECH - JOINTLY PARAMETRIC AND NONPARAMETRIC APPROACHES
Clemson University TigerPrints All Dissertations Dissertations 5-2012 GLOTTAL EXCITATION EXTRACTION OF VOICED SPEECH - JOINTLY PARAMETRIC AND NONPARAMETRIC APPROACHES Yiqiao Chen Clemson University, rls_lms@yahoo.com
More informationDELTA MODULATION. PREPARATION principle of operation slope overload and granularity...124
DELTA MODULATION PREPARATION...122 principle of operation...122 block diagram...122 step size calculation...124 slope overload and granularity...124 slope overload...124 granular noise...125 noise and
More informationModule 3: Physical Layer
Module 3: Physical Layer Dr. Associate Professor of Computer Science Jackson State University Jackson, MS 39217 Phone: 601-979-3661 E-mail: natarajan.meghanathan@jsums.edu 1 Topics 3.1 Signal Levels: Baud
More informationSubtractive Synthesis & Formant Synthesis
Subtractive Synthesis & Formant Synthesis Prof Eduardo R Miranda Varèse-Gastprofessor eduardo.miranda@btinternet.com Electronic Music Studio TU Berlin Institute of Communications Research http://www.kgw.tu-berlin.de/
More informationReference Manual SPECTRUM. Signal Processing for Experimental Chemistry Teaching and Research / University of Maryland
Reference Manual SPECTRUM Signal Processing for Experimental Chemistry Teaching and Research / University of Maryland Version 1.1, Dec, 1990. 1988, 1989 T. C. O Haver The File Menu New Generates synthetic
More informationData Communications & Computer Networks
Data Communications & Computer Networks Chapter 3 Data Transmission Fall 2008 Agenda Terminology and basic concepts Analog and Digital Data Transmission Transmission impairments Channel capacity Home Exercises
More informationLand and Coast Station Transmitters Operating in the Band khz
Issue 3 January 2016 Spectrum Management Radio Standards Specification Land and Coast Station Transmitters Operating in the Band 200-535 khz Aussi disponible en français CNR-117 Preface Radio Standards
More informationSpeech/Non-speech detection Rule-based method using log energy and zero crossing rate
Digital Speech Processing- Lecture 14A Algorithms for Speech Processing Speech Processing Algorithms Speech/Non-speech detection Rule-based method using log energy and zero crossing rate Single speech
More informationSynchronous Overlap and Add of Spectra for Enhancement of Excitation in Artificial Bandwidth Extension of Speech
INTERSPEECH 5 Synchronous Overlap and Add of Spectra for Enhancement of Excitation in Artificial Bandwidth Extension of Speech M. A. Tuğtekin Turan and Engin Erzin Multimedia, Vision and Graphics Laboratory,
More informationFundamentals of Digital Audio *
Digital Media The material in this handout is excerpted from Digital Media Curriculum Primer a work written by Dr. Yue-Ling Wong (ylwong@wfu.edu), Department of Computer Science and Department of Art,
More informationResults of Egan and Hake using a single sinusoidal masker [reprinted with permission from J. Acoust. Soc. Am. 22, 622 (1950)].
XVI. SIGNAL DETECTION BY HUMAN OBSERVERS Prof. J. A. Swets Prof. D. M. Green Linda E. Branneman P. D. Donahue Susan T. Sewall A. MASKING WITH TWO CONTINUOUS TONES One of the earliest studies in the modern
More informationOn the glottal flow derivative waveform and its properties
COMPUTER SCIENCE DEPARTMENT UNIVERSITY OF CRETE On the glottal flow derivative waveform and its properties A time/frequency study George P. Kafentzis Bachelor s Dissertation 29/2/2008 Supervisor: Yannis
More information