JOURNAL OF OBJECT TECHNOLOGY

Online at http://www.jot.fm. Published by ETH Zurich, Chair of Software Engineering. JOT, 2009. Vol. 9, No. 1, January-February 2010

The Discrete Fourier Transform, Part 5: Spectrogram

By Douglas Lyon

Douglas A. Lyon: The Discrete Fourier Transform, Part 5: Spectrogram, in Journal of Object Technology, vol. 9, no. 1, January-February 2010, pp. 15-24, http://www.jot.fm/issues/issue_2010_01/column2/

Abstract

This paper is part 5 in a series of papers about the Discrete Fourier Transform (DFT) and the Inverse Discrete Fourier Transform (IDFT). The focus of this paper is on the spectrogram. The spectrogram performs a Short-Time Fourier Transform (STFT) in order to estimate the spectrum of a signal as a function of time. The approach requires that each time segment be transformed into the frequency domain after it is windowed. Overlapping windows temporally isolate the signal by amplitude modulation with an apodizing function. The selection of overlap parameters is done on an ad-hoc basis, as is the apodizing function selection. This report is a part of project Fenestratus, from the skunk-works of DocJava, Inc. Fenestratus comes from the Latin and means "to furnish with windows".

1 INTRODUCTION TO SHORT-TIME SPECTRAL ESTIMATION

A spectrogram (sometimes called a spectral waterfall, sonogram, voiceprint or voicegram) is a visual rendering of the harmonics of an input signal as a function of time. From an implementation point of view, this means windowing an input signal, taking the Fourier transform and then displaying it. The window is translated in time across the signal. Window size, sample overlap, the sample rate, number of bits per sample, etc., are application-specific design parameters.

Suppose, for example, we are given a tuning fork that produces a tone at 440 Hz. To compute the period of the waveform, in milliseconds, we divide 1000 by 440 Hz to obtain 2.27 milliseconds. The capture/display oscilloscope (written in Java) displays the output in Figure 1.
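The window, transform and translate steps described in the introduction can be sketched in a few lines of Java. This is a minimal illustration, not the code behind the figures: the class and method names are hypothetical, the window is rectangular, the transform is a direct O(N^2) DFT rather than an FFT, and the 1/N scaling of the PSD is an assumption.

```java
// Illustrative sketch only: names are hypothetical and not taken from the paper's code.
final class StftSketch {

    /** One second of a unit-amplitude sine tone. */
    static double[] sine(double hz, int sampleRate) {
        double[] x = new double[sampleRate];
        for (int t = 0; t < x.length; t++) {
            x[t] = Math.sin(2.0 * Math.PI * hz * t / sampleRate);
        }
        return x;
    }

    /** PSD of one frame via a direct DFT; buckets 0..N/2-1 span 0 Hz up to half the sample rate. */
    static double[] psd(double[] frame) {
        int n = frame.length;
        double[] power = new double[n / 2];
        for (int k = 0; k < power.length; k++) {
            double re = 0.0, im = 0.0;
            for (int t = 0; t < n; t++) {
                double angle = 2.0 * Math.PI * k * t / n;
                re += frame[t] * Math.cos(angle);
                im -= frame[t] * Math.sin(angle);
            }
            power[k] = (re * re + im * im) / n; // assumed scaling
        }
        return power;
    }

    /** Translates a rectangular window across the signal, producing one PSD row per position. */
    static double[][] spectrogram(double[] signal, int windowSize, int hop) {
        int frames = signal.length < windowSize ? 0 : 1 + (signal.length - windowSize) / hop;
        double[][] rows = new double[frames][];
        double[] frame = new double[windowSize];
        for (int f = 0; f < frames; f++) {
            System.arraycopy(signal, f * hop, frame, 0, windowSize);
            rows[f] = psd(frame);
        }
        return rows;
    }

    /** Index of the strongest bucket in one PSD row. */
    static int peakBucket(double[] power) {
        int best = 0;
        for (int k = 1; k < power.length; k++) {
            if (power[k] > power[best]) best = k;
        }
        return best;
    }

    public static void main(String[] args) {
        int sampleRate = 8000, windowSize = 256, hop = 128; // 50% overlap
        double[][] rows = spectrogram(sine(440.0, sampleRate), windowSize, hop);
        int k = peakBucket(rows[0]);
        System.out.println("strongest bucket " + k + " at "
                + (k * (double) sampleRate / windowSize) + " Hz");
    }
}
```

For a 440 Hz sine at 8000 samples per second and a 256-sample window, the strongest bucket is index 14, whose center is 14 × 8000/256 = 437.5 Hz. A production analyzer would replace the direct DFT with an FFT and apply an apodizing window before the transform.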

Figure 1. The Tuning Fork

It is clear from Figure 1 that the tuning fork waveform is not sinusoidal. There are several possible reasons for this: the tuning fork is in direct contact with the microphone (it is not very loud); there is some ambient noise in the input; the digitization is only 8 bits and uses µ-law coding; and, finally, the tuning fork waveform itself may not be sinusoidal.

Figure 2. The Spectrogram Display Control

Figure 2 shows the spectrogram display control panel. It enables the user to select a Power Spectral Density (PSD), the real part or imaginary part of the FFT or Normal.

Figure 3. The Spectrogram Control Panel

Figure 3 shows the control over the windowing used for the spectrogram. The window functions are described in Part 4 of this series of articles [Lyon 09]. The window size is given in samples. The log scale allows for the visualization of the subtler harmonic content of the input signal.

Figure 4. The Spectrogram of the Tuning Fork

Without the log scale, the 440 Hz fork looks like it has 3rd harmonic content (440*3 = 1320 Hz).
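The effect of a log scale can be illustrated with a short Java sketch. The paper does not state which logarithmic mapping its display uses, so the choice here (decibels relative to the strongest bucket, clamped at a display floor) is an assumption, and the class and method names are hypothetical.

```java
// Illustrative sketch only: the dB-relative-to-peak mapping is an assumption, not the paper's method.
final class LogScaleSketch {

    /** Maps PSD values to decibels relative to the strongest bucket, clamped at floorDb. */
    static double[] toDecibels(double[] power, double floorDb) {
        double max = 0.0;
        for (double p : power) {
            max = Math.max(max, p);
        }
        double[] db = new double[power.length];
        for (int k = 0; k < power.length; k++) {
            double ratio = (max > 0.0) ? power[k] / max : 0.0;
            // log10 of zero is -infinity, so empty buckets are pinned to the floor
            db[k] = (ratio > 0.0) ? Math.max(floorDb, 10.0 * Math.log10(ratio)) : floorDb;
        }
        return db;
    }

    public static void main(String[] args) {
        double[] power = {1.0, 0.01, 0.0001, 0.0};
        for (double d : toDecibels(power, -60.0)) {
            System.out.println(d + " dB");
        }
    }
}
```

On a linear display a bucket holding one ten-thousandth of the peak power is invisible; on this scale it sits at about -40 dB, well above a -60 dB floor, which is why weaker harmonics only become visible in the log view.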

Figure 5. The Spectrogram in Log Scale

The log scale shows power at several harmonics (110, 220, 440, 880, 1320, and 1760 Hz). With a window of 256 samples and a spread of 4 kHz (at 8000 samples per second), bucket centers lie 8000/256 = 31.25 Hz apart, so a frequency read from the nearest bucket is quantized to within 4000/256 = 15.625 Hz. Further, this is not a pure tone, as the waveform suggests. To compensate, we use a 440 Hz sine wave, and compute the spectrogram.
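The bucket arithmetic can be checked with a few lines of Java. The sketch assumes the usual DFT convention that bucket k is centered at k × (sample rate / window size), with no zero padding; the class and method names are hypothetical. Under that assumption, a 256-sample window at 8000 samples per second puts bucket centers 31.25 Hz apart, so a nearest-bucket reading is off by at most half of that, which is the 15.625 Hz figure quoted above.

```java
// Illustrative sketch only: assumes bucket k is centered at k * sampleRate / windowSize.
final class BucketSketch {

    /** Distance in Hz between adjacent bucket centers. */
    static double spacingHz(double sampleRate, int windowSize) {
        return sampleRate / windowSize;
    }

    /** Largest possible error when a tone is reported at its nearest bucket center. */
    static double worstCaseErrorHz(double sampleRate, int windowSize) {
        return spacingHz(sampleRate, windowSize) / 2.0;
    }

    /** Index of the bucket whose center is closest to the given frequency. */
    static int nearestBucket(double hz, double sampleRate, int windowSize) {
        return (int) Math.round(hz / spacingHz(sampleRate, windowSize));
    }

    /** Center frequency of a bucket. */
    static double centerHz(int bucket, double sampleRate, int windowSize) {
        return bucket * spacingHz(sampleRate, windowSize);
    }

    public static void main(String[] args) {
        double sampleRate = 8000.0;
        int windowSize = 256;
        int k = nearestBucket(440.0, sampleRate, windowSize);
        System.out.println("spacing     = " + spacingHz(sampleRate, windowSize) + " Hz");
        System.out.println("worst error = " + worstCaseErrorHz(sampleRate, windowSize) + " Hz");
        System.out.println("440 Hz falls in bucket " + k + ", centered at "
                + centerHz(k, sampleRate, windowSize) + " Hz");
    }
}
```

With these parameters a 440 Hz tone lands in bucket 14, centered at 437.5 Hz.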

Figure 6. Spectrogram of a 440 Hz Sine Wave

Figure 6 shows a log display of the PSD for a 440 Hz sine wave with a rectangular window that has 256 samples and 50% overlap. The most powerful harmonic is reported at 437 Hz (which is only 3 Hz away from the actual 440 Hz frequency). Also, because the window is 256 samples long at an 8 kHz sampling rate, the duration of the window is 1000*256/8000 = 32 ms. This gives a reasonable frequency and temporal resolution for our voice-grade audio analyzer.

The less-than-pure-tone tuning fork has its two strongest harmonics at 125 Hz and 437 Hz. The fourth subharmonic of the 440 Hz signal is 110 Hz, and 125 Hz - 110 Hz is 15 Hz, which is right at the limit of our 15.6 Hz bucket quantum. An increase in the size of the window (and length of the sample) should enable an improved frequency resolution, at the expense of temporal resolution.

In our last test, we increase the window size to 1024 samples. The two strongest buckets are 437 Hz and 445 Hz. These are adjacent buckets, 8000/1024 = 7.8 Hz apart, which is consistent with our quantization frequency (4000 samples per second / 1024 samples = 3.9 Hz on either side of a bucket center). In fact, if you average the 437 and 445 buckets together, you get 441 Hz (which is really very close to 440 Hz). Considering this is an old, abused tuning fork, which may not be 440 Hz, our analyzer seems to be doing pretty well. However, a 1024-sample window lasts 1000*1024/8000 = 128 ms, and short note events do not last that long.

2 TUNING A GUITAR

The guitar is an interesting instrument for tonal identification. It can have rich harmonics (i.e., you can play chords), it can be monophonic (i.e., it can be played just one note at a time) and it is easily tuned. I tune my guitar using the (440 Hz) tuning fork on the A string. The rest of the guitar is tuned by ear, using the fret board for relative tuning. We take the approach of playing open strings, using a 1024-sample window with 10% overlap, and compare our guitar with the published tuning standard for guitars [Vaughns].

Note Name   Frequency (Hz)   Measured (Hz)
E           82               78
A           110              109
D           146              148
G           196              195
B           247              250
E           330              328

Figure 7. Standard vs. Measured Frequencies

Figure 7 shows the standard frequencies vs. the measured frequencies, for several note names. The nature of the guitar makes it rich in harmonics. For example, a second harmonic of G (390 Hz) is sometimes dominant in the output. Considering our predicted bucket size is 3.9 Hz, Figure 7 shows the measured frequencies to be within the bucket spread. We define the bucket spread as the center frequency ± 1 bucket frequency quantum. So, 82 Hz ± 3.9 Hz is a range from 78 Hz to 86 Hz. Clearly, to reduce the bucket spread, we need more samples (either longer events or a faster sample rate or both).

3 A MAJOR SCALE

In this section we play a major scale (do re mi fa sol la ti do) and see how well our analysis is able to do, based on the spectrogram. For this experiment, we use a window of 256 samples and so expect a bucket quantum of 4000/256 = 15.6 Hz, which shows what happens when the number of samples is too small.

Note Name   Frequency (Hz)   Measured (Hz)
C           261              250
D           294              281
E           329              343
F           349              343
G           392              406
A           440              437
B           493              500
C           523              531

Figure 8. Specified and Measured Frequencies

Figure 8 shows the specified and measured frequencies. Note that E and F are only 20 Hz away from one another, so the midpoint between them is only 10 Hz from either note, and both are reported at the same 343 Hz. It is clear we need more samples, or another algorithm to determine the strongest harmonic.
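The collision between E and F in Figure 8 can be reproduced numerically. The sketch below assumes that bucket centers sit at multiples of sample rate / window size (8000/256 = 31.25 Hz here), that each note is reported at its nearest bucket center, and that the reading is truncated to whole hertz. The class and method names are hypothetical.

```java
// Illustrative sketch only: assumes nearest-bucket reporting with centers at k * sampleRate / windowSize.
final class ScaleBucketSketch {

    static final String[] NOTE_NAMES = {"C", "D", "E", "F", "G", "A", "B", "C"};

    // Specified frequencies from Figure 8, in Hz.
    static final double[] SPECIFIED_HZ = {261, 294, 329, 349, 392, 440, 493, 523};

    /** Nearest bucket center for a note, truncated to whole hertz. */
    static int predictedHz(double noteHz, double sampleRate, int windowSize) {
        double spacing = sampleRate / windowSize;
        long bucket = Math.round(noteHz / spacing);
        return (int) (bucket * spacing);
    }

    public static void main(String[] args) {
        for (int i = 0; i < NOTE_NAMES.length; i++) {
            System.out.println(NOTE_NAMES[i] + "  specified " + (int) SPECIFIED_HZ[i]
                    + " Hz, predicted reading " + predictedHz(SPECIFIED_HZ[i], 8000.0, 256) + " Hz");
        }
    }
}
```

Under these assumptions the predicted readings match the Measured column of Figure 8 for all eight notes, including E and F both landing at 343 Hz: with a 256-sample window the two notes share one bucket, so no peak-picking rule on that grid can separate them.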

Figure 9. Spectrogram of a Major Scale

Figure 9 shows the 256-sample window spectrogram. This demonstrates that, to the human eye, the E and F notes (3rd and 4th notes) are very close to one another on the graph and thus hard to tell apart.

4 INCREASING THE NUMBER OF SAMPLES

There are two ways to increase the number of samples for the signal: either increase the sample rate or increase the signal duration. In this section, we increase the signal duration to over 1 second per note, and then increase the window size to 1024 samples, with 10% overlap and a rectangular window function.
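The timing consequences of these parameter choices can be worked out with a short Java sketch. The window duration follows the 1000 × samples / sample rate formula used earlier for the 32 ms window. The paper does not say how an overlap percentage is converted into a step between successive windows, so the rounding rule below is an assumption, and the class and method names are hypothetical.

```java
// Illustrative sketch only: the overlap-to-hop rounding rule is an assumption.
final class WindowTimingSketch {

    /** Duration of one window in milliseconds. */
    static double durationMs(int windowSize, double sampleRate) {
        return 1000.0 * windowSize / sampleRate;
    }

    /** Step between successive windows, in samples, for a fractional overlap such as 0.10. */
    static int hop(int windowSize, double overlapFraction) {
        return Math.max(1, (int) Math.round(windowSize * (1.0 - overlapFraction)));
    }

    /** Number of complete windows that fit in a signal of totalSamples samples. */
    static int frameCount(int totalSamples, int windowSize, int hop) {
        return totalSamples < windowSize ? 0 : 1 + (totalSamples - windowSize) / hop;
    }

    public static void main(String[] args) {
        double sampleRate = 8000.0;
        int oneSecond = 8000;
        int[] sizes = {256, 1024};
        double[] overlaps = {0.50, 0.10};
        for (int i = 0; i < sizes.length; i++) {
            int h = hop(sizes[i], overlaps[i]);
            System.out.println(sizes[i] + " samples: " + durationMs(sizes[i], sampleRate)
                    + " ms per window, hop " + h + " samples, "
                    + frameCount(oneSecond, sizes[i], h) + " windows per second of signal");
        }
    }
}
```

At 8000 samples per second a 1024-sample window lasts 128 ms, four times the 32 ms of a 256-sample window, and with 10% overlap only a handful of windows fit in each second of signal. That is the temporal resolution given up in exchange for narrower buckets.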

Figure 10. Long Duration Notes

Figure 10 shows the spectrogram for a guitar playing the major scale, in the key of C, with notes that last for longer than 1 second.

Note Name   Frequency (Hz)   256 Window (Hz)   1024 Window (Hz)
C           261              250               257
D           294              281               289
E           329              343               335
F           349              343               351
G           392              406               398
A           440              437               445
B           493              500               500
C           523              531               531

Figure 11. Comparison of 256 and 1024 Sample Windows

Figure 11 shows the expected improvement in precision when the window (and signal) sizes increase. The ability to tell the difference between the E and the F notes has been greatly improved. The trade-off between time and frequency precision is also made far more clear.

5 SUMMARY

This paper shows how Java can be used to drive home the lesson that temporal and frequency precision are an engineering trade-off that can drive system design. Using a short-time Fourier transform, we demonstrated, both numerically and graphically, how window size, sample rate and duration impact our ability to identify pitch events on an input signal.

Spectrograms are not new [Bartlett]. Nor, for that matter, are the windows for doing harmonic analysis [Harris]. In fact, the CMU Sphinx project has done this type of processing in the past [CMU]. However, the Sphinx project used native methods. What is new is the use of Java (without any native methods) for performing this type of processing. Chris Lauer has a pure Java Sonogram project [Lauer], but this does not attempt to digitize directly from the microphone. Nor does it attempt to do note identification.

There are several methods available for improving our ability to identify guitar notes. For example, each note has a unique harmonic signature. Presently, we only take the strongest harmonic. However, if we look at several harmonics, we could take advantage of individual note harmonic signatures. Extending the harmonic signature argument, we might do well to cross-correlate the input signal with a bank of sampled notes, thus performing a kind of pattern recognition. The question of which approach is better remains open.

REFERENCES

[Bartlett] M. S. Bartlett, Periodogram Analysis and Continuous Spectra, Biometrika 37, 1-16, 1950.

[CMU] http://cmusphinx.sourceforge.net Last accessed 8/15/09.

[Harris] Fredric J. Harris, On the use of Windows for Harmonic Analysis with the Discrete Fourier Transform, Proceedings of the IEEE, V66N1, Jan. 1978, pp. 51-83.

[Lauer] Chris Lauer's Sonogram Project, http://sourceforge.net/projects/sonogram/ Last accessed 8/14/09.

[Lyon 09] Douglas Lyon, The Discrete Fourier Transform: Part 4 The Spectral Leakage, Journal of Object Technology, vol. 8, no. 7, November-December 2009, pp. 23-34. http://www.jot.fm/issues/issue_2009_11/column2/

[Vaughns] Musical Note Frequencies - Guitar and Piano, http://www.vaughns-1-pagers.com/music/musical-note-frequencies.htm Last accessed 8/14/09.

About the author

Douglas A. Lyon (M'89-SM'00) received the Ph.D., M.S. and B.S. degrees in computer and systems engineering from Rensselaer Polytechnic Institute (1991, 1985 and 1983). Dr. Lyon has worked at AT&T Bell Laboratories at Murray Hill, NJ and the Jet Propulsion Laboratory at the California Institute of Technology, Pasadena, CA. He is currently the co-director of the Electrical and Computer Engineering program at Fairfield University, in Fairfield, CT, a senior member of the IEEE and President of DocJava, Inc., a consulting firm in Connecticut. Dr. Lyon has authored or co-authored three books (Java Digital Signal Processing, Image Processing in Java and Java for Programmers). He has authored over 40 journal publications. Email: lyon@docjava.com. Web: http://www.docjava.com.