ESE150 Spring 2018, University of Pennsylvania, Department of Electrical and Systems Engineering: Digital Audio Basics


University of Pennsylvania
Department of Electrical and Systems Engineering
Digital Audio Basics

ESE150, Spring 2018
Midterm
Wednesday, February 28

Exam ends at 5:50pm; begin as instructed (target 4:35pm).
Problems weighted as shown.
Calculators allowed.
Closed book = No text or notes allowed.
Provided reference materials on next to last page.
Show work for partial credit consideration.
Unless otherwise noted, answers to two significant figures are sufficient.
Sign Code of Academic Integrity statement (see last page for code).

I certify that I have complied with the University of Pennsylvania's Code of Academic Integrity in completing this exam.

Name: Solution

Problem   1   2   3   4   5   6   7   8   9   10   Total
Points    10  10  10  10  10  10  10  10  10  10   100

Average: 79, Std. Dev.: 18

1. Time-Domain data samples:

(a) How many bits are required to encode a single (mono) track of 4 minutes of 44KHz sampled audio with 16b samples? [4 points]

$4 \times 60 \times 44{,}000 \times 16 \approx 170{,}000{,}000 = 1.7 \times 10^{8}$ bits

(b) If we reduce the sample rate to 32KHz, how much will we also need to reduce the per sample quantization to halve the bits required for encoding? [3 points]

$\frac{32{,}000}{44{,}000} \times \alpha = \frac{1}{2}$, so $\alpha = 0.69$; quantization must be reduced to $\alpha \times 16 = 11$ bits.

(c) What kind of compression is this and why? [3 points]

Lossy. We are discarding information in the quantization (losing the distinction among $2^{5}$ different values) and losing the ability to accurately capture frequencies between 16KHz and 22KHz.
-1 if the answer does not identify what information is lost
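The arithmetic in parts (a) and (b) can be checked with a short sketch. The helper name `pcm_bits` is an assumption made for illustration and does not come from the exam.

```python
def pcm_bits(seconds, sample_rate_hz, bits_per_sample, channels=1):
    """Total bits needed for uncompressed PCM audio."""
    return seconds * sample_rate_hz * bits_per_sample * channels


# Part (a): 4 minutes, mono, 44 kHz, 16-bit samples.
full = pcm_bits(4 * 60, 44_000, 16)

# Part (b): the scale factor alpha on bits per sample that halves the total
# once the sample rate drops from 44 kHz to 32 kHz.
alpha = 0.5 * 44_000 / 32_000
reduced_bits_per_sample = round(alpha * 16)
reduced = pcm_bits(4 * 60, 32_000, reduced_bits_per_sample)

print(full, alpha, reduced_bits_per_sample, reduced / full)
```

The unrounded total is 168,960,000 bits, which the solution reports to two significant figures.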

2. For the following samples of a sine wave:

[Figure: sampled sine wave; vertical axis V from -1 to 1, horizontal axis time (ms) from 0 to 10]

(a) What is the frequency of the sine wave? [5 points]

350Hz
-1 for approximating frequency
-1 for misinterpreting and getting wrong magnitude (e.g. ms vs. s)

(b) What is the sample rate? [5 points]

10KHz (from 10 samples per ms)
-1 for error in last step of calculation
-1 for misinterpreting and getting wrong magnitude (e.g. ms vs. s)

3. Sample period and frequencies: The time for analogRead on the Arduino is 125µs.

Based on this, what is the upper bound for the achievable sample rate on the Arduino? [3 points]

$\frac{1}{0.000125} = 8$KHz

What is the upper bound on the highest frequency the Arduino sampling can accurately capture? [4 points]

4KHz. Nyquist frequency = half the sample rate.

Assuming no external filtering, what happens to a 6000 Hz tone? [3 points]

It becomes aliased to a lower frequency. Specifically, it will show up as a 2KHz tone, since $8000 - 6000 = 2000$ Hz.
-1 if aliasing is identified but not where/how the aliased signal ends up.
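The three answers above follow from two formulas, sketched here under the assumption of an ideal sampler with no anti-aliasing filter. The function names are illustrative only.

```python
def max_sample_rate(conversion_time_s):
    """Upper bound on sample rate when each conversion takes this long."""
    return 1.0 / conversion_time_s


def nyquist(sample_rate_hz):
    """Highest frequency that can be captured without aliasing."""
    return sample_rate_hz / 2.0


def aliased_frequency(tone_hz, sample_rate_hz):
    """Apparent frequency of a sampled tone, folded into [0, fs/2]."""
    nearest_multiple = sample_rate_hz * round(tone_hz / sample_rate_hz)
    return abs(tone_hz - nearest_multiple)


fs = max_sample_rate(125e-6)  # 125 microseconds per analogRead
print(fs, nyquist(fs), aliased_frequency(6000, fs))
```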

4. Categorize the following as lossy or lossless:

(a) storing (frequency, amplitude, phase) triples for the non-zero frequency elements [2 points]
lossless: can reconstruct by inferring that the non-stored cases are zero

(b) starting with 16b time-sampled data, and converting to a recording of (time, new amplitude) when changes occur [2 points]
lossless: can reconstruct the waveform by holding the value constant between changes

(c) starting with 16b samples, adding adjacent sample pairs and storing a single 17b value for each original pair of 16b samples [2 points]
lossy: not enough information to uniquely restore the two original samples in each pair

(d) starting with 16b time-sampled data and converting to store, for each sample, the difference relative to the value of the previous time sample [2 points]
lossless: can recover the original by integrating the values (summing together the sequence of changes)

(e) reporting all answers to 2 significant figures [2 points]
lossy: cannot recover lower order figures
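Cases (c) and (d) can be contrasted with a small sketch. The function names and sample values are made up for illustration; the encoder is assumed to store the first sample unchanged so that decoding has a starting point.

```python
from itertools import accumulate


def delta_encode(samples):
    """Keep the first sample, then each sample's difference from the previous one."""
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]


def delta_decode(deltas):
    """Recover the samples as the running sum of the differences."""
    return list(accumulate(deltas))


def pair_sums(samples):
    """Replace each adjacent pair of samples with its sum."""
    return [samples[i] + samples[i + 1] for i in range(0, len(samples) - 1, 2)]


samples = [100, 103, 101, 101, 98, 104]

# (d) Difference encoding round-trips exactly, so it is lossless.
roundtrip_ok = delta_decode(delta_encode(samples)) == samples

# (c) Two different pairs give the same sum, so the originals cannot be recovered.
collision = pair_sums([100, 103]) == pair_sums([101, 102])

print(roundtrip_ok, collision)
```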

5. Which encoding uses the fewest bits to encode this quote and why?

"the best is yet to come"

[Correct choice 5 points; Mention common case short encoding 5 points]

C. Make the common case inexpensive: C gives the commonly occurring symbols short encodings, while allowing less common symbols to have longer encodings. (B also makes more frequent cases shorter, but doesn't optimally assign lengths based on frequency; it gives some symbols too short an encoding, forcing too many symbols to be longer than necessary. Technically, C uses encodings close to the Shannon optimal length of $-\log_2(p)$.)

A: $4 \times 23 = 92$
B: $1 \times 5 + 10 \times 1 + 9 \times 1 + 3 \times 4 + 7 \times 1 + 6 \times 1 + 10 \times 1 + 5 \times 2 + 4 \times 2 + 2 \times 4 + 10 \times 1 = 95$
C: $2 \times 5 + 5 \times 1 + 5 \times 1 + 3 \times 4 + 5 \times 1 + 5 \times 1 + 5 \times 1 + 3 \times 2 + 4 \times 2 + 2 \times 4 + 5 \times 1 = 74$
D: $5 \times 5 + 3 \times 1 + 3 \times 1 + 5 \times 4 + 3 \times 1 + 3 \times 1 + 3 \times 1 + 5 \times 2 + 5 \times 2 + 5 \times 4 + 3 \times 1 = 103$

symbol   A     B            C      D
(space)  0000  0            01     11111
b        0001  00000000001  10100  000
c        0010  000000001    10101  001
e        0011  001          111    11101
h        0100  00000001     11010  010
i        0101  0000001      11011  011
m        0110  0000000001   11000  100
o        0111  00001        100    1100
s        1000  0001         1011   1101
t        1001  01           00     11100
y        1011  0000000000   11001  101

As given, the exam had two identical encodings. One should have had another 0, as shown. This did not affect the choice of correct answer. You could either ignore the fact that two were the same and just trust the lengths, or you could make one of those longer.
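As a rough check of the totals for A and C, the sketch below counts bits for the quote using the column C codewords from the table and compares them with the Shannon bound computed from the quote's own symbol frequencies. The variable and function names are assumptions; only the codewords come from the exam.

```python
from collections import Counter
from math import log2

quote = "the best is yet to come"

# Column C of the table above.
code_c = {
    " ": "01", "b": "10100", "c": "10101", "e": "111", "h": "11010",
    "i": "11011", "m": "11000", "o": "100", "s": "1011", "t": "00",
    "y": "11001",
}

bits_a = 4 * len(quote)                      # fixed 4 bits per symbol
encoded = "".join(code_c[ch] for ch in quote)
bits_c = len(encoded)

# Shannon bound: each symbol ideally costs -log2(p) bits.
counts = Counter(quote)
shannon_bound = sum(n * -log2(n / len(quote)) for n in counts.values())


def decode(bits, code):
    """Greedy decode; this works because no codeword is a prefix of another."""
    reverse = {word: symbol for symbol, word in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return "".join(out)


print(bits_a, bits_c, round(shannon_bound, 1), decode(encoded, code_c) == quote)
```

The bound lands a little under the 74 bits that C uses, which is consistent with the remark that C is close to the Shannon optimal lengths.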

6. Given:

$f(t) = 0.5\cos(2\pi \cdot 800t) + \sin(2\pi \cdot 1000t)$

give the first 5 time-sample values of f(t) for a 4KHz sample rate. [per sample 2 points]

sample  value
0       $0.5\cos(2\pi \cdot 800 \cdot 0.00025 \cdot 0) + \sin(2\pi \cdot 1000 \cdot 0.00025 \cdot 0) = 0.5 + 0 = 0.5$
1       $0.5\cos(2\pi \cdot 800 \cdot 0.00025 \cdot 1) + \sin(2\pi \cdot 1000 \cdot 0.00025 \cdot 1) = 0.15 + 1 = 1.15$
2       $0.5\cos(2\pi \cdot 800 \cdot 0.00025 \cdot 2) + \sin(2\pi \cdot 1000 \cdot 0.00025 \cdot 2) = -0.40 + 0 = -0.40$
3       $0.5\cos(2\pi \cdot 800 \cdot 0.00025 \cdot 3) + \sin(2\pi \cdot 1000 \cdot 0.00025 \cdot 3) = -0.40 - 1 = -1.40$
4       $0.5\cos(2\pi \cdot 800 \cdot 0.00025 \cdot 4) + \sin(2\pi \cdot 1000 \cdot 0.00025 \cdot 4) = 0.15 + 0 = 0.15$

Partial credit if the calculation is shown and one of the two components is correct.
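The sample values can be reproduced numerically. This is a minimal sketch that evaluates f(t) at t = n / 4000 for n = 0 to 4; the variable names are illustrative.

```python
from math import cos, pi, sin


def f(t):
    """The signal from problem 6."""
    return 0.5 * cos(2 * pi * 800 * t) + sin(2 * pi * 1000 * t)


sample_rate_hz = 4000
samples = [f(n / sample_rate_hz) for n in range(5)]

for n, value in enumerate(samples):
    print(n, round(value, 2))
```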

7. Sound Perception

(a) Assuming the following frequency components exist simultaneously, which has the least effect on perceived sound quality and why? [5 points]
i. amplitude 1, frequency 1500
ii. amplitude 0.3, frequency 1400
iii. amplitude 0.3, frequency 1600

The 1600 Hz tone. It is in the same critical band as the dominant 1500 Hz tone, so it is likely to be masked. The 1400 Hz tone is in a different critical band, so it will not be masked.

(b) Assuming the following tones all occur at 40dB, which will sound the loudest? [5 points]
i. 100 Hz
ii. 1,000 Hz
iii. 10,000 Hz

1,000 Hz: human hearing is most sensitive here. The lower and higher frequencies will be perceived as less loud.

8. In music video games (e.g., RockBand, Karaoke, or Guitar Hero), a singer earns points by matching the tune for a lyric track.

(a) Using what you know from this course, how can the game process recorded sound input to identify how well the singer is performing? (quality of singing = ability to sing the right notes at the right time) [6 points]

Perform a DFT on each time window to identify the frequencies at each point in time. Score based on how well the frequency matches at each time.
-2 for trying to do this in the time domain
-1 for only trying to match to critical band

(b) The singer will typically be singing along with background instrument tracks played by the game. The sound from these tracks will also be picked up by the microphone into which the singer sings. How can the game cope with the composite sound that includes both the background instruments and the singer's input? [4 points]

Subtract out the expected frequencies and amplitudes for the instruments. Ideally, design the accompaniment so the instrument frequencies do not overlap with the intended vocal frequencies, so subtracting them out will not affect the singer frequencies.
-1 if simply assuming the voice is louder than the instruments
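The approach in part (a) might be sketched as follows, using a naive DFT from the standard library so the example stays self-contained. The window size, the 20 Hz tolerance, and the scoring rule (fraction of windows whose strongest frequency is near the target note) are all assumptions made for illustration, not the game's or the course's actual method.

```python
import cmath
from math import pi, sin


def dft_magnitudes(window):
    """Magnitudes of the first half of a naive DFT (bins 0 .. n/2 - 1)."""
    n = len(window)
    return [
        abs(sum(x * cmath.exp(-2j * pi * k * i / n) for i, x in enumerate(window)))
        for k in range(n // 2)
    ]


def dominant_frequency(window, sample_rate_hz):
    """Frequency of the strongest non-DC bin in this window."""
    mags = dft_magnitudes(window)
    peak_bin = max(range(1, len(mags)), key=lambda k: mags[k])
    return peak_bin * sample_rate_hz / len(window)


def score(windows, target_notes_hz, sample_rate_hz, tolerance_hz=20.0):
    """Fraction of windows whose dominant frequency is close to the target note."""
    hits = sum(
        abs(dominant_frequency(w, sample_rate_hz) - target) <= tolerance_hz
        for w, target in zip(windows, target_notes_hz)
    )
    return hits / len(target_notes_hz)


def tone(freq_hz, sample_rate_hz, n):
    return [sin(2 * pi * freq_hz * i / sample_rate_hz) for i in range(n)]


fs, n = 8000, 256
sung = [tone(500, fs, n), tone(750, fs, n)]   # what the singer produced
print(score(sung, [500, 1000], fs))           # second note is off target
```

A real implementation would use an FFT and would need to handle harmonics and notes that fall between bins; this sketch only shows the window-by-window frequency comparison.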

9. Compare an SMS text message to cell phone audio. Assume a single SMS text message is 160 characters, where each character is 8b ASCII. Assume the 160 character message is equivalent to 2 seconds of spoken sound. Telephone quality audio is 8b samples at 8KHz.

(a) How much more compact is the SMS text message (ratio of bits required)? [6 points]

$\frac{2 \times 8000 \times 8}{8 \times 160} = 100$; the SMS text message uses only 1% of the bits required by the telephone quality audio sample.

(b) What information is lost when you substitute the SMS text message for the cell phone audio? [4 points]

Timing, speed, pauses; voice (speaker recognition); emotional intent (happy, sad, angry, humorous, ...)

10. Two of your friends both recorded a live historic speech (e.g., Jason Kelce in front of the Art Museum earlier this month?). During a key point in the speech, the person next to them yells loudly (40dB above the speech) around 1500 Hz.
One friend is recording raw PCM samples at CD-quality (16b, 44KHz).
The other is recording directly to an MP3.

(a) For the CD-quality recording [5 points]
i. Can you repair it? (remove the loud noise so listeners can hear the entire speech, including the key point when the person is yelling)
yes
ii. If not, why not? If you can, outline how?
Take the DFT; identify the frequency components for the yell and subtract them out. Leave the rest of the frequencies, particularly the speech, alone. Take the inverse DFT to convert back to PCM samples if appropriate (or store as DFT samples).
-2 if the answer does not say how to subtract out; -1 if trying to do this at the analog level rather than in the frequency domain

(b) For the MP3-encoded recording [5 points]
i. Can you repair it? (see above)
no (or maybe, but not as well)
ii. If not, why not? If you can, outline how?
No: During MP3 encoding, the yell will mask softer sound in the band. The MP3 encoder will remove these other frequencies during encoding since they cannot be heard. The information is lost.
Maybe: You could do the same thing as for the CD-quality encoding and remove the yell frequencies. You would only lose the frequencies in the same critical band as the yell. So, it should result in a better recording, but it may lose sounds that the CD-quality recording does not lose.
-2 if masking is not identified
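The CD-quality repair in part (a) can be sketched with a naive DFT and inverse DFT. Everything concrete here is an assumption for illustration: the speech is stood in for by a single quiet 500 Hz tone, the yell by a 1500 Hz tone 100 times larger in amplitude (40 dB), and the sample rate and window length are chosen so both tones fall exactly on DFT bins.

```python
import cmath
from math import pi, sin


def dft(x):
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * pi * k * i / n) for i in range(n))
            for k in range(n)]


def idft(spectrum):
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * pi * k * i / n) for k in range(n)) / n).real
            for i in range(n)]


def remove_band(samples, sample_rate_hz, low_hz, high_hz):
    """Zero every DFT bin whose frequency lies in [low_hz, high_hz], then invert."""
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        # Bins k and n - k mirror each other, so fold to the physical frequency.
        freq = min(k, n - k) * sample_rate_hz / n
        if low_hz <= freq <= high_hz:
            spectrum[k] = 0
    return idft(spectrum)


fs, n = 8000, 64
speech = [0.01 * sin(2 * pi * 500 * i / fs) for i in range(n)]
yell = [1.0 * sin(2 * pi * 1500 * i / fs) for i in range(n)]   # 40 dB louder
recorded = [s + y for s, y in zip(speech, yell)]

repaired = remove_band(recorded, fs, 1400, 1600)
max_error = max(abs(r - s) for r, s in zip(repaired, speech))
print(max_error)
```

Real speech overlaps the yell's band, so zeroing bins would also remove some speech energy there. The sketch only shows why the PCM recording still contains enough information to attempt the repair.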


Human auditory critical bands:

Band Number  Low (Hz)  High (Hz)
1            20        100
2            100       200
3            200       300
4            300       400
5            400       510
6            510       630
7            630       720
8            720       920
9            920       1080
10           1080      1270
11           1270      1480
12           1480      1720
13           1720      2000
14           2000      2320
15           2320      2700
16           2700      3150
17           3150      3700
18           3700      4400
19           4400      5300
20           5300      6400
21           6400      7700
22           7700      9500
23           9500      12000
24           12000     15500
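A lookup over this table shows why the 1600 Hz tone in problem 7 shares a band with the 1500 Hz tone while the 1400 Hz tone does not. The sketch assumes the bands are contiguous, so that band 10 ends at 1270 Hz where band 11 begins, and that each band includes its low edge but not its high edge. The names are illustrative.

```python
from bisect import bisect_right

# Band edges in Hz from the reference table; band k spans
# BAND_EDGES_HZ[k - 1] up to (not including) BAND_EDGES_HZ[k].
BAND_EDGES_HZ = [
    20, 100, 200, 300, 400, 510, 630, 720, 920, 1080, 1270, 1480, 1720,
    2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700, 9500, 12000, 15500,
]


def critical_band(freq_hz):
    """Return the band number (1 to 24) that contains freq_hz."""
    if not BAND_EDGES_HZ[0] <= freq_hz < BAND_EDGES_HZ[-1]:
        raise ValueError("frequency is outside the table")
    return bisect_right(BAND_EDGES_HZ, freq_hz)


for tone_hz in (1400, 1500, 1600):
    print(tone_hz, critical_band(tone_hz))
```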

Code of Academic Integrity

Since the University is an academic community, its fundamental purpose is the pursuit of knowledge. Essential to the success of this educational mission is a commitment to the principles of academic integrity. Every member of the University community is responsible for upholding the highest standards of honesty at all times. Students, as members of the community, are also responsible for adhering to the principles and spirit of the following Code of Academic Integrity.*

Academic Dishonesty Definitions

Activities that have the effect or intention of interfering with education, pursuit of knowledge, or fair evaluation of a student's performance are prohibited. Examples of such activities include but are not limited to the following definitions:

A. Cheating
Using or attempting to use unauthorized assistance, material, or study aids in examinations or other academic work or preventing, or attempting to prevent, another from using authorized assistance, material, or study aids. Example: using a cheat sheet in a quiz or exam, altering a graded exam and resubmitting it for a better grade, etc.

B. Plagiarism
Using the ideas, data, or language of another without specific or proper acknowledgment. Example: copying another person's paper, article, or computer work and submitting it for an assignment, cloning someone else's ideas without attribution, failing to use quotation marks where appropriate, etc.

C. Fabrication
Submitting contrived or altered information in any academic exercise. Example: making up data for an experiment, fudging data, citing nonexistent articles, contriving sources, etc.

D. Multiple Submissions
Submitting, without prior permission, any work submitted to fulfill another academic requirement.

E. Misrepresentation of academic records
Misrepresenting or tampering with or attempting to tamper with any portion of a student's transcripts or academic record, either before or after coming to the University of Pennsylvania. Example: forging a change of grade slip, tampering with computer records, falsifying academic information on one's resume, etc.

F. Facilitating Academic Dishonesty
Knowingly helping or attempting to help another violate any provision of the Code. Example: working together on a take-home exam, etc.

G. Unfair Advantage
Attempting to gain unauthorized advantage over fellow students in an academic exercise. Example: gaining or providing unauthorized access to examination materials, obstructing or interfering with another student's efforts in an academic exercise, lying about a need for an extension for an exam or paper, continuing to write even when time is up during an exam, destroying or keeping library materials for one's own use, etc.

* If a student is unsure whether his action(s) constitute a violation of the Code of Academic Integrity, then it is that student's responsibility to consult with the instructor to clarify any ambiguities.