Chapter 6: Room Acoustics and 3D Sound Processing


Sound in Rooms

The shape, dimensions and wall surface structure of a room all affect the sounds within it. How they do so is the subject of room acoustics.

Figure: a process of sound generation and perception in a reverberant room.

Direct Sound and Reverberant Sound

The sound process in a room can be explained using a ray model. Rays from the source travel outward in diverging directions. At each encounter with a boundary of the room, a ray is partly absorbed and partly reflected. Sound arriving directly from the source is called direct sound; sound that has undergone one or more reflections is called reverberant sound.

Room Acoustics

Anechoic Chambers, Reverberant Chambers and Live Rooms

If the direct sound wave predominates almost everywhere, the room is anechoic (echo-free); rooms designed this way are anechoic chambers. Rooms designed so that the reverberant wave predominates overwhelmingly are called reverberant chambers. The rooms we live in are neither anechoic nor reverberant chambers: they fall in between, with a certain amount of reverberant effect, and are called reverberant or live rooms.

Impulse Response of a Reverberant Room

Using a loudspeaker as the sound source and a microphone as the pickup, we can measure the impulse response of a reverberant room.

Figure: a typical, representative measurement - the impulse response of a rectangular room.

The first impulse at the microphone is the direct sound; the second, smaller impulse is the first reflection from the surface closest to the microphone; the third, fourth and subsequent impulses are sounds that have undergone one or more reflections. The reflected sounds become progressively smaller because each additional encounter with a surface absorbs part of their energy. This process varies from room to room, depending on the room's acoustic properties.

To specify the acoustic property of a room, one important parameter is always used: the reverberation time, defined as the time required for the sound pressure level to drop 60 dB from its initial level.

The acoustic intensity at any point in the room can build up to higher values than would exist if the source were operated in the open air; in other words, the sound grows in the room. For a given enclosure this gain is nearly proportional to the reverberation time, so a long reverberation time is desirable if a weak source of sound is to be audible everywhere in the room. On the other hand, reverberant acoustic energy tends to mask the immediate recognition of any new sound, and since the reverberation time is a direct measure of the persistence of such sounds, a short reverberation time is desirable to minimize masking effects. The best reverberation time for a particular room is therefore a compromise.
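In practice the reverberation time is often estimated from a measured impulse response. As an illustrative sketch (the function name and the synthetic decay below are our own, not from the chapter), Schroeder backward integration of the squared impulse response gives a smooth energy decay curve whose slope yields RT60:

```python
import numpy as np

def rt60_schroeder(ir, fs):
    """Estimate RT60 from an impulse response via Schroeder backward
    integration, fitting the slope of the -5..-25 dB decay region."""
    energy = ir.astype(float) ** 2
    edc = np.cumsum(energy[::-1])[::-1]          # backward-integrated energy
    edc_db = 10.0 * np.log10(edc / edc[0])       # energy decay curve in dB
    t = np.arange(len(ir)) / fs
    mask = (edc_db <= -5.0) & (edc_db >= -25.0)
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)  # dB per second
    return -60.0 / slope

# Synthetic room response: noise with an exponential envelope that
# falls 60 dB in 0.5 s, i.e. a known RT60 of 0.5 s.
fs = 8000
t = np.arange(int(fs * 1.0)) / fs
rng = np.random.default_rng(0)
ir = rng.standard_normal(len(t)) * 10 ** (-3.0 * t / 0.5)
print(round(rt60_schroeder(ir, fs), 2))
```

The backward integration smooths the random fluctuations of the raw squared response, which is why the slope fit is done on the integrated curve rather than on the instantaneous energy.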

Sabine Theory

When a source of sound is started in a reverberant room, reflections at the walls produce a sound energy distribution that becomes more and more uniform with increasing time. Ultimately, except close to the source or to absorbing surfaces, this energy distribution may be assumed to be completely uniform and to have essentially random local directions of flow.

Reverberation Time

If the sound source in a reverberant room with a uniformly diffuse sound field is turned off at t = 0, the pressure at any later time decays as

    P_er^2(t) = P_er^2(0) exp(-t / tau_E)

where P_er is the spatially averaged effective pressure amplitude of the reverberant sound field and tau_E is the time constant governing the growth of the acoustic energy in the room. The time required for the sound level to drop by 60 dB is defined as the reverberation time T:

    T = [60 / (10 log10 e)] tau_E = 13.8 tau_E = 55.2 V / (A c)

with c = 343 m/s (speed of sound at 20 °C), V the volume of the room and A the total sound absorption of the room.

If the surface area of the room is S, the average Sabine absorptivity ā is defined by ā = A / S, and the reverberation time becomes

    T = 0.161 V / (S ā)

The reverberation time is an important parameter determining the acoustic performance of a room. To predict it for a room with given acoustic properties, one needs the total sound absorption, which depends on the areas and absorptive properties of all the materials within the room. The total sound absorption is the sum of the absorptions A_n of the individual surfaces:

    A = sum_n A_n = sum_n S_n a_n

where a_n is the Sabine absorptivity of the nth surface. The average Sabine absorptivity is therefore

    ā = (1/S) sum_n S_n a_n

Each a_n is evaluated from standardized measurements on a sample of the material in a reverberant chamber.
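The Sabine formula can be applied directly. A minimal sketch, using made-up surface areas and absorption coefficients for a hypothetical 6 m x 5 m x 3 m room:

```python
# Sabine reverberation time for a hypothetical 6 m x 5 m x 3 m room.
# The absorption coefficients below are illustrative values only.
V = 6 * 5 * 3                      # room volume, m^3
surfaces = [                       # (area S_n in m^2, Sabine absorptivity a_n)
    (6 * 5, 0.02),                 # concrete floor
    (6 * 5, 0.60),                 # acoustic-tile ceiling
    (2 * (6 * 3 + 5 * 3), 0.05),   # painted walls
]
A = sum(S_n * a_n for S_n, a_n in surfaces)  # total absorption, m^2 sabins
S = sum(S_n for S_n, _ in surfaces)          # total surface area
a_bar = A / S                                # average Sabine absorptivity
T = 0.161 * V / A                            # Sabine formula, seconds
print(round(T, 2), round(a_bar, 3))
```

With these illustrative coefficients the ceiling dominates the total absorption, which is typical of rooms treated only overhead.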

Because the absorption of each surface depends on frequency, it is necessary to specify the reverberation time at representative frequencies covering the entire range important to speech and music. The frequencies usually chosen are 125, 250, 500, 1000, 2000 and 4000 Hz. Conventionally, when the term reverberation time is used without specifying a frequency, it is generally understood to refer to 500 Hz.

An example shows measurements of reverberation times at three different frequencies: 125, 500 and 2500 Hz. The reverberation time plays a central role in the quantitative formulation of some of the simpler criteria for room acoustic quality.

3D Sound Processing

3D sound generally refers to multichannel sound, as opposed to monophonic or stereo sound. It is best understood in the context of localizing (or spatializing) a particular sound.

Surround Sound

Stereo covers roughly a 120-degree frontal perspective. Although this provides significant realism, we can actually perceive sounds over a much wider perspective, even behind us. Surround-sound and quadraphonic sound systems attempt to reproduce sounds both in front of and behind the listener, approaching a 360-degree sound perspective. Playing back true surround or quadraphonic sound requires at least five speakers; six speakers are needed to take full advantage of 5.1 capabilities.

History of Surround Sound

Dolby introduced Dolby Stereo in the mid 1970s, with three front channels and one mono surround channel; a quad matrix encodes the four channels into two. The format was enhanced to five discrete channels in 1992 (left, center, right, and independent left and right surround channels). A sixth, low frequency enhancement (LFE) channel was introduced to add headroom and prevent the main speakers from overloading at low frequencies; it is limited in bandwidth to 20-120 Hz. The resulting system was named 5.1. It was selected by the ATSC as the standard for HDTV in the US and was also adopted as the standard for DVD-Video.

In 1994, the ITU released a standard document entitled ITU-R BS.775-1, "Multichannel stereophonic sound system with and without accompanying picture". It recommends one universal multichannel stereophonic sound system that is hierarchical and compatible with all broadcasting and recording standards. The hierarchy includes:

- mono;
- mono with one surround channel, played over two loudspeakers (preferably decorrelated);
- two-channel stereo;
- two-channel stereo with one surround channel (split over two loudspeakers);
- two-channel stereo with two surround channels;
- three-channel stereo;
- three-channel stereo with one surround channel (split over two loudspeakers);
- three-channel stereo with two surround channels.

The requirements for each of these systems include:

- control of the directional stability of the frontal sound image over a listening area larger than is possible with two-channel stereo;
- an ambient sensation significantly enhanced over that provided by two-channel stereo;
- downward compatibility with sound systems that have fewer channels;
- upward compatibility with sound systems that have more loudspeakers than available signal channels;
- audio quality after decoding that is subjectively indistinguishable from the reference for most types of program material.

In the 10.2-channel surround system, additional wide, rear-surround and height loudspeakers are added to the standard five channels of the ITU recommendation, and two subwoofer channels are used, one on either side of the listening area.

3D Sound Localization

Human Binaural Perception

Multichannel sound reproduction is fundamentally about the spatial aspects of sound, so this section reviews the relevant aspects of two-eared listening. Cognitive cues include amplitude, fundamental frequency, timbre, envelope pattern, onset disparities, correlated changes, contrast with earlier and later sounds, and spatial location.

Head and Pinna Effects

Humans listen with two ears. Two spaced ears produce a mean arrival-time difference of up to 0.7 ms for sounds at different locations, plus an intensity difference due to head shadow. These basic phenomena are at the root of the mechanisms that allow us to determine the direction of an external sound. Time-delay differences due to path length and diffraction effects are very important factors in the localization of sound sources. The pinna also imposes an important spectral modification that depends on the angle of incidence; this filtering action, combined with head diffraction, encodes direction, which is specified as azimuth and elevation angles from the listener. A significant result of research on sound perception and localization is that the pinnae are extremely important for reliable sound localization. The spatial cues supplied by the pinnae to the brain contribute to what are called Head-Related Transfer Functions (HRTFs).

Azimuth Cues

According to Rayleigh's Duplex Theory, there are two primary cues for azimuth: the Interaural Time Difference (ITD) and the Interaural Level Difference (ILD). Consider a sound wave from a distant source striking a spherical head of radius a from a direction specified by the azimuth angle θ. The sound arrives at the near ear before the far ear, since it must travel the extra distance a·θ + a·sin θ to reach the far ear. Dividing by the speed of sound c, we obtain a simple formula for the interaural time difference:

    ITD = (a / c)(θ + sin θ)

With a = 9 cm and c = 340 m/s, the maximum ITD is about 0.68 ms. The ITD is zero when the source is directly ahead and reaches its maximum of (a/c)(π/2 + 1) when the source is off to one side; this difference in arrival time of about 0.7 ms for a typical human head is easily perceived. Because the incident sound waves are also diffracted by the head, there is, in addition to the time difference, a significant difference between the signal levels at the two ears: the ILD.
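The spherical-head ITD formula is easy to evaluate. A minimal sketch, using the head radius and speed of sound quoted above:

```python
import math

def itd(azimuth_rad, a=0.09, c=340.0):
    """Spherical-head interaural time difference in seconds.
    a = head radius in metres, c = speed of sound in m/s."""
    return (a / c) * (azimuth_rad + math.sin(azimuth_rad))

print(round(itd(0.0) * 1000, 2))          # source dead ahead: 0.0 ms
print(round(itd(math.pi / 2) * 1000, 2))  # source fully to one side: ~0.68 ms
```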

The ILD is highly frequency dependent. At low frequencies, where the wavelength of the sound is long relative to the head diameter, there is hardly any difference in sound pressure at the two ears. At high frequencies, where the wavelength is short, the difference may well be 20 dB or greater. This is the head-shadow effect: the far ear lies in the acoustic shadow of the head.

The Duplex Theory asserts that the ILD and the ITD are complementary. At low frequencies (below about 1.5 kHz) there is little ILD information, but the ITD shifts the waveform by a fraction of a cycle, which is easily detected. At high frequencies (above about 1.5 kHz) the ITD is ambiguous, since the shift spans several cycles, but the ILD resolves this directional ambiguity. Taken together, Rayleigh's Duplex Theory says, the ILD and ITD provide localization information throughout the audible frequency range.

HRTF Model

Figure: an HRTF model for 3D sound synthesis. The input x(t) passes through delay blocks T_L(z), T_R(z) (ITD), head-shadow filters H_L(z), H_R(z) (ILD) and pinna filters P_L(z), P_R(z) to produce the ear signals x_L(t) and x_R(t).
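The HRTF block model can be caricatured in a few lines. The sketch below renders only the ITD (a pure delay) and a crude, frequency-flat head-shadow gain; real HRTF synthesis would use measured filters for the head and pinna blocks. The 6 dB maximum shadow and the function name are assumptions for illustration:

```python
import math
import numpy as np

def spatialize(x, azimuth, fs, a=0.09, c=340.0):
    """Crude azimuth rendering of a mono signal using only an ITD delay
    and a flat ILD gain - a toy stand-in for full HRTF filtering (no
    pinna cues, no frequency dependence). Positive azimuth (radians,
    up to pi/2) places the source toward the right ear."""
    itd = (a / c) * (abs(azimuth) + math.sin(abs(azimuth)))
    delay = int(round(itd * fs))                  # far-ear delay in samples
    shadow = 10 ** (-6.0 * abs(math.sin(azimuth)) / 20)  # assumed 6 dB max
    near = x
    far = shadow * np.concatenate([np.zeros(delay), x[:len(x) - delay]])
    return (far, near) if azimuth > 0 else (near, far)   # (left, right)

fs = 16000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 440 * t)
left, right = spatialize(tone, math.pi / 2, fs)  # source fully to the right
```

For a source fully to the right, the left (far-ear) signal is delayed by about 0.68 ms and attenuated, while the right channel is passed through unchanged.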

Measurements illustrate the frequency-response variation for single-tone point sources at different azimuths, for the near ear, in the horizontal plane. A number of features in the response vary considerably with azimuth; note especially the sharp notches whose positions vary with azimuth and probably provide a significant cue to position (i.e., spectral cues).

Figure: the interaural time delay for a point-source signal at different azimuths (in this diagram, zero azimuth is fully to one side).
Figure: the variation in frequency response for single tones, measured in the ear canal, from straight ahead (azimuth 0°) to directly behind, for the near ear. The responses for the shadowed ear are obviously different again.

Localisation: Temporal Cues

The previous sections reviewed the mechanisms by which amplitude differences provide cues to the location of an external acoustic object. Another important source of information on externalization is the time structure of arriving sounds; the relevant parameters are onsets and offsets (overall envelope and transients) and waveform or envelope synchrony (for frequencies below about 800 Hz). In addition to intensity cues, information arises from time and phase differences between the signals at the two ears. It is an important requirement for a natural-sounding multichannel system that these different mechanisms be exploited in a coordinated way: listener fatigue or confusion rapidly occurs when the location cues are contradictory.

Localisation: Precedence Effects

It is well known that sounds often appear to come from the direction of first arrival, somewhat independently of amplitude. This is entirely reasonable, since most naturally occurring sonic events make the first-arriving sound also the loudest. There is a trade-off between arrival-time difference and loudness effects.

Localisation: Sound-Field Effects (Dynamic Cues)

A normal two-eared listener makes head movements. Small movements can rapidly help confirm a direction hypothesis, but by far the most powerful direction-determining behaviour is to turn and face the direction of the apparent sound. Normally an external sound grabs attention, and the combination of arrival-time and spectral cues sets up an initial listener hypothesis of its location. If the sound continues, the listener can get a very accurate fix by turning the head until both ears receive similar sound: when the source is dead ahead, each ear produces a similar response and the listener faces perpendicular to the wavefront.

Some stereo and pseudo-stereo systems do not achieve good agreement between the first hypothesis and the net wavefront. In particular, some methods of spatial encoding rely on equalisation to fool the pinna and head effects, and may even require the listener to remain fixed, thereby introducing a significantly unreal quality to the percept. Sound-field replay methods instead consider the apparent direction of a source in the absence of a listener; localisation can then be confirmed by moving around or turning the head.

Masking Effects in Two-Eared Listening

Masking between sounds varies with their positions, by approximately 7 dB for a 90-degree difference in azimuth angle.

Multichannel Sound Matrixing

Dolby Stereo Soundtrack Encoding

Dolby Stereo, introduced in the mid 1970s, carries four channels: Left, Center and Right across the front, and a single surround channel (produced by speakers all around the audience in a theater, or by two speakers to the sides of the listener in the home). The Left and Right channels are the carriers of the other two; nothing special is done to these two input stems. They are the basis of what we will call LeftTotal and RightTotal once the center and surround information has been added to them. This arrangement is called a matrix.

The Center channel stem is added equally to Left and Right, reduced by 3 dB so as to maintain constant total power when it is carried by two channels.

The surround channel is likewise input equally to both the Left and Right channels, but each half receives a 90-degree phase shift of opposite sign, so the surround information in Left and Right ends up 180 degrees apart; in other words, totally out of phase. Before being added to the soundtrack, the surround is reduced by 3 dB (for the same reason as the center), band-limited to the frequency range 100 Hz to 7 kHz, and passed through a form of Dolby B-type noise reduction.

Remarks: all sounds are present in the LeftTotal/RightTotal carriers. When played back over a conventional two-channel system, the center channel is reproduced as a phantom image between left and right for listeners seated in the center, and the surround channel takes on a diffuse character owing to its out-of-phase nature.
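The encoding just described can be sketched as follows. Here `shift90` implements the 90-degree phase shift as an FFT-domain Hilbert-style rotation, and the 100 Hz - 7 kHz band-limiting and Dolby B noise reduction of the surround are deliberately omitted:

```python
import numpy as np

def shift90(x, sign=+1):
    """Rotate every positive-frequency component of a real signal by
    -sign * 90 degrees (an FFT-domain Hilbert-transform-style shift)."""
    X = np.fft.rfft(x)
    X[1:] *= sign * -1j          # e^{-j pi/2} on positive frequencies
    return np.fft.irfft(X, n=len(x))

def encode_lt_rt(L, C, R, S):
    """Sketch of Dolby-Stereo-style matrix encoding: centre added to both
    carriers at -3 dB, surround added at -3 dB with opposite 90-degree
    phase shifts so it is 180 degrees out of phase between the carriers."""
    g = 10 ** (-3.0 / 20)        # -3 dB
    Lt = L + g * C + g * shift90(S, +1)
    Rt = R + g * C + g * shift90(S, -1)
    return Lt, Rt

# Surround-only content comes out equal and opposite in the two carriers:
t = np.arange(1024) / 8000.0
Lt, Rt = encode_lt_rt(np.zeros_like(t), np.zeros_like(t),
                      np.zeros_like(t), np.sin(2 * np.pi * 500 * t))
```

Applying opposite-sign rotations to the two carriers yields surround components that are exact negatives of each other, which is what the differential decoder later exploits.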

Dolby Stereo Soundtrack Decoding

The simplest way to decode a matrixed soundtrack such as Dolby Stereo is a purely passive system, as implemented in Dolby Surround. LeftTotal and RightTotal are passed unchanged to the Left and Right outputs. With no center channel output, the system relies instead on a phantom image between the left and right speakers. To extract the surround, the LeftTotal and RightTotal inputs are passed through a differential stage (subtracted from one another).

Crosstalk (leakage from one channel to another): separation between the Left and Right channels is perfect, as they travel on separate carriers. Separation between Center and Surround is likewise near perfect (on the order of 40 dB), since surround information is by nature phase-inverted 180 degrees and cancels itself out of the front left/right outputs. Between any two adjacent channels, however, the perceived separation is at best 3 dB.
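A toy numerical check of the passive decode, showing that a centre-panned signal (equal and in phase in both carriers) cancels completely out of the surround difference:

```python
import numpy as np

def passive_decode(Lt, Rt):
    """Passive matrix decode sketch: the front outputs are the carriers
    themselves, and the surround is the difference Lt - Rt."""
    left, right = Lt, Rt
    surround = Lt - Rt           # in-phase (front) content cancels here
    return left, right, surround

# Centre dialogue encoded at -3 dB into both carriers:
dialog = np.sin(2 * np.pi * 300 * np.arange(800) / 8000)
g = 10 ** (-3.0 / 20)
Lt, Rt = g * dialog, g * dialog
_, _, surround = passive_decode(Lt, Rt)
print(np.max(np.abs(surround)))  # prints 0.0 - dialogue stays out of surround
```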

The surround channel is predominantly an ambience channel, its sounds being omnipresent in nature. To improve the perceived separation, however, the Dolby Surround decoder performs some processing of the surround channel: a 7 kHz low-pass filter, a delay, and noise reduction.

The 7 kHz low pass at the encoder input is applied in anticipation of the 7 kHz low pass in the decoding process. The latter exists because phase-shift anomalies in two-channel delivery systems become more severe with increasing frequency; in other words, there is a greater chance of decoding errors at higher frequencies. Dialogue sibilance is a prime example of high-frequency leakage from front to surround that would be particularly distracting. Band-limiting the surround channel is far from objectionable from a fidelity point of view, since the channel is intended to deliver a diffuse surround effect, not a localizable point source.

The delay circuit capitalizes on the precedence (Haas) effect: if an identical sound comes from two sources but one begins at least 10 ms sooner than the other, the brain accepts the sound as coming only from the first source. This psychoacoustic phenomenon greatly reduces the audibility of front information leaking into the surrounds.

If we were simply and passively to derive the center channel by summing LeftTotal and RightTotal, the result would not be ideal. Consider the classic front-soundstage dilemma: dialogue from the center and music from the left and right. Because the derived center is the sum of left and right, the music is also output from the center channel; likewise, because the center content is present in LeftTotal and RightTotal, it is also output from the left and right speakers. A passive center signal therefore tends to narrow the perceived stereo image considerably.

Dolby Pro Logic: Directional Enhancement

Dolby Pro Logic, introduced in 1987, is an active decoder that takes the two-channel Dolby Surround signal and unfolds it into Left, Center, Right and a limited-bandwidth Surround channel, the last reproduced through the Left Surround and Right Surround speakers. The Pro Logic decoder, a direct descendant of the Dolby Stereo decoder, is best thought of as a passive decoder followed by an active matrix that provides directional enhancement through a constant-power concept.

Gain riding was used in many early matrix systems. In such a system, the decoder detects which signal is dominant, e.g. during dialogue, and reduces the gain of the other channels accordingly. Gain riding fails miserably in a dynamic environment where dialogue is starting and stopping: the gains of the left and right channels bounce up and down in response, producing a distracting "pumping" of the sound.

More successful is the concept of signal cancelling. When the center is dominant (the dialogue), the decoder takes the Left signal, phase-inverts it 180 degrees, and adds it to the Right output; the inverted center-channel component from the left cancels the center-channel component in the right, eliminating it from the Right output. The reciprocal is performed on the Left output. Unlike gain riding, this maintains the overall level of the soundtrack.

At the heart of Pro Logic is signal dominance detection. As a single channel becomes dominant, the remaining sounds are redistributed under cover of signal masking, keeping the overall level of the soundtrack constant while providing appropriate directional enhancement where and when it is needed. The Pro Logic decoder thus has two modes of operation, passive and active, based on the presence of a dominant signal: it is passive until a dominant signal is detected, at which time it goes active and applies directional enhancement. Pro Logic's active matrix examines the relative level and phase of its inputs and sends this information to VCAs (voltage-controlled amplifiers), which control the level of the antiphase signals (the "inverted copies") applied to each channel. This arrangement has come to be known as "feed forward", in that the enhancements are applied looking forward down the decode line.

Dolby Surround Pro Logic II: The Next Level

Dolby Pro Logic II, introduced in 2000, is an advanced matrix decoder that derives five-channel surround (Left, Center, Right, Left Surround and Right Surround) from any stereo program material, whether or not it has been specifically Dolby Surround encoded. On encoded material such as movie soundtracks, the result is more like Dolby Digital 5.1, while on unencoded stereo material such as music CDs the effect is a wider, more involving soundfield. Among other improvements over Pro Logic, Pro Logic II provides two full-range surround channels, as opposed to Pro Logic's single, limited-bandwidth surround channel. Dolby Pro Logic II works by detecting the directional cues present in stereo audio and using them to create five-channel surround sound. The corresponding encoding matrix is

    LT = L + (1/√2) C − j √(19/25) RL − j √(6/25) RR
    RT = R + (1/√2) C + j √(6/25) RL + j √(19/25) RR

where j denotes a 90-degree phase shift and RL, RR are the left and right surround (rear) channels.

Later extensions: Pro Logic IIx expands 5.1 to 6.1 or 7.1, and Pro Logic IIz expands 5.1 to 7.1 or 9.1 with front height speakers.

Dolby Pro Logic II is an improved, more intelligent matrix decoder. It can play back non-encoded material, such as conventional two-channel music, over five channels, creating a wider sound field. The most significant technical difference from Pro Logic is that Pro Logic II incorporates a "feed back" design: the directional enhancements are applied back to the input of the active matrix.

To illustrate how Pro Logic II improves upon Pro Logic, consider a sound placed by the mixing artist to image at "half right", between the center speaker and the right speaker. The sound is therefore greater in RightTotal than in LeftTotal. Because of this inequality, a passive decoder cannot remove all of this front signal from the surround output. To achieve full cancellation of the unwanted signal, Pro Logic II servos (with negative feedback) the inputs to make them equal: VCAs on each input to the surround decoding stage share a common control signal that maintains equal input levels to the surround stage. In this half-right example, the RightTotal signal going to the surround decoder is lowered a little while the LeftTotal is raised a little, so that the two cancel each other perfectly at the surround decode stage. These same "balanced" signals also feed the center decode stage, but there they are added rather than subtracted to yield the center signal.
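The servo idea can be illustrated numerically. The gain law below (equalize the RMS of the two carriers while keeping the product of the gains at unity) is an assumption for illustration only, not Dolby's actual control law:

```python
import numpy as np

def servo_gains(Lt, Rt, eps=1e-12):
    """Toy surround-servo gain computation: scale the two carriers so
    their RMS levels match before the difference stage, keeping the
    product of the two gains at 1 (illustrative law, not Dolby's)."""
    rms_l = np.sqrt(np.mean(Lt ** 2)) + eps
    rms_r = np.sqrt(np.mean(Rt ** 2)) + eps
    r = np.sqrt(rms_r / rms_l)
    return r, 1.0 / r            # gains applied to Lt and Rt respectively

s = np.sin(2 * np.pi * 500 * np.arange(800) / 8000)
Lt, Rt = 0.38 * s, 0.92 * s      # source imaged at roughly half right
gl, gr = servo_gains(Lt, Rt)
surround = gl * Lt - gr * Rt     # balanced difference: front sound cancels
center = gl * Lt + gr * Rt       # same balanced signals, summed
```

With the unbalanced passive difference (Lt - Rt) the half-right sound would leak into the surround; after balancing, the residue in `surround` is essentially zero.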

3D Sound Processing Multichannel Sound Matrixing Dolby Surround Pro Logic II

Because it controls the Left and Right feeds, this servo is known as the Left-Right axis. A similar servo operates independently on the Center-Surround axis. The net result is that Pro Logic II responds faster and is more accurate than its predecessor. Where a Pro Logic decoder sends a single surround signal to two speakers (in other words, mono), Pro Logic II decodes independent left and right surround outputs (in other words, stereo). Returning to signal dominance: the Pro Logic decoder has four cardinal vectors along which it detects signal dominance: Left, Right, Center, and Surround. Pro Logic II simply adds two more cardinal directions on which to detect dominance, and derives its two surround outputs from these.

3D Sound Processing Multichannel Sound Matrixing Detecting Soundtrack Dominance

Since it is the relative level of one sound to another which determines the perception of separation, it is desirable to have sensing circuits that ignore absolute signal level in favor of being responsive to the difference in level between two signals (the equivalent of taking their ratio). Electrically, this is no simple task, but by taking the logarithm of each signal and subtracting one from the other, we obtain a measure of relative dominance:

log(A_L / A_R) = log A_L − log A_R

The resultant control voltage in this logarithmic form closely follows the way loudness is perceived. Consequently, the final process provides separation enhancement directly corresponding with the amount needed to prevent crosstalk from becoming audible, and proportional to the ability of the dominant sound to mask the spatial redistribution of the non-dominant signals.

Sensing Direction of Dominance

Knowing which signal is dominant includes knowing the encoded position, or angle, of the signal. It is in this direction that enhancement must take place, and it may encompass any point in a 360-degree circle, not just one of the four cardinal positions.
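The log-ratio above can be verified numerically. This sketch scales the result by 20 to express the dominance in dB; the key property is that it depends only on the ratio of the two amplitudes, not on their absolute level.

```python
import math

def dominance_db(a_l, a_r):
    """Level-independent dominance, per the equation above: the difference
    of the logs equals the log of the ratio. Scaled by 20 to give dB."""
    return 20.0 * (math.log10(a_l) - math.log10(a_r))
```

Halving or doubling both inputs together leaves the dominance unchanged, which is exactly the behaviour the sensing circuit needs.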

3D Sound Processing Multichannel Sound Matrixing Sensing Direction of Dominance

By definition, dominance can only occur in one place at any instant in time; it cannot exist in two places simultaneously, since equality of magnitude would mean that neither is dominant. (Two such signals may together constitute a single dominant quantity, however.) Therefore, it is sufficient to detect a single direction of soundfield dominance, no matter how rapidly the soundfield changes. With two independent, orthogonal signal pairs available in the encoded soundtrack (the left/right pair and the center/surround pair), it is possible to identify any point on an X-Y coordinate plane within a given boundary area.

[Figure: typical separation map, showing roughly 30 dB of separation between adjacent outputs L, C, R, and S.]
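The two orthogonal dominance signals can be combined into a single steering angle. This is only an illustrative sketch: the axis convention (0 degrees = center, positive angles toward left, 180 = surround) is an assumption, not the decoder's documented coordinate system.

```python
import math

def dominance_direction(lr_db, cs_db):
    """Treat the left/right and center/surround dominance values (in dB)
    as X-Y coordinates and recover one direction on the separation map.
    Convention (assumed): 0 = center, +90 = left, -90 = right, 180 = surround."""
    return math.degrees(math.atan2(lr_db, cs_db))
```

A purely center-dominant soundfield maps to 0 degrees, a purely left-dominant one to 90, and a surround-dominant one to 180, covering the full circle rather than just the four cardinal points.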

3D Sound Processing Multichannel Sound Matrixing Dolby Surround Pro Logic II

In addition to improved technology at the decode level, Pro Logic II incorporates additional features over Pro Logic: Bass management, made ubiquitous by the proliferation of Dolby Digital, is now a required part of the package. While Pro Logic II offers a "Pro Logic" mode which includes the band-limiting of the surrounds, in its native mode the surrounds are full range and are independent Left/Right outputs. An optional "Width" control may be incorporated by the manufacturer (the receiver or processor manufacturer). An optional "Dimension" control may be incorporated by the manufacturer. An optional "Panorama" control may be incorporated by the manufacturer.

Multichannel Sound Matrixing 3D Sound Processing Dolby Surround Pro Logic II

Pro Logic II offers two different decoding modes: Music and Movie. There are three optional parameters: Dimension, Panorama, and Width; the range of adjustments may differ between surround processors and receivers from different manufacturers.

Dimension adjusts the front-to-rear balance in the room. It has 7 possible positions; the default is 0, which leaves the balance equal. A setting of +3 shifts the balance toward the front of the room and -3 toward the back.

Panorama wraps the sound field around you. It has two options, On or Off.

Width is a center-spread adjustment. The default value is 3. A setting of 0 sends most of the information to the center channel, while a setting of 7 is full left and right only, providing a phantom center image.

In addition to the optional items above, Music mode has the following characteristics: no delay is added to the surround channels by default (adjustable from 0 ms to 15 ms), and Autobalance is off by default. Autobalance tends to steer vocals toward the center channel, which is not always desired in music because a vocalist can be placed off center on purpose.

The only option available in Movie mode is Pro Logic Retro, Off by default. When turned on, the surrounds are once again mono and bandwidth-limited to 7 kHz; this mode is included for backwards compatibility. In addition, Movie mode adds a 10 ms delay to the surround channels by default (adjustable from 10 ms to 25 ms; the original Pro Logic has a default delay of 15 ms), and Autobalance is on by default.
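The two sets of mode defaults above can be captured as a small configuration record. The class and field names here are illustrative only, not any actual Dolby API; the values are the defaults listed in the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PLIIModeDefaults:
    """Pro Logic II decoding-mode defaults as listed above (hypothetical
    naming, for illustration only)."""
    surround_delay_ms: int      # default delay added to the surrounds
    delay_range_ms: tuple       # adjustable delay range (min, max)
    autobalance: bool           # steer vocals toward the center channel?

MUSIC = PLIIModeDefaults(surround_delay_ms=0, delay_range_ms=(0, 15),
                         autobalance=False)
MOVIE = PLIIModeDefaults(surround_delay_ms=10, delay_range_ms=(10, 25),
                         autobalance=True)
```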

Guide to Home Theater Speaker Placement 3D Sound Processing

The shape of your room and how it's furnished will affect the sound you hear. For instance, too many bare surfaces can cause reflections that add harshness to the sound; adding carpeting and drapes can help. If you have a choice of rooms, avoid ones that are perfectly square or have one dimension exactly twice another: these rooms can aggravate resonances that color the sound. If possible, center your seating area between the surround speakers. The closer you place a speaker to intersecting room surfaces (corners, wall and ceiling, wall and floor), the stronger the bass output. This can help bass-shy speakers, but it can also add too much bass. Again, just moving a speaker a few inches can often make a big difference in sound.

There are many suggestions and recommendations, but most mixers have their favorite positioning. The closest thing to a standard is ITU Recommendation 775. This was empirically developed by the BBC, primarily for playback of orchestral material, and isn't necessarily right for all playback material; nor is it always possible to place the speakers where they're supposed to go. For placement according to the ITU spec, first place the speakers all on the same plane. Then, with the center speaker directly in front, position the Left and Right speakers 30 degrees to either side of center (a 60-degree total spread), aimed at a spot 3 to 6 inches behind the mixer's (or listener's) head. This spread can be reduced to 45 degrees or extended to as much as 90 degrees and still provide satisfactory results. The surround speakers should be positioned about 110 degrees off center, which puts them to the sides and somewhat behind the listener. This is not only what often happens in typical homes, but has proved to be a good way to achieve a desirable front-to-back soundfield. If the surrounds are too far to the rear, the listener is lost somewhere between two separate soundfields, rather than wrapped inside one cohesive soundfield.
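The ITU angles described above are easy to turn into physical positions on the listening circle. A short sketch, assuming the listener at the origin and an arbitrary example radius of 2 m:

```python
import math

# Nominal azimuths from the text (degrees, positive clockwise from
# front center as seen from above).
ITU_ANGLES = {"C": 0, "L": -30, "R": 30, "LS": -110, "RS": 110}

def speaker_xy(angle_deg, radius_m=2.0):
    """Place a speaker on the listening circle: +y points at the center
    speaker, +x to the listener's right. The 2 m radius is just an example."""
    a = math.radians(angle_deg)
    return (radius_m * math.sin(a), radius_m * math.cos(a))
```

Computing the surround positions shows why 110 degrees puts the speakers "to the sides and somewhat behind" the listener: the y coordinate goes slightly negative.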

Guide to Home Theater Speaker Placement 3D Sound Processing

Instead of placing the rear speakers at 110 degrees or so, many mixers prefer to place them closer to a square with the front speakers. Sometimes the rear speakers can't be placed ideally because of a door, couch, rack, etc. in the way. One thing to remember during placement: you either get great imaging or great envelopment, but not both. Rear speakers placed closer to 90 degrees provide better envelopment but poorer imaging; rear speakers placed closer to a square array with the front speakers provide better imaging but not as good envelopment. The ITU standard provides the best compromise.

The goal of all speaker placement for movie soundtrack playback and multichannel music reproduction is a smooth, consistent, and unbroken soundstage across the front, coupled with a sense of envelopment in the ambient surround effects. In Dolby Digital 5.1 soundtracks, most movie dialog is hard-mixed to the center channel; as you experiment with center speaker placement, the dialog should be one with the actors on the screen. It shouldn't seem detached from the screen; if it does, your center channel is too far from the screen. Assuming you are centered in the middle of your couch, facing your TV display and center-channel speaker at 0 degrees, your left and right main front speakers should be within a 22- to 30-degree angle to each side, viewed from your seat. The main left and right surrounds should be to the respective sides of the listening area, above ear level if possible, at an angle of 90 to 110 degrees from front center. The following diagram gives some suggestions for corner setups; this setup applies to Dolby Pro Logic II playback.

Guide to Home Theater Speaker Placement 3D Sound Processing Subwoofers

Since deep bass (80 Hz and below) is non-directional, the subwoofer can go just about anywhere on the floor, but corners will give you the greatest enhancement of deep bass, at the risk of it sounding boomy. Moving a subwoofer or a floor-standing full-range speaker away from any intersecting room boundary will reduce the tendency to boom or to have too much bass. In either case, you will have to experiment to achieve smooth and extended deep bass in your preferred listening location. Bass output will vary in different spots in the room as a function of the room's dimensions, so aim for good bass extension in preferred seating locations.

THX (Tomlinson Holman eXperiment) is a set of technologies from Lucasfilm first developed for the cinema and subsequently for the home. In the theater, THX standardizes the sonic environment by stipulating not only the acoustics required but the playback equipment as well. In the home, both electronic and speaker strategies are employed so that the program material more closely matches that of the dubbing stage. Essentially, it was meant as a universal playback standard so movie theaters would all have pretty much the same playback response. It is not necessary to employ THX certification unless you're mixing a motion picture or your clients demand it, however.

Guide to Home Theater Speaker Placement 3D Sound Processing Room Equalization

In small room volumes (under 6,000 cu ft), several problems exhibit themselves no matter how flat or accurate the speaker design. The first is the existence of room modes, or standing waves. These modes are a function of each room's shape and dimensions, and cause uneven frequency response with peaks and dips in the low frequency range. Since these standing waves occur at fixed locations for fixed frequencies within a room, there is no way to avoid them completely. However, since the audience for a home theater remains generally in one specific area of a room, prudent equalization can usually level these peaks and troughs and produce smoother response in the listening area. Proper positioning of the subwoofer can do much to minimize these standing-wave artifacts, but even with careful placement, bass equalization is usually necessary to restore a flat and accurate bass response.

The second problem which room equalization is designed to correct is that of speaker/room boundary interactions. These boundary interactions are the same ones that make a loudspeaker appear to have more bass when placed in a corner versus the center of a room. As a result, placing the screen LCR loudspeakers asymmetrically in a room can have serious repercussions: speakers placed nearer to the room's boundaries will have a different tonal balance from those placed more centrally. The result of unequal tonal balance is that sounds can vary dramatically as they are panned across the three front channels, and dialogue from boundary-close loudspeakers may have poor intelligibility. In both circumstances, with room modes or with room boundary problems, a properly set up room equalizer can restore the accurate spectral and inter-channel tonal balance of a home theatre system.
Remember that these effects will vary from room to room and installation to installation.
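The standing-wave frequencies discussed above follow from the classic rectangular-room mode formula, f = (c/2)·sqrt((nx/Lx)² + (ny/Ly)² + (nz/Lz)²). A minimal sketch for listing the low-frequency modes of a given room:

```python
import itertools
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C

def room_modes(lx, ly, lz, max_order=2):
    """Return (frequency_hz, (nx, ny, nz)) pairs for a rectangular room,
    sorted by frequency. Axial, tangential, and oblique modes up to the
    given order are included."""
    modes = []
    for nx, ny, nz in itertools.product(range(max_order + 1), repeat=3):
        if (nx, ny, nz) == (0, 0, 0):
            continue  # the zero mode is not a standing wave
        f = (SPEED_OF_SOUND / 2) * math.sqrt(
            (nx / lx) ** 2 + (ny / ly) ** 2 + (nz / lz) ** 2)
        modes.append((f, (nx, ny, nz)))
    return sorted(modes)
```

For a 5 m x 4 m x 3 m room the lowest mode is the axial mode along the 5 m dimension at about 34.3 Hz; clustered or widely separated modes in such a listing are a good first hint of where peaks and dips will appear.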

Guide to Home Theater Speaker Placement Room Equalization 3D Sound Processing The Home THX Room Equalizer

The Home THX Room Equalizer meets the exacting specifications of the Lucasfilm Home THX Audio program. It is specifically designed to have the wide dynamic range, low noise, and low distortion required by the demands of motion picture soundtracks. Careful attention was also paid to musical transparency.

Guide to Home Theater Speaker Placement 3D Sound Processing THX Room Equalization Set-Up

Set up the Home THX Audio system. Aim the L, C, and R loudspeakers using pink noise (available on the "WOW!" calibration disc), calibrate individual channel levels with the controller's internal test signals, and set up the microphone positions. Then analyze and equalize:
A) Disconnect the Subwoofer, Left, and Right channels.
B) Play pink noise through the Center channel at 85 dB SPL.
C) Measure multiple locations, average the readings, and equalize the Center channel.
D) Repeat C until the measurements are consistent.
Repeat operations A through D for the Left, Right, and Subwoofer channels.

Why pink noise? White noise is a random signal with equal energy per frequency (per hertz); pink noise is a random signal with equal energy per octave. White noise sounds very bright, while pink noise, with its equal energy per octave, closely reflects our psychoacoustic expectation of flat response. Because of this perception of flat tonal balance, pink noise is a very useful tool when using a spectrum analyzer with 1/3-octave or octave measurement intervals, and when comparing loudspeakers for spectral similarity by ear.
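The white-versus-pink distinction above is easy to check numerically. This sketch integrates idealized power spectral densities over adjacent octave bands; the midpoint-rule integrator is a toy stand-in for an analyzer's band filters.

```python
def band_energy(psd, f_lo, f_hi, steps=100000):
    """Numerically integrate a power spectral density over a frequency
    band (midpoint rule)."""
    df = (f_hi - f_lo) / steps
    return sum(psd(f_lo + (i + 0.5) * df) for i in range(steps)) * df

def white_psd(f):
    return 1.0      # equal power per hertz

def pink_psd(f):
    return 1.0 / f  # power falls 3 dB per octave: equal power per octave
```

Comparing the 100-200 Hz and 200-400 Hz bands, the white-noise energy doubles from one octave to the next (hence the bright sound), while the pink-noise energy is the same in both, which is why pink noise reads flat on an octave-band analyzer.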

Guide to Home Theater Speaker Placement 3D Sound Processing Sound Response Correction: Dynamic Equalization

Some advanced audio equalizers correct errors introduced by the listening environment by seeking to ameliorate reflections from the walls, floor, and furniture in a room, which cause uneven frequency response problems like boomy bass, imprecise imaging, and poor soundstaging. Reflections are a time-varying effect: a distortion due to a reflection only occurs some time after the main signal, because the reflection needs time to bounce off a wall and reach the listener, and a reflection does not travel forever, because it loses energy each time it encounters a wall, furniture, or the listener. A conventional equalizer therefore cannot correct for reflections: it applies its correction before the reflection arrives at the listener, and is still applying it well after the reflection has passed. For a reflection, a conventional equalizer will be correct at only one instant in time: when the reflection arrives at the listener. At all other times, the equalizer is coloring the sound you hear.

Digital Room Correction (DRC) is a process in which digital filters designed to ameliorate the unfavorable effects of a room's acoustics are applied to the input of a sound reproduction system. Configuring a DRC system begins with measuring the impulse response of the room at the listening location for each of the loudspeakers. DSP software is then used to compute an FIR filter which reverses the effects of the room and of linear distortion in the loudspeakers. Finally, the calculated filter is applied in real time to cancel out the room errors, leaving only the audio signal.
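The DRC idea can be illustrated on a toy impulse response. This is only a sketch under strong assumptions: the room response is a short minimum-phase FIR (direct sound plus one attenuated reflection), and the inverse is built by recursive deconvolution; real DRC systems work from measured responses and much longer, carefully regularized filters.

```python
def invert_ir(h, length=64):
    """Build an FIR filter g that inverts a short impulse response h
    (h[0] must be nonzero), so that conv(h, g) approximates a unit impulse.
    Each tap cancels the residual left by the previous ones."""
    g = [0.0] * length
    g[0] = 1.0 / h[0]
    for n in range(1, length):
        acc = sum(h[k] * g[n - k] for k in range(1, min(n, len(h) - 1) + 1))
        g[n] = -acc / h[0]
    return g

def conv(a, b):
    """Plain direct convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out
```

With h = [1.0, 0.4] modelling a direct sound plus one reflection at 40% amplitude, convolving h with invert_ir(h) yields approximately a unit impulse: the correction filter pre-cancels the reflection at exactly the time it would arrive, which is what a static frequency-domain equalizer cannot do.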