ELEC 484: Final Project Report Developing an Artificial Reverberation System for a Virtual Sound Stage

Sondra K. Moyls
V00213653
Professor: Peter Driessen
Wednesday, August 7, 2013

Table of Contents

1.0 Abstract
2.0 Virtual Room Design and Applications
    2.1 Motivation
    2.2 Design Concept
3.0 Introduction to Artificial Reverb
    3.1 Reverberation Basics
    3.2 Schroeder and Early Artificial Reverberation
    3.3 Moorer's Algorithm
4.0 Building a Reverberator in Max/MSP
    4.1 Working with Max/MSP
    4.2 Tapped Delay Line
    4.3 Parallel IIR Comb Filters
    4.4 All-pass Filter
    4.5 Using the Program
5.0 Conclusion and Future Development
6.0 Max/MSP Patch Files
7.0 References

1.0 Abstract

This report discusses the implementation of an artificial reverb system using Max/MSP and Moorer's artificial reverb algorithm, intended as a basis for the future development of a virtual sound stage for sound layering in film and television. The implementation at the time of writing allows a user to load (and loop) up to three audio files, manipulate the size and reverberance of the virtual space these files occupy, vary the distance at which each sound is placed from the listener, and adjust the gain of each individual sound as well as the overall mix.

2.0 Virtual Room Design and Applications

2.1 Motivation

Sound in film and television, particularly music and foley, is often added to scenes after they have been shot and edited. To create a soundtrack for a given scene, many things need to be considered, including the size and reverberance of the setting, the objects in the room, the sounds of the actors' clothing, and the emotional weight of objects and events in the scene. While many virtual sound stages exist to simulate the placement of musicians in a concert hall, or to simulate sound in virtual reality and video game systems, none seem to have been designed for applying sound effects to a motion picture. A sound stage designed with film in mind would make it possible to place sounds physically within the scene, as if each sound originated from its source on screen.

2.2 Design Concept

By creating a virtual room around the setting of the scene, sounds could be layered to create a sense of depth and a more realistic response to the setting. Figure 1 below illustrates how sound sources are layered within a film, and Figure 2 outlines how this could be applied to a scene from the 2013 film Pacific Rim:

Figure 1. The sound depth of a film. The sounds in the foreground are the most direct. Adjusting the background sounds adds depth and realism to the scene.

Figure 2. A scene from Pacific Rim (2013 Warner Bros. Pictures). The main dialogue between Mako (Rinko Kikuchi) and Raleigh (Charlie Hunnam) is in the foreground. The voices and footsteps of the crowd behind them add depth to the sound field. The area they are walking through is a large, reverberant metal space, and the reverberance of the background sounds adds to the size and weight of the location.

The central factor in developing a virtual sound stage is the design of a user-controlled reverb system. This system provides the framework for adjusting the distance of the sound from the viewer/user, the sound's overall amplitude, and information about the physical space. In the example above, the scene is in a very reverberant setting. The dialogue between the main actors is the most direct, whereas the footsteps and voices of the crowd behind them are loud and booming. A sound stage would allow the user to take sound effects that have been close-miked and layer them into the scene, creating more realistic audible cues to the size of the space and adding to the intensity of the moment. Before I discuss my implementation of a simple sound stage, the following section introduces the concepts behind artificial reverberation up to Moorer's model, which is implemented in the final design.

3.0 Introduction to Artificial Reverb

3.1 Reverberation Basics

Reverberation refers to the audible effects of an acoustic space on a sound. A musical performance in a concert hall and a musical performance in a cafe can sound drastically different due to the size of the space and the physical materials the room is constructed from. The direct sound arrives at the listener first, at time T0, followed by the early reflections of the sound off nearby surfaces. The late reverberation arrives last and is characterized by its decay time: the time it takes these reflections to attenuate by 60 dB in level, denoted T60 or RT60. Together, these properties make up the impulse response of a space.

Figure 3. Model of a Generic Room Impulse Response

The decay time (or reverberation time) can be approximated mathematically, as was first proposed by Sabine in 1900 [3]. The reverberation time is proportional to the volume of the room and inversely proportional to the sound absorption of the space, where A is the total absorption (in m² sabins) and V is the volume of the room in m³:

$$T_{60} \approx \frac{0.161\,V}{A}$$
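As a rough numerical illustration of Sabine's relation, the sketch below estimates the reverberation time of a hypothetical room. The room dimensions and the average absorption coefficient are assumptions chosen only for the example, and the constant 0.161 assumes metric units.

```python
def sabine_rt60(volume_m3, absorption_m2):
    """Estimate the reverberation time in seconds with Sabine's formula.

    volume_m3     -- room volume V in cubic metres
    absorption_m2 -- total absorption A in square metres (metric sabins)
    """
    if volume_m3 <= 0 or absorption_m2 <= 0:
        raise ValueError("volume and absorption must be positive")
    return 0.161 * volume_m3 / absorption_m2


# Hypothetical room (an assumption): 10 m x 8 m x 3 m,
# with an average absorption coefficient of 0.2 on every surface.
volume = 10 * 8 * 3
surface = 2 * (10 * 8 + 10 * 3 + 8 * 3)
absorption = 0.2 * surface

rt60 = sabine_rt60(volume, absorption)
print(f"Estimated RT60: {rt60:.2f} s")
```

Doubling the absorption halves the estimate, which matches the inverse proportionality described above.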

Manfred Schroeder later expanded on this model of calculating the reverberation time by integrating the squared impulse response of the room, h(t), to get the energy decay curve (EDC), which is defined by:

$$\mathrm{EDC}(t) = \int_{t}^{\infty} h^{2}(\tau)\,d\tau$$

The EDC gives the energy remaining in the impulse response at time t [1]. A room's effect is also defined by its normal modes of vibration, the frequencies that are naturally amplified by a given space. Again, these can be calculated mathematically, and they affect the impulse response of an acoustic space.

3.2 Schroeder and Early Artificial Reverberation

The physical properties of a real room can be very complex, as seen above. The need to approximate acoustic spaces other than those available for a performance has existed since the 1920s, when reverberation chambers were used for music broadcasting and recording [3]. The first digital reverb algorithms were proposed in the early 1960s by Manfred Schroeder at Bell Labs, who introduced the use of recursive comb filters and delay-based all-pass filters to simulate reverberation echoes. One of his designs using these components can be seen in Figure 4 below.
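The energy decay curve described earlier in this section can be sketched in discrete time as a backward running sum of the squared impulse response. The example below is only an illustration: the impulse response is a synthetic exponential decay (an assumption, not a measured room), built so that its level falls by 60 dB after one second.

```python
import math


def energy_decay_curve(h):
    """Backward integration: edc[n] is the sum of h[k]**2 for all k >= n."""
    edc = [0.0] * len(h)
    running = 0.0
    for n in range(len(h) - 1, -1, -1):
        running += h[n] * h[n]
        edc[n] = running
    return edc


def edc_db(edc):
    """Normalise the curve to its total energy and convert to decibels."""
    total = edc[0]
    return [10.0 * math.log10(e / total) if e > 0 else float("-inf")
            for e in edc]


# Synthetic impulse response (an assumption): amplitude drops by a factor
# of 1000 (60 dB) per second, sampled at a low, arbitrary rate.
fs = 1000
h = [10 ** (-3.0 * n / fs) for n in range(2 * fs)]

decay_db = edc_db(energy_decay_curve(h))
print(f"EDC after 1 s: {decay_db[fs]:.2f} dB")
```

Reading off the time at which the curve crosses -60 dB gives an estimate of T60 for the response.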

Figure 4. One of Schroeder's reverberators. This one uses four parallel comb filters followed by two all-pass filters.

The input signal would also be fed forward to the output to simulate the direct sound. By replacing the unit delay in the transfer function A(z) of an all-pass filter with a longer delay line (Figure 5), the all-pass filters provide a dense impulse response and a flat frequency response [4]. The comb filters provide some independent control over the delay of the reverberated sound, as well as the decay rate [3]. Together, these components simulate the effect of wall reflections. In both the comb filters and the all-pass filters, the magnitude of the gain g must be less than 1 for the system to be stable.

Figure 5. All-pass filter with delay line.
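The two building blocks described above can be sketched as simple difference equations. This is a minimal illustration rather than the exact structure in Figures 4 and 5: it assumes the comb filter form y[n] = x[n-M] + g*y[n-M] and the all-pass form y[n] = -g*x[n] + x[n-M] + g*y[n-M], and the delay and gain values are arbitrary.

```python
def feedback_comb(x, delay, g):
    """Recursive comb filter: y[n] = x[n - delay] + g * y[n - delay]."""
    if abs(g) >= 1:
        raise ValueError("|g| must be less than 1 for stability")
    y = [0.0] * len(x)
    for n in range(delay, len(x)):
        y[n] = x[n - delay] + g * y[n - delay]
    return y


def allpass(x, delay, g):
    """Delay-based all-pass: y[n] = -g*x[n] + x[n - delay] + g*y[n - delay]."""
    if abs(g) >= 1:
        raise ValueError("|g| must be less than 1 for stability")
    y = [0.0] * len(x)
    for n in range(len(x)):
        delayed_x = x[n - delay] if n >= delay else 0.0
        delayed_y = y[n - delay] if n >= delay else 0.0
        y[n] = -g * x[n] + delayed_x + g * delayed_y
    return y


# Impulse responses for an arbitrary delay of 5 samples and g = 0.7.
impulse = [1.0] + [0.0] * 399
comb_ir = feedback_comb(impulse, 5, 0.7)
allpass_ir = allpass(impulse, 5, 0.7)

# The comb echoes decay geometrically; the all-pass keeps unit energy,
# which reflects its flat magnitude response.
allpass_energy = sum(v * v for v in allpass_ir)
print(comb_ir[5], comb_ir[10], round(allpass_energy, 6))
```

Each comb echo is g times the previous one, so a gain with magnitude of 1 or more would never decay, which is why both functions reject it.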

3.3 Moorer's Algorithm

James Moorer built on the artificial reverberation system proposed by Schroeder by adding a tapped delay line to simulate a room's early reflections. The output of the delay line is fed through six parallel comb filters followed by an all-pass filter, and the result is delayed so that the last of the early reflections reaches the output before the late reverberation generated by the filters.

Figure 6. Moorer's Reverberator

Each of the comb filters contains a low-pass filter in the feedback loop, which helps smear the echoes of short impulsive sounds and contributes to a smoother decay [2]. The all-pass filter, the same as that proposed by Schroeder, creates a dense impulse response while maintaining a flat frequency response; however, transient sounds are more heavily affected by the phase response of the filter, which can cause a metallic ringing in the output. This will be explored further later.

4.0 Building a Reverberator in Max/MSP

4.1 Working with Max/MSP

Prior to this project, I had only a basic working knowledge of Max/MSP. I chose it primarily for its adaptability and its support for real-time adjustments. This proved extremely useful in fine-tuning values for the delays and gains, as well as in identifying which components of the reverberator were producing which parts of the sound. I made a conscious effort not to look at existing reverb patches in Max/MSP, so that they would not influence the design process. Two notable patches that exist in Max are yafr (Yet Another Free Reverb) and Freeverb, designed by Olaf Matthes, which provides a stereo implementation of the Schroeder/Moorer model; however, the external needed for this patch is no longer available for download. Moorer's algorithm was used to keep this implementation simple, and to further explore the development of a completely artificial reverb system. Max 6.0.7 was used to implement this project.

4.2 Tapped Delay Line

The first part of the reverberation system implements an 11-tap delay line. Experimenting with the number of taps, M, showed that 7 or more provided a good simulation of early reflections, although using more taps made the difference between larger and smaller room settings more audible. The delay lengths were chosen between 4 and 80 ms, with gains between 0.13 and 0.85, values modeled after the idealized geometric simulation of Boston Symphony Hall proposed by Moorer [3]. Table 1 outlines the values used.

TAP   TIME (ms)   GAIN
0     0           1
1     4.3         0.841
2     21.5        0.504
3     22.5        0.491
4     26.8        0.379
5     27.0        0.38
6     29.8        0.346
7     45.8        0.289
8     48.5        0.272
9     57.2        0.12
10    61.2        0.167
11    70.7        0.135

Table 1. Delay times and gains of the tapped delay line used to simulate early reflections.

The tapped delay line not only provides the early reflections, but also allows the user to control the distance of the sound from the listener. A scale multiplier multiplies the gain of each tap by a constant between 1 and 1.9, depending on the position of the distance slider. A value of 1.9 is used as the maximum to prevent any of the gain values from exceeding 1, which could cause the system to become unstable. As the distance slider moves the sound farther away, the gain of the early reflections increases while the gain of the direct signal decreases, so the early reflections become slightly more prominent relative to the direct sound. The implementation of the tapped delay line patch can be seen below (for a larger image, see the included image file FIR_delay_line_11.jpeg).
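For readers without Max/MSP, the tapped delay line can be sketched in a few lines of Python using the values from Table 1. This is an illustration under stated assumptions, not a transcription of the patch: the function name and sample rate are hypothetical, and attenuating the direct sound by 1/scale is a guess at how the distance slider reduces the direct signal.

```python
# (time in ms, gain) for taps 1 to 11, copied from Table 1.
TAPS = [(4.3, 0.841), (21.5, 0.504), (22.5, 0.491), (26.8, 0.379),
        (27.0, 0.38), (29.8, 0.346), (45.8, 0.289), (48.5, 0.272),
        (57.2, 0.12), (61.2, 0.167), (70.7, 0.135)]


def early_reflections(x, fs=44100, scale=1.0):
    """FIR tapped delay line: direct sound plus 11 delayed, scaled copies.

    scale -- distance multiplier in [1, 1.9] applied to every reflection.
    """
    if not 1.0 <= scale <= 1.9:
        raise ValueError("scale must lie between 1 and 1.9")
    delays = [round(t_ms * fs / 1000) for t_ms, _ in TAPS]
    y = [0.0] * (len(x) + delays[-1])
    direct_gain = 1.0 / scale  # assumption: direct sound drops as scale rises
    for n, sample in enumerate(x):
        y[n] += direct_gain * sample                # tap 0, the direct sound
        for delay, (_, gain) in zip(delays, TAPS):
            y[n + delay] += scale * gain * sample   # early reflections
    return y


# Response to a single impulse, close to the listener and far away.
near = early_reflections([1.0], fs=10000, scale=1.0)
far = early_reflections([1.0], fs=10000, scale=1.9)
print(near[0], near[43], far[0], far[215])
```

With the larger scale value, the reflections carry more of the output energy relative to the direct sound, which is the cue used here to suggest distance.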

Figure 7. FIR Tapped Delay Line with 11 taps (for a larger image see FIR_delay_line_11, or view in the Max patch)

4.3 Parallel IIR Comb Filters

The lowpass feedback comb filter proposed by Schroeder was also used by Moorer in his reverb implementation, and is pictured in Figure 8 below:

Figure 8. Lowpass Feedback IIR filter

The addition of the lowpass filter contributes to a smooth decay, but also allows control over how reverberant the room is. As the gain coefficient d is increased, the room sounds more wet. As it is decreased, the room sounds more dry. The values of d have been limited to between 0.02 and 0.3, as values less than 0.02 do not produce a noticeable difference, and values above 0.3 can create an unwanted buzzing in the output signal. As f varies from 0.1 to 0.9, the overall size of the room increases. In this way, a high value for f and a high value for d will create a more cathedral-like sound, and a low value for f and d will create a sound akin to a small room with very little reverberance.

While Moorer suggests independent gain values for each of the six parallel comb filters, I have not implemented the filters with different gains here. Each filter changes in the same way when the sliders for f and d are manipulated; however, as the value of f increases, the number of samples N being delayed in the comb filter also increases. In this way, as the room becomes bigger, the delay affects more of the signal, producing more late reflections. Values of N between 70 and 7500 samples provided the smoothest result. For values greater than 5000, the reverberator becomes very metallic and unnatural sounding. The Max/MSP implementation of the comb filter can be seen below in Figure 9:

Figure 9. Implementation of Lowpass Feedback Comb Filter
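A lowpass feedback comb filter of this kind can be sketched as follows. The exact wiring of the patch in Figure 9 is not reproduced here; this version assumes a one-pole lowpass with unity DC gain, (1 - d) / (1 - d z^-1), inside the feedback loop, with f as the feedback gain, d as the lowpass coefficient, and N as the delay in samples.

```python
def lowpass_feedback_comb(x, n_delay, f, d):
    """Comb filter whose feedback path passes through a one-pole lowpass.

    lp[n] = (1 - d) * y[n - N] + d * lp[n - 1]
    y[n]  = x[n - N] + f * lp[n]
    """
    if not 0.0 <= d < 1.0:
        raise ValueError("d must lie in [0, 1)")
    if abs(f) >= 1.0:
        raise ValueError("|f| must be less than 1 for stability")
    y = [0.0] * len(x)
    lp = 0.0  # state of the lowpass filter in the feedback loop
    for n in range(n_delay, len(x)):
        lp = (1.0 - d) * y[n - n_delay] + d * lp
        y[n] = x[n - n_delay] + f * lp
    return y


# Impulse response with arbitrary example settings (N = 10 samples).
impulse = [1.0] + [0.0] * 99
damped = lowpass_feedback_comb(impulse, 10, f=0.8, d=0.2)
plain = lowpass_feedback_comb(impulse, 10, f=0.8, d=0.0)

# With d > 0 each echo is smeared into the following samples;
# with d = 0 the filter reduces to an ordinary feedback comb.
print(damped[20], damped[21], plain[20], plain[21])
```

The smearing of each echo into its neighbours is what softens the repetition of short, sharp sounds and gives the smoother decay described above.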

4.4 All-pass Filter

The final component of the Moorer reverb algorithm is an all-pass comb filter. This filter has a delay length of 6 ms and a gain coefficient of 0.7, as suggested by Moorer [2]. The Max/MSP implementation is pictured below.

Figure 10. All-pass filter

4.5 Using the Program

The components discussed above have been combined into a Max patch called p_myreverb. The final.maxpat file provides a rough interface for the user to load up to three audio files and manipulate them simultaneously within a given room, as specified by the values of the Room and Damping sliders. Each sound can be placed at a specified distance with its own individual gain value. The sounds can also be looped. The interface does not display exact values, but allows the user to approximate them with the sliders. The output can also be recorded to be applied to a scene or composition.
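To show how the pieces fit together, the sketch below chains a tapped delay line, six parallel lowpass feedback comb filters, and a 6 ms all-pass with gain 0.7. It is a simplified stand-in for the patch, not a port of it: the sample rate, the comb delay times in COMB_MS, the default f and d values, the averaging of the comb outputs, and the offset applied to the late reverberation are all assumptions made for illustration.

```python
FS = 8000  # low sample rate (an assumption) to keep pure-Python loops fast

# (time in ms, gain) for taps 1 to 11, copied from Table 1.
TAPS_MS = [(4.3, 0.841), (21.5, 0.504), (22.5, 0.491), (26.8, 0.379),
           (27.0, 0.38), (29.8, 0.346), (45.8, 0.289), (48.5, 0.272),
           (57.2, 0.12), (61.2, 0.167), (70.7, 0.135)]
COMB_MS = [50, 56, 61, 68, 72, 78]  # illustrative delays, not from the patch


def ms(t):
    """Convert milliseconds to a whole number of samples."""
    return round(t * FS / 1000)


def tapped_delay(x):
    y = list(x)  # tap 0: the direct sound with gain 1
    for t, g in TAPS_MS:
        d = ms(t)
        for n in range(d, len(x)):
            y[n] += g * x[n - d]
    return y


def lp_comb(x, n_delay, f, d):
    y, lp = [0.0] * len(x), 0.0
    for n in range(n_delay, len(x)):
        lp = (1.0 - d) * y[n - n_delay] + d * lp  # lowpass in feedback loop
        y[n] = x[n - n_delay] + f * lp
    return y


def allpass(x, n_delay, g):
    y = [0.0] * len(x)
    for n in range(len(x)):
        xd = x[n - n_delay] if n >= n_delay else 0.0
        yd = y[n - n_delay] if n >= n_delay else 0.0
        y[n] = -g * x[n] + xd + g * yd
    return y


def moorer_reverb(x, f=0.7, d=0.2, tail_ms=500):
    x = list(x) + [0.0] * ms(tail_ms)           # leave room for the tail
    early = tapped_delay(x)
    combs = [lp_comb(early, ms(t), f, d) for t in COMB_MS]
    late = [sum(s) / len(combs) for s in zip(*combs)]
    late = allpass(late, ms(6), 0.7)
    offset = ms(TAPS_MS[-1][0])                 # late part follows last tap
    out = list(early)
    for n in range(len(late) - offset):
        out[n + offset] += late[n]
    return out


out = moorer_reverb([1.0])
print(len(out), out[0], max(abs(v) for v in out))
```

Raising f lengthens the tail and raising d darkens it, loosely mirroring the roles of the Room and Damping sliders.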

4.6 Observations

Overall, the reverberator works well for layering sounds in a scene. Some examples are provided in the output files output_phone_footsteps.aif and bathroom_reverb_test.aif. The former combines the wooden-stairs-2.wav, telephone-ring-4.wav, and bathroom-faucet-1.wav input sounds, which are included in the folder sample_sounds. The room was set to a medium size with medium reverberance. The phone ringing was placed in the background, the running water in the mid-ground, and the footsteps in the foreground. The second output file uses the bathroom-faucet-1.wav and brushing-teeth-1.wav sound files, with the room set to be large and very wet to simulate a bathroom scene.

Notable problems occur on sounds with transient content, as can be heard in the output file sneeze_large_rev_room.aif. These sounds tend to sound more metallic, especially in the reverb tail when the room is set to be large. The phone clip, footsteps, stairs, and laughing sounds (which don't contain transient information) sound relatively natural when put through the reverb system; however, the sneezing sound, as well as plosive and fricative sounds in speech, can ring in a similar way. This is largely due to the nature of the all-pass filter, which audibly affects the phase of these highly transient sounds. The same problem is described by Moorer: "If there are any clicks or pops in the sound being reverberated, a short all-pass will surround each click with a puff consisting of its own impulse response, sounding not unlike a very quiet crash cymbal" [3]. I experimented with adding a lowpass filter after the all-pass to try to remove more of the puffing metallic sounds, but to do so I had to set the lowpass cutoff at a point where information in the signal was also cut out, causing the direct sound to become more distant and muddy.

Figures 11 and 12 below show the response to a higher, more transient impulse (generated from a click track in Audacity), followed by the response to a hammer sound effect. The gradation in the more transient click sound is very noticeable compared to the less transient hammer.

Figure 11. Impulse response with a click in a small dry room, a medium semi-reverberant room, and a wet reverberant room, respectively

Figure 12. Impulse response with a hammer sound in a small dry room, a medium semi-reverberant room, and a wet reverberant room, respectively

5.0 Conclusion and Future Development

Moorer's reverberator algorithm lends itself well to a simple reverb system. Its components are easy to understand and manipulate, and it can be fine-tuned to sound very good on certain sounds. For a very basic sound stage, it provides enough control to position a sound in a given space, with some proximity to the listener. The downsides are that the algorithm itself does not provide measurable numbers in relation to real room acoustics, and that it relies mostly on the ear to find numerical values that work (although they can be modeled from real impulse responses as well). Further development of this concept may be better realized with a different artificial reverberation algorithm, or by convolving the input signal with recorded reverberation tails.

This implementation is also very limited in terms of how the user can move a sound in space. A full implementation would provide a proper visual interface for placing the sounds, and would also allow sounds to be placed from left to right, expanding the stage into a stereo image. The use of ray tracing methods would allow objects to be placed in the virtual space, further affecting how the sound is reflected. For the sound stage to be practical for syncing with visuals, sounds would also need to be placed on a timeline to correspond to the events on screen; however, this first implementation does provide a quick means of modifying sounds in relation to one another in real time, without needing to open multiple plugins.

6.0 Max/MSP Patch Files

6.1 Audio Files Provided

All test files can be found in the folder sample_sounds. With the exception of Toms_diner.wav (which is from the DAFX website) and impulse.wav (which was made in Audacity), all the sound files are from http://www.soundjay.com/index.html.

6.2 Output Files

All pre-generated output files can be found in the folder output.

output_phone_footsteps.aif - room size and damping set to mid values. Phone ringing placed most distant from the user (telephone-ring-4.wav). Footsteps coming down the stairs in the foreground (wooden-stairs-2.wav). Water running in the middle ground (bathroom-faucet.wav).

bathroom_reverb_test.aif - room size is set to be large and very reverberant to model a bathroom. Water running in the background (bathroom-faucet.wav), and someone brushing their teeth in the foreground (brushing-teeth-1.wav).

sneeze_large_rev_room.aif - room is set a notch below the largest and most reverberant. Only the sneezing-1.wav sound is used.

6.3 Max/MSP Files

final.maxpatch - contains the my_reverb patch, which contains the implemented delay line, all-pass filter, and comb filters

7.0 References

[1] Frenette, Jasmin. Reducing Artificial Reverberation Requirements Using Time-Variant Feedback Delay Networks. University of Miami, December 2000. http://www.music.miami.edu/programs/mue/mue2003/research/jfrenette/index.html

[2] Moorer, James A. About this Reverberation Business. Computer Music Journal, Vol. 3, No. 2, pp. 13-28, June 1979.

[3] Välimäki, Vesa et al. Fifty Years of Artificial Reverberation. IEEE Transactions on Audio, Speech, and Language Processing, Vol. 20, No. 5, pp. 1421-1448.

[4] Zölzer, Udo. DAFX: Digital Audio Effects, 2nd Edition. John Wiley & Sons Ltd, 2012.

Audio files from:

Sound Jay: http://www.soundjay.com/index.html

DAFX Textbook Web Page, MATLAB files: http://www2.hsu-hh.de/ant/dafx2002/DAFX_Book_Page_2nd_edition/matlab.html