Signal Processing, Acoustics, and Psychoacoustics for High Quality Desktop Audio


MS. No. JVIS Revised

Chris Kyriakakis, Tomlinson Holman*, Jong-Soong Lim, Hai Hong, and Hartmut Neven

Integrated Media Systems Center, University of Southern California

* Also with TMH Corporation, 3375 S. Hoover Str., Suite J, Los Angeles, CA

Contact: Chris Kyriakakis, Integrated Media Systems Center, Immersive Audio Laboratory, University of Southern California, 3740 McClintock Ave., EEB 432, Los Angeles, CA. Tel: (213) Fax: (213)

Abstract

Integrated media workstations are increasingly being used for creating, editing, and monitoring sound that is associated with video or computer-generated images. While the requirements for high quality reproduction in large-scale systems are well understood, these have not yet been adequately translated to the workstation environment. In this paper we discuss several factors that pertain to high quality sound reproduction at the desktop, including acoustical and psychoacoustical considerations, signal processing requirements, and the importance of dynamically adapting the reproduced sound as the listener's head moves. We present a desktop audio system that incorporates several novel design requirements and integrates vision-based listener-tracking for accurate spatial sound reproduction. We conclude with a discussion of the role the pinnae play in immersive (3-D) audio reproduction and present a method of pinna classification that allows users to select a set of parameters that closely match their individual listening characteristics.

List of Symbols

None

1. INTRODUCTION

Numerous applications are currently envisioned for integrated media workstations. The principal function of such systems is to manipulate, edit, and display still images, video, and computer animation and graphics. The necessity, however, to accurately monitor the sound associated with visual images created and edited in the desktop environment has only recently been recognized. This is largely due to the increased use of digital audio workstations that have benefited from rapid advances both in main CPU computational power and in special-purpose DSPs. Many sound editing operations that could previously only be performed in calibrated (and very costly) dubbing stages are now routinely performed on digital audio workstations.

In addition to accurate reproduction of the measurable characteristics of sound (e.g., frequency response and dynamic range), multichannel and emerging 3-D audio program material also requires accurate spatial perception of sound in order to create a seamless aural environment and achieve sound localization relative to visual images. For such material, a mismatch between the aurally-perceived and visually-observed positions of a particular sound causes a cognitive dissonance that can seriously limit the desired suspension of disbelief [1].

Applications for high-quality desktop audio include professional sound editing for film and television, immersive telepresence, augmented and virtual reality, distance learning, and home entertainment. Such a wide variety of applications has led to an equally wide variety of interrelated, and at times conflicting, system requirements that arise from fundamental physical limitations as well as current technological drawbacks [2]. For example, while there have been advances in sound recording and reproduction technologies, as well as in the understanding of human sound perception mechanisms, these have not yet been combined in such a way as to achieve accurate synthesis of fully 3-D auditory scenes. Furthermore, many acoustical and psychoacoustical issues that pertain to sound reproduction in large rooms have not yet been correctly translated to the desktop environment.

In this paper we examine several key issues in the implementation of high quality desktop-based audio systems. Such issues include the optimization of the frequency response over a given frequency range, the dynamic range, and stereo imaging, subject to constraints imposed by room acoustics and human listening characteristics. Several problems that are particular to the desktop environment will be discussed, including the frequency response anomalies that arise due to the local acoustical environment, the proximity of the listener to the loudspeakers, the acoustics associated with small rooms, and the location and orientation of the listener's head relative to the loudspeakers. We will address these issues from three complementary perspectives: identification of limitations that affect the performance of desktop audio systems; evaluation of the current status of desktop audio system development with respect to such limits; and delineation of technological considerations that impact present and future system design and development.

2. LIMITATIONS OF DESKTOP AUDIO SYSTEMS

There are two classes of limitations that impede the implementation of seamless audio reproduction. The first class encompasses limitations imposed by physical laws, and its understanding is essential for determining the feasibility of a particular technology with respect to the absolute physical limits. Many such fundamental limitations are not directly dependent on the choice of systems, but instead pertain to the actual process of sound propagation and attenuation in irregularly-shaped rooms.
For example, in order to recreate a remote environment for a listener it is necessary to encode the acoustical characteristics of the remote location during the recording and then decode those characteristics locally in the user's environment. Furthermore, the influence of the local acoustic environment on the perception of spatial attributes such as direction and distance, as well as on colorations that arise from anomalies in the frequency response, must be taken into account. The situation is further complicated by the fact that the decoding process includes the physiological signal processing performed by the human hearing mechanisms. This processing translates level and time differences and direction-dependent frequency response effects caused by the pinna, head, and torso into sound localization cues through a set of amplitude and phase transformations known as the head-related transfer functions (HRTFs).

The second class of limitations contains a number of constraints that arise purely from technological considerations. These technological constraints are equally useful in understanding the potential applications of a given system and are imposed by the particular technology chosen for system implementation. For example, there are two choices for delivering sound in a desktop environment. The first is based on headphones that are capable of reproducing signals to each ear individually. While in certain applications this method can be very effective because it eliminates crosstalk, it suffers from three main drawbacks: (1) there are large errors in sound position perception associated with headphones, especially for the most important visual direction, out in front; (2) it is very difficult to externalize sounds and avoid the inside-the-head sensation; and (3) headphones are uncomfortable for extended periods of time [3, 4]. The second choice is loudspeaker-based reproduction, on which we focus our discussion in this paper.

3. REQUIREMENTS FOR HIGH QUALITY SOUND

A significant amount of work in the area of high quality sound production and reproduction in large rooms has originated from the film industry. A well-defined set of standards has been developed for sound monitoring conditions in dubbing stages to ensure the transparent reproduction of program material in theaters.
Such standards include loudspeaker positioning for multichannel monitoring, loudspeaker frequency response and directivity requirements, precise sound pressure level calibration, control of room acoustics parameters (such as reverberation time and discrete reflections), and background noise levels. Meeting these standards ensures that material produced in one professional dubbing stage can be monitored under identical conditions in another dubbing stage or in a movie theater. The design challenge in desktop audio systems is to successfully map these standards onto the desktop environment through appropriate acoustical and psychoacoustical scaling and system design.

3.1 Acoustical Considerations

In a typical desktop sound monitoring environment, delivery of stereophonic sound is achieved through two loudspeakers that are typically placed on either side of a video or computer monitor. This environment, combined with the acoustical problems of small rooms, causes severe problems that contribute to audible distortion of the reproduced sound [5]. While an experienced professional can identify and correct for such problems during the monitoring stage, any changes made are permanently recorded and appear as errors during playback in a different environment.

Among these problems the one most often neglected is the effect of discrete early reflections. The effects of such reflections on sound quality have been studied extensively [5-8] and it has been shown that they are the dominant source of monitoring non-uniformities when all the other standards discussed above have been met. These non-uniformities appear in the form of colorations (frequency response anomalies) in rooms with an early reflection level that exceeds -15 dB spectrum level relative to the direct sound for the first 15 ms [9, 10] (Figs. 1, 2). Such a high level of reflected sound gives rise to comb filtering in the frequency domain that in turn causes noticeable changes in timbre. The perceived effects of such distortions were not quantified until psychoacoustic experiments [6, 11] demonstrated their importance.
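The comb filtering caused by a single early reflection can be illustrated with a short calculation. The sketch below is not taken from the paper: it assumes an idealized model in which one reflection, attenuated to a given level relative to the direct sound and delayed by a fixed time, simply adds to the direct sound. The function names and the 1 ms example delay are hypothetical.

```python
import math

def reflection_gain(level_db):
    """Linear amplitude of a reflection given its level relative to the direct sound."""
    return 10.0 ** (level_db / 20.0)

def comb_magnitude_db(freq_hz, delay_s, level_db):
    """Magnitude in dB of direct sound plus one delayed reflection: |1 + g*exp(-j*2*pi*f*tau)|."""
    g = reflection_gain(level_db)
    phase = 2.0 * math.pi * freq_hz * delay_s
    real = 1.0 + g * math.cos(phase)
    imag = -g * math.sin(phase)
    return 20.0 * math.log10(math.hypot(real, imag))

def comb_ripple_db(level_db):
    """Peak-to-dip ripple in dB: peaks reach 1 + g, dips fall to 1 - g."""
    g = reflection_gain(level_db)
    return 20.0 * math.log10((1.0 + g) / (1.0 - g))

def first_notch_hz(delay_s):
    """The first dip occurs where the reflection arrives half a period late."""
    return 1.0 / (2.0 * delay_s)

if __name__ == "__main__":
    delay = 1.0e-3  # assumed 1 ms reflection delay (about 34 cm of extra path)
    print("first notch at %.0f Hz" % first_notch_hz(delay))
    print("ripple for a -15 dB reflection: %.2f dB" % comb_ripple_db(-15.0))
```

Under this simplified model a reflection at -15 dB produces a ripple of roughly 3 dB, with peaks and dips repeating at intervals of the reciprocal of the delay. Real rooms contain many reflections with frequency-dependent levels, so this should be read only as an order-of-magnitude illustration.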

A potential solution that alleviates the problems of early reflections in small rooms is near-field monitoring. In theory, the direct sound is dominant when the listener is very close to the loudspeakers, thus reducing the room effects to below audibility. In practice, however, there are several issues that must be addressed in order to provide high quality sound. One such issue relates to the large reflecting surfaces that are typically present near the loudspeakers. Strong reflections from a console or a video/computer monitor act as baffle extensions for the loudspeaker, resulting in a boost of mid-bass frequencies. Furthermore, even if it were possible to place the loudspeakers far away from large reflecting surfaces, this would only solve the problem for middle and high frequencies. Low frequency room modes do not depend on surfaces in the local acoustical environment, but rather on the physical size of the room. These modes produce standing waves that give rise to large variations in frequency response (Fig. 3).

Finally, another factor that has a negative effect on the quality of reproduced sound relates to the physical size of the loudspeakers. Typical two-way designs in which the tweeter is physically separated from the woofer exhibit strong radiation pattern changes in the crossover frequency range. Amplitude and phase matching in this frequency range becomes critical, and as a result such speakers are extremely sensitive to placement and typically produce a flat frequency response for direct sound in one exact position. This limitation makes typical two-way speakers unsuitable for near-field monitoring.

The current state-of-the-art in desktop reproduction systems is rather poor both for low-cost and for high-cost near-field monitors (Fig. 3). As can be clearly seen from the measured frequency response, there are large deviations from flat response that arise from a combination of loudspeaker design and acoustical environment drawbacks.
The sound reproduced by such systems does not meet the standards required for professional applications and does a very poor job of translating the experience of a large theater or dubbing stage to the desktop. Furthermore, such distortions in the reproduced sound can obscure problems present in the original recording that only become apparent in the finished product.

3.2 Design Requirements

In order to address the problems described above, a set of solutions has been developed for single-listener desktop reproduction that delivers sound quality equivalent to that of a calibrated dubbing stage [5]. These solutions include:

Direct-path dominant design. By combining elements of psychoacoustics in the system design, it is possible to place the listener in a direct sound field that is dominant over the reflected and reverberant sound. The colorations that arise due to such effects are eliminated, and this results in a listening experience that is dramatically different from what is achievable through traditional near-field monitoring methods. The design considerations for this direct-path dominant design include the effect of the video/computer monitor that extends the loudspeaker baffle, as well as the large reflecting surface on which the computer keyboard typically rests.

Correct low-frequency response. There are severe problems in the uniformity of low-frequency response that arise from the standing waves associated with the acoustics of small rooms. Such anomalies can give rise to variations as large as ±15 dB for different listening locations in a typical room. The advantage of desktop audio systems lies in the fact that the position of the loudspeakers and, to a large extent, the listener are known a priori. It is, therefore, possible to use equalization to produce very smooth low-frequency response. One fundamental limitation imposed by small room acoustics is that this can only be achieved for a relatively small volume of space centered around the listener. One possible solution to this problem can be found by tracking the listener's position and adjusting the equalization dynamically. An early version of such a system is described in a later section of this paper.
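To give a sense of where the standing waves discussed above fall in frequency, the following sketch lists the low-frequency modes of a room. It is an illustration rather than part of the described system, and it assumes an idealized rectangular room with rigid walls, which real rooms only approximate. The room dimensions in the example are hypothetical.

```python
import math
from itertools import product

def room_modes(lx, ly, lz, f_max, c=343.0):
    """Mode frequencies (Hz) of an ideal rigid-walled rectangular room up to f_max.

    Uses f = (c / 2) * sqrt((nx/lx)^2 + (ny/ly)^2 + (nz/lz)^2).
    Returns a sorted list of (frequency, (nx, ny, nz)) tuples.
    """
    # Largest mode index that can possibly lie below f_max along the longest wall.
    n_max = int(2.0 * f_max * max(lx, ly, lz) / c) + 1
    modes = []
    for nx, ny, nz in product(range(n_max + 1), repeat=3):
        if nx == 0 and ny == 0 and nz == 0:
            continue  # (0, 0, 0) is not a mode
        f = 0.5 * c * math.sqrt((nx / lx) ** 2 + (ny / ly) ** 2 + (nz / lz) ** 2)
        if f <= f_max:
            modes.append((f, (nx, ny, nz)))
    return sorted(modes)

if __name__ == "__main__":
    # Hypothetical small room: 4.0 m x 3.0 m x 2.5 m.
    for freq, index in room_modes(4.0, 3.0, 2.5, 100.0):
        print("%6.1f Hz  mode %s" % (freq, index))
```

In a room of this size only a handful of widely spaced modes lie below 100 Hz, which is consistent with the large position-dependent level variations described above and with the need to equalize for a known listener position.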

3.3 Equalization Requirements

The desktop environment presents an unusual set of requirements for equalization as compared to sound systems designed for larger venues. Conventional processing methods based on 1/3-octave-band, constant-Q equalization that are derived from the critical band theory of hearing are not applicable in this case. The basic assertion of critical band theory states that frequency components that lie very close to each other are perceived differently from those components that are further apart. In a conventional listening environment in which the listener typically sits farther away from the loudspeakers (and thus perceives more of the reverberant field), it is possible to equalize using conventional methods. In a desktop listening environment, however, this theory breaks down because the direct field is dominant and the effects of the room are only present at low frequencies. The number of standing waves per 1/3-octave necessary for a diffuse sound field is high enough only above a certain frequency (called the Schroeder frequency). Below this frequency, standing waves can give rise to level variations that can cause frequency components that lie very close to each other to be reproduced at very different levels. This violates the conditions necessary for equalization based on critical band theory and instead necessitates the use of parametric equalizers with filters that can be precisely tuned in center frequency and bandwidth.

One advantage provided by desktop sound systems over their large room counterparts arises from the fact that standing waves are formed very rapidly as compared to the buildup time observed in large rooms. In a large room the equalized steady-state sound combined with non-equalized reflected (transient) sound can give rise to a worse overall sound quality.
In the desktop environment the time difference of arrival between the direct and transient sound is so small that equalization of the steady-state sound is perceived as optimal for the combined sound as well.
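The two quantities discussed in this section, the Schroeder frequency and a parametric filter tuned in center frequency and bandwidth, can be sketched as follows. This is an illustrative assumption rather than the equalizer used by the authors. The Schroeder frequency uses the common approximation 2000·sqrt(T60/V), and the filter is a second-order peaking section in the widely used "Audio EQ Cookbook" form. The room volume, reverberation time, and filter settings are hypothetical.

```python
import cmath
import math

def schroeder_frequency(rt60_s, volume_m3):
    """Common approximation of the Schroeder frequency in Hz: 2000 * sqrt(T60 / V)."""
    return 2000.0 * math.sqrt(rt60_s / volume_m3)

def peaking_biquad(f0_hz, gain_db, q, fs_hz):
    """Second-order peaking (parametric) filter, coefficients normalized so a[0] == 1."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp]
    a = [1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp]
    return [x / a[0] for x in b], [x / a[0] for x in a]

def response_db(b, a, freq_hz, fs_hz):
    """Magnitude response in dB of the biquad at a single frequency."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / fs_hz)  # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(num / den))

if __name__ == "__main__":
    # Hypothetical room: 40 cubic meters, 0.4 s reverberation time.
    print("Schroeder frequency: %.0f Hz" % schroeder_frequency(0.4, 40.0))
    # Hypothetical correction: narrow 9 dB cut centered on a 60 Hz room mode.
    b, a = peaking_biquad(60.0, -9.0, 8.0, 48000.0)
    for f in (30.0, 55.0, 60.0, 65.0, 120.0):
        print("%6.1f Hz: %+.2f dB" % (f, response_db(b, a, f, 48000.0)))
```

A narrow filter of this kind changes the level at the mode frequency while leaving neighboring components almost untouched, which a fixed 1/3-octave band cannot do.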

4. LISTENER LOCATION CONSIDERATIONS

In large rooms multichannel sound systems are used to convey sound images that are primarily confined to the horizontal plane and are uniformly distributed over the audience area. Typical systems used for cinema reproduction use three front channels (left, center, right), two surround channels (left and right surround), and a separate low-frequency channel. Such 5.1 channel systems are designed to provide accurate sound localization relative to visual images in front of the listener and diffuse (ambient) sound to the sides and behind the listener. The use of a center loudspeaker helps create a solid sound image between the left and right loudspeakers and anchors the sound to the center of the stage.

For desktop applications, in which a single user is located in front of a CRT display, we no longer have the luxury of a center loudspeaker because that position is occupied by the display. In such cases sound is reproduced mainly through the use of two loudspeakers placed symmetrically on either side of the CRT and two surround loudspeakers placed to the side and above the listening position. Size limitations prevent the front loudspeakers from being capable of reproducing the entire spectrum, thus a separate subwoofer loudspeaker is used to reproduce the low frequencies. The two front loudspeakers can create a virtual (phantom) image that appears to originate from the exact center of the display provided that the listener is seated symmetrically with respect to the loudspeakers.

With proper head and loudspeaker placement, it is possible to recreate a spatially-accurate soundfield with the correct frequency response in one exact position, the sweet spot. However, even in this static case, the sound originating from each loudspeaker arrives at each ear at different times (about 200 µs apart), thereby giving rise to acoustic crosstalk.
These time differences combined with reflection and diffraction effects caused by the head lead to frequency response anomalies that are perceived as a lack of clarity [12].
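The order of magnitude of this crosstalk delay can be checked with simple geometry. The sketch below is an illustration under stated assumptions: the ears are modeled as two points 17.5 cm apart, sound travels along straight lines (diffraction around the head is ignored), and the example loudspeaker position is hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def crosstalk_delay(speaker_xy, ear_spacing=0.175, head_xy=(0.0, 0.0)):
    """Extra travel time (s) from one loudspeaker to the far ear versus the near ear.

    Coordinates are in meters in the horizontal plane; the ears lie on the
    x axis through the head center. Straight-line paths are assumed.
    """
    hx, hy = head_xy
    sx, sy = speaker_xy
    d_left = math.hypot(sx - (hx - ear_spacing / 2.0), sy - hy)
    d_right = math.hypot(sx - (hx + ear_spacing / 2.0), sy - hy)
    return abs(d_left - d_right) / SPEED_OF_SOUND

if __name__ == "__main__":
    # Hypothetical layout: left loudspeaker 22.5 cm off the center line
    # and 50 cm from the center of the head.
    depth = math.sqrt(0.50 ** 2 - 0.225 ** 2)
    delay = crosstalk_delay((-0.225, depth))
    print("crosstalk delay: %.0f microseconds" % (delay * 1e6))
```

For a layout of this kind the straight-line estimate comes out at a few hundred microseconds, the same order as the value quoted above.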

This problem can be solved by adding a crosstalk cancellation filter to the signal of each loudspeaker. The idea is to design a filter that generates a signal out of phase and delayed by the amount of time it takes the sound to reach the opposite ear. This signal combines with the in-phase signal from the opposite loudspeaker to create a cancellation of the undesired crosstalk. This method was initially introduced by Schroeder and Atal [13] and later refined by Cooper and Bauck [14], who coined the term transaural audio. While this solution may be satisfactory for the static case, as soon as the listener moves even slightly, the conditions for cancellation are no longer met and the phantom image moves towards the closest loudspeaker because of the precedence effect. In order, therefore, to achieve the highest possible quality of sound for a non-stationary listener and preserve the spatial information in the original material, it is necessary to know the precise location of the listener relative to the loudspeakers. In the section below we describe an experimental system that incorporates a novel listener-tracking method in order to overcome the difficulties associated with two-ear listening, as well as the technological limitations imposed by loudspeaker-based desktop audio systems.

4.1 Vision-Based Listener Tracking

Computer vision has historically been considered problematic, particularly for tasks that require object recognition. Up to now the complexity of vision-based approaches has prevented them from being incorporated into desktop-based integrated media systems. Recently, however, the Laboratory of Computational and Biological Vision at USC has developed a vision architecture that is capable of recognizing the identity, spatial position (pose), facial expression, gestures, and movement of a human subject in real time.
This highly versatile architecture integrates a broad variety of visual cues in order to identify the location and orientation of a listener's head within the image.

The first step in determining head position involves finding the listener's silhouette. This is accomplished by first performing motion detection based on difference images under the assumption that the cameras are fixed in space. A conventional stereo algorithm is then used to detect pixel disparity within the regions of the image that are moving. The tracking accuracy of this algorithm is typically higher than that of traditional stereo algorithms because we confine our search only to those regions that are moving. Once the disparities for changing pixels have been determined, a disparity histogram is used to detect disparity intervals that are characterized by strong image motion. This histogram represents the number of changing pixels as a function of their disparity (Fig. 4a). A moving person typically gives rise to a local maximum in this representation. We then construct a binary silhouette image by activating the pixels that correspond to a local maximum in the disparity histogram. A silhouette image is thus generated for every section of the image that is moving.

Following the silhouette detection process we use two additional detection processes to find the location of the head within the silhouette. The first uses a look-up table to check for colors corresponding to skin tones and the second identifies regions of the silhouette that are convex (Fig. 4b). The binary outputs from both detectors are clustered and bounding boxes are computed for each cluster whose size is likely to correspond to the size of the head at the distance where the associated silhouette image was detected. The center of the head position is computed from the center of the bounding box and the disparity associated with the silhouette image based on a simple pinhole camera model. The estimates of discrete head positions are then converted to trajectories.
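The main steps of this pipeline can be sketched in a few lines. The code below is a simplified illustration, not the authors' implementation: frames and disparity maps are plain lists of rows, the silhouette is built from the single strongest histogram peak rather than from every local maximum, the skin-tone and convexity detectors are omitted, and the last function forms trajectories by nearest-neighbour association as one simple possibility. All function names, thresholds, and camera parameters are hypothetical.

```python
import math

def changed_pixels(prev_frame, curr_frame, threshold=20):
    """Motion detection by differencing two grey-level frames (lists of rows)."""
    return [(r, c)
            for r, row in enumerate(curr_frame)
            for c, value in enumerate(row)
            if abs(value - prev_frame[r][c]) > threshold]

def disparity_histogram(pixels, disparity_map):
    """Number of changing pixels as a function of their integer disparity."""
    hist = {}
    for r, c in pixels:
        d = disparity_map[r][c]
        hist[d] = hist.get(d, 0) + 1
    return hist

def silhouette(pixels, disparity_map):
    """Binary mask of changing pixels whose disparity equals the histogram peak."""
    hist = disparity_histogram(pixels, disparity_map)
    peak = max(hist, key=hist.get)
    mask = [[0] * len(row) for row in disparity_map]
    for r, c in pixels:
        if disparity_map[r][c] == peak:
            mask[r][c] = 1
    return mask, peak

def head_position(box_center, disparity_px, focal_px, baseline_m, principal_point):
    """Pinhole model: depth z = f * B / d, then back-project the box center."""
    u, v = box_center
    cx, cy = principal_point
    z = focal_px * baseline_m / disparity_px
    return ((u - cx) * z / focal_px, (v - cy) * z / focal_px, z)

def link(estimate, previous_heads, max_jump_m=0.15):
    """Index of the closest head in the previous frame, or None for a new head."""
    best_index, best_dist = None, max_jump_m
    for i, prev in enumerate(previous_heads):
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(estimate, prev)))
        if dist <= best_dist:
            best_index, best_dist = i, dist
    return best_index
```

Restricting the disparity search to `changed_pixels` is what keeps the stereo step cheap. The returned 3-D position is expressed in camera coordinates and would still have to be mapped to loudspeaker coordinates.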
In order to make this algorithm practical for desktop audio applications, it is necessary to account for periods of time during which the listener's head does not move. The algorithm first performs a thinning that assigns a single representative position estimate to closely-spaced estimates. This representative estimate is then checked to see if it belongs to an existing trajectory. Under the assumption of spatio-temporal continuity, for every position estimate in frame M, the algorithm finds the closest head position determined for the previous frame (M - 1) and connects it to the current estimate. If no estimate can be found that is sufficiently close, then it is assumed that a new head appeared in the image.

While there are several alternative methods for tracking humans (e.g., magnetic, ultrasound, infrared, laser), they are typically based on tethered operations or require artificial fiducials to be worn by the user. Furthermore, these methods do not offer any additional functionality to match what can be achieved with vision-based methods (e.g., face and expression recognition, ear classification). In the following section we describe a novel desktop audio system that we have developed that meets all of the design requirements and acoustical considerations described above and incorporates the vision-based tracking algorithm that allows us to modify the reproduced signal in response to listener movements.

5. DESKTOP AUDIO SYSTEM WITH HEAD TRACKING

Our prototype desktop audio system is based on the MicroTheater system developed by TMH Corporation [15]. This is a desktop multichannel system that was designed to provide professional sound editors with a monitoring platform that translates the experience of a dubbing stage to the desktop using a combination of acoustic, psychoacoustic, and signal processing methods. The frequency response anomalies that were present due to the local acoustical environment have been eliminated, as can be seen in the frequency response plot, which is very flat (±2 dB) from 30 Hz to 20 kHz. For the head-tracking experiment we used only the two front loudspeakers, which are positioned on the sides of a video monitor at a distance of 45 cm from each other and 50 cm from the listener's ears (Fig. 6).
The seating position height is adjusted so that the listener's ears are at the tweeter level of the loudspeakers (117 cm from the floor), which, combined with the high-horizontal-directivity design of the loudspeakers, minimizes colorations in the sound due to off-axis lobing. The vision-based tracking algorithm described above has been incorporated using a standard video camera connected to an SGI Indy workstation. This tracking system provides us with the coordinates of the center of the listener's head relative to the loudspeakers and is currently capable of operating at 10 frames/sec with an error of ±1.5 cm.

The goal of the experiment was to render a virtual (phantom) sound source in the center of the screen while the head of the listener is moving left or right in the plane parallel to the loudspeaker baffles. When the listener is located at the exact center position (the sweet spot), sound from each loudspeaker arrives at the corresponding ear at the exact same time (i.e., with zero ipsilateral time delay). At any other position of the listener in this plane, there is a relative time difference of arrival between the sound signals from each loudspeaker (Fig. 6). This time difference causes the perceived location of the sound image to shift towards the loudspeaker that is closer to the listener. In order to maintain proper stereophonic perspective, the ipsilateral time delay must be adjusted as the listener moves relative to the loudspeakers.

The head coordinates provided by the tracking algorithm were used to determine the necessary time delay adjustment. This information is processed by a 32-bit DSP processor board (ADSP-2106x SHARC) resident in a Pentium-based PC. The required relative time delay between the two channels varies from 0 µs in the center spot to 340 µs in the extreme left or right positions. The DSP board is used to delay the sound from the loudspeaker that is closest to the listener so that sound arrives with the same time difference as if the listener were positioned in the exact center between the loudspeakers.
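The delay adjustment can be sketched from the tracked head coordinate alone. The code below is an illustration and not the code running on the DSP board: it treats the head as a point, assumes straight-line propagation, and uses the loudspeaker spacing and listening distance quoted above as default geometry. The function name and parameters are hypothetical.

```python
import math

def compensation_delays(head_x, plane_distance=0.4465, spacing=0.45, c=343.0):
    """Delays (s) for the (left, right) channels that re-center the phantom image.

    head_x         lateral head offset from the center line, positive to the right (m)
    plane_distance distance from the loudspeaker baffle plane to the listener plane (m);
                   the default corresponds to 50 cm from each loudspeaker at the center
    spacing        distance between the two loudspeakers (m)
    Only the closer loudspeaker is delayed; the other delay is zero.
    """
    d_left = math.hypot(head_x + spacing / 2.0, plane_distance)
    d_right = math.hypot(head_x - spacing / 2.0, plane_distance)
    diff = (d_left - d_right) / c  # positive when the right loudspeaker is closer
    return max(-diff, 0.0), max(diff, 0.0)

if __name__ == "__main__":
    for x_cm in (-10, -5, 0, 5, 10):
        left, right = compensation_delays(x_cm / 100.0)
        print("x = %+3d cm: left %.0f us, right %.0f us"
              % (x_cm, left * 1e6, right * 1e6))
```

With this point-source geometry a lateral offset of about 10 cm already calls for a delay of a few hundred microseconds, the same range as the values reported for the prototype.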
In other words, we have demonstrated stereophonic reproduction with an adaptively-optimized sweet spot. To achieve seamless operation for continuous listener movement, a linear interpolation scheme was used to address the problem of audible clicks that result from instantaneous changes in the digital delay between the two channels.

While the enhanced functionality provided by the head tracking system is promising, there are still several issues that must be addressed. We are currently in the process of identifying the computational bottlenecks of both the tracking and the audio signal processing algorithms and integrating both into a low-cost PC-based platform for real-time operation (30 frames/sec). Furthermore, we are expanding the capability of the current single-camera system to include a second camera in a stereoscopic configuration that will provide depth information and pose estimation for head rotations.

6. DESKTOP IMMERSIVE AUDIO

Desktop audio systems, such as those associated with multimedia PCs, are increasingly being used for reproduction of program material that makes use of 3-D audio processing. Such processing typically relies on head-related transfer functions that have been averaged over a number of test subjects or measured using a dummy-head ear. Systems based on such non-individualized HRTFs have been shown to suffer from serious drawbacks that arise from the fact that each listener usually has characteristics that are significantly different from the average ear [16]. Furthermore, mapping the entire three-dimensional auditory space requires a large number of tedious and time-consuming measurements. This process must be repeated for every intended listener in order to produce accurate results.

6.1 Vision-Based Pinna Classification

The human pinna is a rather sophisticated instrument that has been shown to play a key role in sound localization [17-19]. The pinna folds act as miniature reflectors that create small time delays, which in turn give rise to comb filtering effects in the frequency domain. These ridges are arranged in such a way as to optimally translate a change in angle of the incident sound into a change in the pattern of reflections. It has been demonstrated [17] that the human ear-brain interface can detect delay differences as short as 7 µs. Furthermore, as the sound source is moved towards 180° in azimuth (directly behind the listener) the pinna also acts as a low-pass filter, thus providing additional localization cues.

In order to circumvent the inaccuracies in sound localization that arise from variations in the pinna characteristics of different listeners, we are developing a method for pinna classification. The novelty of our approach is that it is based on visual recognition of pinna physiology and selection of the appropriate set of HRTF filters. We are currently in the process of establishing a database of pinna images and associated measured directional characteristics. A picture of the pinna from every new listener will allow us to select the HRTF from our database that corresponds to the ear whose pinna shape is closest to the new ear.

The basic principles used in this vision-based ear classification scheme rely on the elastic graph matching method that places graph nodes at appropriate fiducial points of the pattern [20]. Selected features from a new pinna shape can then be compared with those in the database to determine the best match. In elastic graph matching, visual features from an image are represented in the form of a vector (jet). Each component of these multidimensional vectors is the result of the convolution of a local grey level value with a Gabor wavelet of a particular frequency and orientation. Jets are calculated at several different points from a modelgraph that is chosen to represent the object to be classified (Fig. 8).
The modelgraph we designed for the pinna consists of 19 jets, each representing one key geometrical feature on the pinna. We used 5 frequencies and 8 orientations for the Gabor wavelets for a total of 40 components in each jet.
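A jet of this size can be sketched as follows. The code is a simplified illustration and not the wavelet family of [20]: it uses a plain Gaussian-windowed complex exponential, keeps only the magnitude of each response, and assumes half-octave frequency spacing and a fixed window radius. It also includes a normalized dot product for comparing two jets, which is the similarity measure the text goes on to describe. All names and parameter values are hypothetical.

```python
import cmath
import math

def gabor_jet(image, row, col, n_freqs=5, n_orients=8, radius=8):
    """Magnitudes of n_freqs * n_orients Gabor responses centered on one pixel.

    image is a list of rows of grey levels; pixels outside it are clamped
    to the nearest border pixel.
    """
    height, width = len(image), len(image[0])
    jet = []
    for f in range(n_freqs):
        k = (math.pi / 2.0) * 2.0 ** (-f / 2.0)  # radial frequency in rad/pixel (assumed)
        sigma = math.pi / k                      # envelope width tied to the wavelength
        for o in range(n_orients):
            theta = o * math.pi / n_orients
            kx, ky = k * math.cos(theta), k * math.sin(theta)
            acc = 0j
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    r = min(max(row + dy, 0), height - 1)
                    c = min(max(col + dx, 0), width - 1)
                    envelope = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
                    acc += image[r][c] * envelope * cmath.exp(1j * (kx * dx + ky * dy))
            jet.append(abs(acc))
    return jet

def jet_similarity(jet_a, jet_b):
    """Normalized dot product of two jets (1.0 for jets that differ only in scale)."""
    dot = sum(a * b for a, b in zip(jet_a, jet_b))
    norm_a = math.sqrt(sum(a * a for a in jet_a))
    norm_b = math.sqrt(sum(b * b for b in jet_b))
    return dot / (norm_a * norm_b) if norm_a > 0.0 and norm_b > 0.0 else 0.0

def graph_similarity(jets_a, jets_b):
    """Average jet similarity over corresponding nodes of two graphs."""
    return sum(jet_similarity(a, b) for a, b in zip(jets_a, jets_b)) / len(jets_a)
```

Because each jet is normalized before comparison, scaling all grey levels by a constant factor leaves the similarity unchanged, which illustrates why such a measure tolerates changes in contrast.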

Comparison among jets and graphs is performed through a similarity function that is defined as the normalized dot product of two jets over the entire modelgraph. This gives us a method for comparison that is robust to changes in illumination and contrast. The first step in the procedure we used was to manually place nodes on key features of 13 pinnae to create appropriate modelgraphs (Fig. 7). These modelgraphs were then used to automatically find the modelgraph of any new pinna. Initial results have shown successful matching of ears from unknown listeners to those already in our database, including two artificial ears from the KEMAR dummy-head system.

We are currently in the process of performing listening tests to determine the improvement in localization that results from the pinna matching. We are also working on an improved version of the matching method that will select transfer function characteristics from several stored pinnae to best match the corresponding characteristics of the new pinna. An appropriate set of weighting factors will then be determined to form a synthetic HRTF that closely resembles that of the new listener.

7. CONCLUSIONS

We have examined the acoustical, psychoacoustical, and signal processing design requirements for implementing desktop audio systems for high fidelity sound reproduction. We proposed a set of solutions that pertain to the loudspeaker design in order to place the listener in the direct dominant field, adjustments for reflecting and diffracting surfaces in the local acoustical environment, and parametric equalization that is not based on 1/3-octave bands. We also presented a desktop audio system design that incorporates a novel listener-tracking algorithm based on principles of computer vision. This novel system adjusts the output from each loudspeaker in real time based on the location of the listener's head.
Finally, we presented a method for pinna classification based on elastic graph matching that can be used to select the appropriate set of head-related transfer function filters by performing a visual match with one of the measured pinnae in our database. We are currently working on implementing a system that incorporates all of these features to render multichannel program material from just two loudspeakers for a moving listener.

8. ACKNOWLEDGMENTS

The authors would like to thank Prof. Christoph von der Malsburg from the USC Laboratory for Computational and Biological Vision for his guidance and support. This research has been funded in part by the Integrated Media Systems Center, a National Science Foundation Engineering Research Center, with additional support from the Annenberg Center for Communication at USC and the California Trade and Commerce Agency.

9. REFERENCES

[1] B. Shinn-Cunningham, Adapting to Discrepant Information in Multimedia Displays, 134th Meeting of the Acoustical Society of America, San Diego, California.
[2] C. Kyriakakis, Fundamental and Technological Limitations of Immersive Audio Systems, IEEE Proceedings: Special Issue on Multimedia Signal Processing, (to appear June, 1998).
[3] F. L. Wightman, D. J. Kistler, and M. Arruda, Perceptual Consequences of Engineering Compromises in Synthesis of Virtual Auditory Objects, Journal of the Acoustical Society of America, 101, 1992.
[4] D. R. Begault, Challenges to the successful implementation of 3-D sound, Journal of the Audio Engineering Society, 39, 1991.
[5] T. Holman, Monitoring Sound in the One-Person Environment, SMPTE Journal, 106, 1997.
[6] F. E. Toole, Loudspeaker measurements and their relationship to listener preferences, Journal of the Audio Engineering Society, 34, 1986.
[7] S. Bech, Perception of timbre of reproduced sound in small rooms: influence of room and loudspeaker position, Journal of the Audio Engineering Society, 42, 1994.
[8] S. E. Olive and F. E. Toole, The Detection of Reflections in Typical Rooms, Journal of the Audio Engineering Society, 37, 1989.
[9] R. Walker, Early Reflections in Studio Control Rooms: The Results from the First Controlled Image Design Installations, 96th Meeting of the Audio Engineering Society, Amsterdam.
[10] T. Holman, Report on Mixing Studios Sound Quality, Journal of the Japan Audio Society.
[11] F. E. Toole, Subjective measurements of loudspeaker sound quality and listener performance, Journal of the Audio Engineering Society, 33, 1985.
[12] T. Holman, New Factors in Sound for Cinema and Television, Journal of the Audio Engineering Society, 39, 1991.
[13] M. R. Schroeder and B. S. Atal, Computer Simulation of Sound Transmission in Rooms, IEEE International Convention Record, 7.
[14] D. H. Cooper and J. L. Bauck, Prospects for Transaural Recording, Journal of the Audio Engineering Society, 37, 1989.
[15] TMH Corporation.
[16] E. M. Wenzel, M. Arruda, and D. J. Kistler, Localization using nonindividualized head-related transfer functions, Journal of the Acoustical Society of America, 94, 1993.
[17] J. Hebrank and D. Wright, Spectral Cues Used in the Localization of Sound Sources in the Median Plane, Journal of the Acoustical Society of America, 56, 1974.
[18] J. Blauert, Spatial Hearing: The Psychophysics of Human Sound Localization, Revised Edition, MIT Press, Cambridge, Massachusetts.
[19] D. W. Batteau, The role of the pinna in human localization, Proceedings of the Royal Society of London, B168, 1967.
[20] L. Wiskott, J. M. Fellous, N. Krueger, and C. von der Malsburg, Face Recognition by Elastic Bunch Graph Matching, Institute for Neuroinformatics, Bochum, Tech. Report No. 8.

Figure Legends

Figure 1. Desktop sound system time response that meets the psychoacoustic requirements for low-level early reflections. The spectrum level of the reflected sound is more than 15 dB below that of the direct sound.

Figure 2. Desktop sound system time response that violates the requirements for low-level early reflections. The early reflection peaks at 1.2 ms and 6 ms give rise to a spectrum level that is above the 15 dB criterion.

Figure 3. Frequency response of a desktop loudspeaker system that clearly shows the effects of the local acoustical environment. There are large peaks and dips that give rise to significant audible distortion in the reproduced sound. Also note that there is no bass reproduction below 150 Hz.

Figure 4. (a) The first step in the vision-based tracking algorithm involves motion detection from a disparity image. A disparity histogram is generated and the local maxima are used to generate a silhouette of the moving images.

Figure 4. (b) In the second stage of the vision-based algorithm, skin-colored and convex regions are identified. The results of this search are combined with the motion results to estimate the position of the head.

Figure 5. Frequency response of a desktop loudspeaker system designed using the direct-path dominant and correct low-frequency response guidelines described in the text. The response has been corrected with minimal parametric equalization and is relatively flat (±2 dB) from 30 Hz to 20 kHz. The solid line represents the on-axis response and the dotted line represents the 10° off-axis (vertical) response.

Figure 6. The relative delay in the time of arrival of the direct sound from each loudspeaker to each (same-side) ear as a function of head position in the horizontal plane parallel to the loudspeakers. The geometry of our experimental desktop sound system is shown in the inset.

Figure 7. Two modelgraphs from our pinna database are shown here. The nodes correspond to the location of the jets that carry the Gabor wavelet convolution information.
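The quantity plotted in Figure 6 can be approximated from geometry alone. The Python sketch below assumes straight-line (free-field) paths, ignores head diffraction, and uses hypothetical values for loudspeaker spacing, listening distance, and ear offset; none of these numbers come from the experimental system shown in the figure inset.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature


def relative_delay(head_x, spacing=0.6, distance=0.5, ear_offset=0.09):
    """Arrival-time difference (s) between the two same-side direct paths.

    head_x      lateral head displacement from the center line (m)
    spacing     loudspeaker separation (m), hypothetical
    distance    distance from the loudspeaker plane to the ears (m), hypothetical
    ear_offset  distance from head center to each ear (m), hypothetical
    """
    # Left loudspeaker at x = -spacing/2, left ear at x = head_x - ear_offset.
    left_path = math.hypot(head_x - ear_offset + spacing / 2, distance)
    # Right loudspeaker at x = +spacing/2, right ear at x = head_x + ear_offset.
    right_path = math.hypot(head_x + ear_offset - spacing / 2, distance)
    return (left_path - right_path) / SPEED_OF_SOUND


for x_cm in (-10, -5, 0, 5, 10):
    delay_us = relative_delay(x_cm / 100) * 1e6
    print(f"head at {x_cm:+3d} cm -> relative delay {delay_us:+7.1f} us")
```

With the head centered the two paths are equal and the relative delay is zero; moving the head toward one loudspeaker shortens that path and lengthens the other, which is the offset a tracking-based system would need to compensate.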

Sound source localization and its use in multimedia applications

Sound source localization and its use in multimedia applications Notes for lecture/ Zack Settel, McGill University Sound source localization and its use in multimedia applications Introduction With the arrival of real-time binaural or "3D" digital audio processing,

More information

Spatial Audio Reproduction: Towards Individualized Binaural Sound

Spatial Audio Reproduction: Towards Individualized Binaural Sound Spatial Audio Reproduction: Towards Individualized Binaural Sound WILLIAM G. GARDNER Wave Arts, Inc. Arlington, Massachusetts INTRODUCTION The compact disc (CD) format records audio with 16-bit resolution

More information

Auditory Localization

Auditory Localization Auditory Localization CMPT 468: Sound Localization Tamara Smyth, tamaras@cs.sfu.ca School of Computing Science, Simon Fraser University November 15, 2013 Auditory locatlization is the human perception

More information

Introduction. 1.1 Surround sound

Introduction. 1.1 Surround sound Introduction 1 This chapter introduces the project. First a brief description of surround sound is presented. A problem statement is defined which leads to the goal of the project. Finally the scope of

More information

Speech Compression. Application Scenarios

Speech Compression. Application Scenarios Speech Compression Application Scenarios Multimedia application Live conversation? Real-time network? Video telephony/conference Yes Yes Business conference with data sharing Yes Yes Distance learning

More information

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 VIRTUAL AUDIO REPRODUCED IN A HEADREST

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 VIRTUAL AUDIO REPRODUCED IN A HEADREST 19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 VIRTUAL AUDIO REPRODUCED IN A HEADREST PACS: 43.25.Lj M.Jones, S.J.Elliott, T.Takeuchi, J.Beer Institute of Sound and Vibration Research;

More information

EBU UER. european broadcasting union. Listening conditions for the assessment of sound programme material. Supplement 1.

EBU UER. european broadcasting union. Listening conditions for the assessment of sound programme material. Supplement 1. EBU Tech 3276-E Listening conditions for the assessment of sound programme material Revised May 2004 Multichannel sound EBU UER european broadcasting union Geneva EBU - Listening conditions for the assessment

More information

Evaluation of a new stereophonic reproduction method with moving sweet spot using a binaural localization model

Evaluation of a new stereophonic reproduction method with moving sweet spot using a binaural localization model Evaluation of a new stereophonic reproduction method with moving sweet spot using a binaural localization model Sebastian Merchel and Stephan Groth Chair of Communication Acoustics, Dresden University

More information

6-channel recording/reproduction system for 3-dimensional auralization of sound fields

6-channel recording/reproduction system for 3-dimensional auralization of sound fields Acoust. Sci. & Tech. 23, 2 (2002) TECHNICAL REPORT 6-channel recording/reproduction system for 3-dimensional auralization of sound fields Sakae Yokoyama 1;*, Kanako Ueno 2;{, Shinichi Sakamoto 2;{ and

More information

The Official Magazine of the National Association of Theatre Owners

The Official Magazine of the National Association of Theatre Owners $6.95 JULY 2016 The Official Magazine of the National Association of Theatre Owners TECH TALK THE PRACTICAL REALITIES OF IMMERSIVE AUDIO What to watch for when considering the latest in sound technology

More information

DESIGN OF ROOMS FOR MULTICHANNEL AUDIO MONITORING

DESIGN OF ROOMS FOR MULTICHANNEL AUDIO MONITORING DESIGN OF ROOMS FOR MULTICHANNEL AUDIO MONITORING A.VARLA, A. MÄKIVIRTA, I. MARTIKAINEN, M. PILCHNER 1, R. SCHOUSTAL 1, C. ANET Genelec OY, Finland genelec@genelec.com 1 Pilchner Schoustal Inc, Canada

More information

PERSONAL 3D AUDIO SYSTEM WITH LOUDSPEAKERS

PERSONAL 3D AUDIO SYSTEM WITH LOUDSPEAKERS PERSONAL 3D AUDIO SYSTEM WITH LOUDSPEAKERS Myung-Suk Song #1, Cha Zhang 2, Dinei Florencio 3, and Hong-Goo Kang #4 # Department of Electrical and Electronic, Yonsei University Microsoft Research 1 earth112@dsp.yonsei.ac.kr,

More information

From time to time it is useful even for an expert to give a thought to the basics of sound reproduction. For instance, what the stereo is all about?

From time to time it is useful even for an expert to give a thought to the basics of sound reproduction. For instance, what the stereo is all about? HIFI FUNDAMENTALS, WHAT THE STEREO IS ALL ABOUT Gradient ltd.1984-2000 From the beginning of Gradient Ltd. some fundamental aspects of loudspeaker design has frequently been questioned by our R&D Director

More information

Improving room acoustics at low frequencies with multiple loudspeakers and time based room correction

Improving room acoustics at low frequencies with multiple loudspeakers and time based room correction Improving room acoustics at low frequencies with multiple loudspeakers and time based room correction S.B. Nielsen a and A. Celestinos b a Aalborg University, Fredrik Bajers Vej 7 B, 9220 Aalborg Ø, Denmark

More information

The analysis of multi-channel sound reproduction algorithms using HRTF data

The analysis of multi-channel sound reproduction algorithms using HRTF data The analysis of multichannel sound reproduction algorithms using HRTF data B. Wiggins, I. PatersonStephens, P. Schillebeeckx Processing Applications Research Group University of Derby Derby, United Kingdom

More information

Surround: The Current Technological Situation. David Griesinger Lexicon 3 Oak Park Bedford, MA

Surround: The Current Technological Situation. David Griesinger Lexicon 3 Oak Park Bedford, MA Surround: The Current Technological Situation David Griesinger Lexicon 3 Oak Park Bedford, MA 01730 www.world.std.com/~griesngr There are many open questions 1. What is surround sound 2. Who will listen

More information

Multichannel Audio Technologies. More on Surround Sound Microphone Techniques:

Multichannel Audio Technologies. More on Surround Sound Microphone Techniques: Multichannel Audio Technologies More on Surround Sound Microphone Techniques: In the last lecture we focused on recording for accurate stereophonic imaging using the LCR channels. Today, we look at the

More information

Accurate sound reproduction from two loudspeakers in a living room

Accurate sound reproduction from two loudspeakers in a living room Accurate sound reproduction from two loudspeakers in a living room Siegfried Linkwitz 13-Apr-08 (1) D M A B Visual Scene 13-Apr-08 (2) What object is this? 19-Apr-08 (3) Perception of sound 13-Apr-08 (4)

More information

Finding the Prototype for Stereo Loudspeakers

Finding the Prototype for Stereo Loudspeakers Finding the Prototype for Stereo Loudspeakers The following presentation slides from the AES 51st Conference on Loudspeakers and Headphones summarize my activities and observations for the design of loudspeakers

More information

DESIGN OF VOICE ALARM SYSTEMS FOR TRAFFIC TUNNELS: OPTIMISATION OF SPEECH INTELLIGIBILITY

DESIGN OF VOICE ALARM SYSTEMS FOR TRAFFIC TUNNELS: OPTIMISATION OF SPEECH INTELLIGIBILITY DESIGN OF VOICE ALARM SYSTEMS FOR TRAFFIC TUNNELS: OPTIMISATION OF SPEECH INTELLIGIBILITY Dr.ir. Evert Start Duran Audio BV, Zaltbommel, The Netherlands The design and optimisation of voice alarm (VA)

More information

Sound Processing Technologies for Realistic Sensations in Teleworking

Sound Processing Technologies for Realistic Sensations in Teleworking Sound Processing Technologies for Realistic Sensations in Teleworking Takashi Yazu Makoto Morito In an office environment we usually acquire a large amount of information without any particular effort

More information

SOPA version 2. Revised July SOPA project. September 21, Introduction 2. 2 Basic concept 3. 3 Capturing spatial audio 4

SOPA version 2. Revised July SOPA project. September 21, Introduction 2. 2 Basic concept 3. 3 Capturing spatial audio 4 SOPA version 2 Revised July 7 2014 SOPA project September 21, 2014 Contents 1 Introduction 2 2 Basic concept 3 3 Capturing spatial audio 4 4 Sphere around your head 5 5 Reproduction 7 5.1 Binaural reproduction......................

More information

Force versus Frequency Figure 1.

Force versus Frequency Figure 1. An important trend in the audio industry is a new class of devices that produce tactile sound. The term tactile sound appears to be a contradiction of terms, in that our concept of sound relates to information

More information

Multichannel Audio In Cars (Tim Nind)

Multichannel Audio In Cars (Tim Nind) Multichannel Audio In Cars (Tim Nind) Presented by Wolfgang Zieglmeier Tonmeister Symposium 2005 Page 1 Reproducing Source Position and Space SOURCE SOUND Direct sound heard first - note different time

More information

Sound Source Localization using HRTF database

Sound Source Localization using HRTF database ICCAS June -, KINTEX, Gyeonggi-Do, Korea Sound Source Localization using HRTF database Sungmok Hwang*, Youngjin Park and Younsik Park * Center for Noise and Vibration Control, Dept. of Mech. Eng., KAIST,

More information

Synthesised Surround Sound Department of Electronics and Computer Science University of Southampton, Southampton, SO17 2GQ

Synthesised Surround Sound Department of Electronics and Computer Science University of Southampton, Southampton, SO17 2GQ Synthesised Surround Sound Department of Electronics and Computer Science University of Southampton, Southampton, SO17 2GQ Author Abstract This paper discusses the concept of producing surround sound with

More information

Audio Engineering Society. Convention Paper. Presented at the 115th Convention 2003 October New York, New York

Audio Engineering Society. Convention Paper. Presented at the 115th Convention 2003 October New York, New York Audio Engineering Society Convention Paper Presented at the 115th Convention 2003 October 10 13 New York, New York This convention paper has been reproduced from the author's advance manuscript, without

More information

Spatial audio is a field that

Spatial audio is a field that [applications CORNER] Ville Pulkki and Matti Karjalainen Multichannel Audio Rendering Using Amplitude Panning Spatial audio is a field that investigates techniques to reproduce spatial attributes of sound

More information

RD75, RD50, RD40, RD28.1 Planar magnetic transducers with true line source characteristics

RD75, RD50, RD40, RD28.1 Planar magnetic transducers with true line source characteristics RD75, RD50, RD40, RD28.1 Planar magnetic transducers true line source characteristics The RD line of planar-magnetic ribbon drivers represents the ultimate thin film diaphragm technology. The RD drivers

More information

INVESTIGATING BINAURAL LOCALISATION ABILITIES FOR PROPOSING A STANDARDISED TESTING ENVIRONMENT FOR BINAURAL SYSTEMS

INVESTIGATING BINAURAL LOCALISATION ABILITIES FOR PROPOSING A STANDARDISED TESTING ENVIRONMENT FOR BINAURAL SYSTEMS 20-21 September 2018, BULGARIA 1 Proceedings of the International Conference on Information Technologies (InfoTech-2018) 20-21 September 2018, Bulgaria INVESTIGATING BINAURAL LOCALISATION ABILITIES FOR

More information

Low frequency sound reproduction in irregular rooms using CABS (Control Acoustic Bass System) Celestinos, Adrian; Nielsen, Sofus Birkedal

Low frequency sound reproduction in irregular rooms using CABS (Control Acoustic Bass System) Celestinos, Adrian; Nielsen, Sofus Birkedal Aalborg Universitet Low frequency sound reproduction in irregular rooms using CABS (Control Acoustic Bass System) Celestinos, Adrian; Nielsen, Sofus Birkedal Published in: Acustica United with Acta Acustica

More information

MULTICHANNEL REPRODUCTION OF LOW FREQUENCIES. Toni Hirvonen, Miikka Tikander, and Ville Pulkki

MULTICHANNEL REPRODUCTION OF LOW FREQUENCIES. Toni Hirvonen, Miikka Tikander, and Ville Pulkki MULTICHANNEL REPRODUCTION OF LOW FREQUENCIES Toni Hirvonen, Miikka Tikander, and Ville Pulkki Helsinki University of Technology Laboratory of Acoustics and Audio Signal Processing P.O. box 3, FIN-215 HUT,

More information

University of Huddersfield Repository

University of Huddersfield Repository University of Huddersfield Repository Lee, Hyunkook Capturing and Rendering 360º VR Audio Using Cardioid Microphones Original Citation Lee, Hyunkook (2016) Capturing and Rendering 360º VR Audio Using Cardioid

More information

University of Huddersfield Repository

University of Huddersfield Repository University of Huddersfield Repository Moore, David J. and Wakefield, Jonathan P. Surround Sound for Large Audiences: What are the Problems? Original Citation Moore, David J. and Wakefield, Jonathan P.

More information

NEAR-FIELD VIRTUAL AUDIO DISPLAYS

NEAR-FIELD VIRTUAL AUDIO DISPLAYS NEAR-FIELD VIRTUAL AUDIO DISPLAYS Douglas S. Brungart Human Effectiveness Directorate Air Force Research Laboratory Wright-Patterson AFB, Ohio Abstract Although virtual audio displays are capable of realistically

More information

ROOM SHAPE AND SIZE ESTIMATION USING DIRECTIONAL IMPULSE RESPONSE MEASUREMENTS

ROOM SHAPE AND SIZE ESTIMATION USING DIRECTIONAL IMPULSE RESPONSE MEASUREMENTS ROOM SHAPE AND SIZE ESTIMATION USING DIRECTIONAL IMPULSE RESPONSE MEASUREMENTS PACS: 4.55 Br Gunel, Banu Sonic Arts Research Centre (SARC) School of Computer Science Queen s University Belfast Belfast,

More information

Multi-Loudspeaker Reproduction: Surround Sound

Multi-Loudspeaker Reproduction: Surround Sound Multi-Loudspeaker Reproduction: urround ound Understanding Dialog? tereo film L R No Delay causes echolike disturbance Yes Experience with stereo sound for film revealed that the intelligibility of dialog

More information

LINE ARRAY Q&A ABOUT LINE ARRAYS. Question: Why Line Arrays?

LINE ARRAY Q&A ABOUT LINE ARRAYS. Question: Why Line Arrays? Question: Why Line Arrays? First, what s the goal with any quality sound system? To provide well-defined, full-frequency coverage as consistently as possible from seat to seat. However, traditional speaker

More information

Enhancing 3D Audio Using Blind Bandwidth Extension

Enhancing 3D Audio Using Blind Bandwidth Extension Enhancing 3D Audio Using Blind Bandwidth Extension (PREPRINT) Tim Habigt, Marko Ðurković, Martin Rothbucher, and Klaus Diepold Institute for Data Processing, Technische Universität München, 829 München,

More information

Psychoacoustic Cues in Room Size Perception

Psychoacoustic Cues in Room Size Perception Audio Engineering Society Convention Paper Presented at the 116th Convention 2004 May 8 11 Berlin, Germany 6084 This convention paper has been reproduced from the author s advance manuscript, without editing,

More information

VIRTUAL ACOUSTICS: OPPORTUNITIES AND LIMITS OF SPATIAL SOUND REPRODUCTION

VIRTUAL ACOUSTICS: OPPORTUNITIES AND LIMITS OF SPATIAL SOUND REPRODUCTION ARCHIVES OF ACOUSTICS 33, 4, 413 422 (2008) VIRTUAL ACOUSTICS: OPPORTUNITIES AND LIMITS OF SPATIAL SOUND REPRODUCTION Michael VORLÄNDER RWTH Aachen University Institute of Technical Acoustics 52056 Aachen,

More information

HRIR Customization in the Median Plane via Principal Components Analysis

HRIR Customization in the Median Plane via Principal Components Analysis 한국소음진동공학회 27 년춘계학술대회논문집 KSNVE7S-6- HRIR Customization in the Median Plane via Principal Components Analysis 주성분분석을이용한 HRIR 맞춤기법 Sungmok Hwang and Youngjin Park* 황성목 박영진 Key Words : Head-Related Transfer

More information

Sonnet. we think differently!

Sonnet. we think differently! Sonnet Sonnet T he completion of a new loudspeaker series from bottom to top is normally not a difficult task, instead it is a hard job the reverse the path, because the more you go away from the full

More information

Computational Perception. Sound localization 2

Computational Perception. Sound localization 2 Computational Perception 15-485/785 January 22, 2008 Sound localization 2 Last lecture sound propagation: reflection, diffraction, shadowing sound intensity (db) defining computational problems sound lateralization

More information

Waves Nx VIRTUAL REALITY AUDIO

Waves Nx VIRTUAL REALITY AUDIO Waves Nx VIRTUAL REALITY AUDIO WAVES VIRTUAL REALITY AUDIO THE FUTURE OF AUDIO REPRODUCTION AND CREATION Today s entertainment is on a mission to recreate the real world. Just as VR makes us feel like

More information

Proceedings of Meetings on Acoustics

Proceedings of Meetings on Acoustics Proceedings of Meetings on Acoustics Volume 19, 2013 http://acousticalsociety.org/ ICA 2013 Montreal Montreal, Canada 2-7 June 2013 Architectural Acoustics Session 1pAAa: Advanced Analysis of Room Acoustics:

More information

Monitor Setup Guide The right monitors. The correct setup. Proper sound.

Monitor Setup Guide The right monitors. The correct setup. Proper sound. Monitor Setup Guide 2017 The right monitors. The correct setup. Proper sound. Table of contents Genelec Key Technologies 3 What is a monitor? 4 What is a reference monitor? 4 Selecting the correct monitors

More information

RM28ac. Self-Powered Dual 8 inch Coaxial Reference Monitor. product specification. Performance Specifications 1

RM28ac. Self-Powered Dual 8 inch Coaxial Reference Monitor. product specification. Performance Specifications 1 RM28ac Self-Powered Dual 8 inch Coaxial Reference Monitor Performance Specifications 1 Operating Mode Self-Powered, w/ On-Board DSP Operating Range 2 40 Hz to 24 khz Nominal Beamwidth (rotatable) 90 x

More information

Technical Note Vol. 1, No. 10 Use Of The 46120K, 4671 OK, And 4660 Systems in Fixed instaiiation Sound Reinforcement

Technical Note Vol. 1, No. 10 Use Of The 46120K, 4671 OK, And 4660 Systems in Fixed instaiiation Sound Reinforcement Technical Note Vol. 1, No. 10 Use Of The 46120K, 4671 OK, And 4660 Systems in Fixed instaiiation Sound Reinforcement Introduction: For many small and medium scale sound reinforcement applications, preassembled

More information

A3D Contiguous time-frequency energized sound-field: reflection-free listening space supports integration in audiology

A3D Contiguous time-frequency energized sound-field: reflection-free listening space supports integration in audiology A3D Contiguous time-frequency energized sound-field: reflection-free listening space supports integration in audiology Joe Hayes Chief Technology Officer Acoustic3D Holdings Ltd joe.hayes@acoustic3d.com

More information

From Binaural Technology to Virtual Reality

From Binaural Technology to Virtual Reality From Binaural Technology to Virtual Reality Jens Blauert, D-Bochum Prominent Prominent Features of of Binaural Binaural Hearing Hearing - Localization Formation of positions of the auditory events (azimuth,

More information

Audio Engineering Society. Convention Paper. Presented at the 129th Convention 2010 November 4 7 San Francisco, CA, USA. Why Ambisonics Does Work

Audio Engineering Society. Convention Paper. Presented at the 129th Convention 2010 November 4 7 San Francisco, CA, USA. Why Ambisonics Does Work Audio Engineering Society Convention Paper Presented at the 129th Convention 2010 November 4 7 San Francisco, CA, USA The papers at this Convention have been selected on the basis of a submitted abstract

More information

A Virtual Audio Environment for Testing Dummy- Head HRTFs modeling Real Life Situations

A Virtual Audio Environment for Testing Dummy- Head HRTFs modeling Real Life Situations A Virtual Audio Environment for Testing Dummy- Head HRTFs modeling Real Life Situations György Wersényi Széchenyi István University, Hungary. József Répás Széchenyi István University, Hungary. Summary

More information

Technical Notes Volume 1, Number 25. Using HLA 4895 modules in arrays: system controller guidelines

Technical Notes Volume 1, Number 25. Using HLA 4895 modules in arrays: system controller guidelines Technical Notes Volume 1, Number 25 Using HLA 4895 modules in arrays: system controller guidelines Introduction: The HLA 4895 3-way module has been designed for use in conjunction with the HLA 4897 bass

More information

6 TH GENERATION PROFESSIONAL SOUND FOR CONSUMER ELECTRONICS

6 TH GENERATION PROFESSIONAL SOUND FOR CONSUMER ELECTRONICS 6 TH GENERATION PROFESSIONAL SOUND FOR CONSUMER ELECTRONICS Waves MaxxAudio is a suite of advanced audio enhancement tools that brings award-winning professional technologies to consumer electronics devices.

More information

ROOM AND CONCERT HALL ACOUSTICS MEASUREMENTS USING ARRAYS OF CAMERAS AND MICROPHONES

ROOM AND CONCERT HALL ACOUSTICS MEASUREMENTS USING ARRAYS OF CAMERAS AND MICROPHONES ROOM AND CONCERT HALL ACOUSTICS The perception of sound by human listeners in a listening space, such as a room or a concert hall is a complicated function of the type of source sound (speech, oration,

More information

Acoustics Research Institute

Acoustics Research Institute Austrian Academy of Sciences Acoustics Research Institute Spatial SpatialHearing: Hearing: Single SingleSound SoundSource Sourcein infree FreeField Field Piotr PiotrMajdak Majdak&&Bernhard BernhardLaback

More information

LOW FREQUENCY SOUND IN ROOMS

LOW FREQUENCY SOUND IN ROOMS Room boundaries reflect sound waves. LOW FREQUENCY SOUND IN ROOMS For low frequencies (typically where the room dimensions are comparable with half wavelengths of the reproduced frequency) waves reflected

More information

THE TEMPORAL and spectral structure of a sound signal

THE TEMPORAL and spectral structure of a sound signal IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 13, NO. 1, JANUARY 2005 105 Localization of Virtual Sources in Multichannel Audio Reproduction Ville Pulkki and Toni Hirvonen Abstract The localization

More information

WAVELET-BASED SPECTRAL SMOOTHING FOR HEAD-RELATED TRANSFER FUNCTION FILTER DESIGN

WAVELET-BASED SPECTRAL SMOOTHING FOR HEAD-RELATED TRANSFER FUNCTION FILTER DESIGN WAVELET-BASE SPECTRAL SMOOTHING FOR HEA-RELATE TRANSFER FUNCTION FILTER ESIGN HUSEYIN HACIHABIBOGLU, BANU GUNEL, AN FIONN MURTAGH Sonic Arts Research Centre (SARC), Queen s University Belfast, Belfast,

More information

ALTERNATING CURRENT (AC)

ALTERNATING CURRENT (AC) ALL ABOUT NOISE ALTERNATING CURRENT (AC) Any type of electrical transmission where the current repeatedly changes direction, and the voltage varies between maxima and minima. Therefore, any electrical

More information

O P S I. ( Optimised Phantom Source Imaging of the high frequency content of virtual sources in Wave Field Synthesis )

O P S I. ( Optimised Phantom Source Imaging of the high frequency content of virtual sources in Wave Field Synthesis ) O P S I ( Optimised Phantom Source Imaging of the high frequency content of virtual sources in Wave Field Synthesis ) A Hybrid WFS / Phantom Source Solution to avoid Spatial aliasing (patentiert 2002)

More information

III. Publication III. c 2005 Toni Hirvonen.

III. Publication III. c 2005 Toni Hirvonen. III Publication III Hirvonen, T., Segregation of Two Simultaneously Arriving Narrowband Noise Signals as a Function of Spatial and Frequency Separation, in Proceedings of th International Conference on

More information

Proceedings of Meetings on Acoustics

Proceedings of Meetings on Acoustics Proceedings of Meetings on Acoustics Volume 19, 2013 http://acousticalsociety.org/ ICA 2013 Montreal Montreal, Canada 2-7 June 2013 Psychological and Physiological Acoustics Session 2aPPa: Binaural Hearing

More information

Microphone a transducer that converts one type of energy (sound waves) into another corresponding form of energy (electric signal).

Microphone a transducer that converts one type of energy (sound waves) into another corresponding form of energy (electric signal). 1 Professor Calle ecalle@mdc.edu www.drcalle.com MUM 2600 Microphone Notes Microphone a transducer that converts one type of energy (sound waves) into another corresponding form of energy (electric signal).

More information

Analysis of Frontal Localization in Double Layered Loudspeaker Array System

Analysis of Frontal Localization in Double Layered Loudspeaker Array System Proceedings of 20th International Congress on Acoustics, ICA 2010 23 27 August 2010, Sydney, Australia Analysis of Frontal Localization in Double Layered Loudspeaker Array System Hyunjoo Chung (1), Sang

More information

Listening with Headphones

Listening with Headphones Listening with Headphones Main Types of Errors Front-back reversals Angle error Some Experimental Results Most front-back errors are front-to-back Substantial individual differences Most evident in elevation

More information

Acoustics II: Kurt Heutschi recording technique. stereo recording. microphone positioning. surround sound recordings.

Acoustics II: Kurt Heutschi recording technique. stereo recording. microphone positioning. surround sound recordings. demo Acoustics II: recording Kurt Heutschi 2013-01-18 demo Stereo recording: Patent Blumlein, 1931 demo in a real listening experience in a room, different contributions are perceived with directional

More information

Multiple Sound Sources Localization Using Energetic Analysis Method

Multiple Sound Sources Localization Using Energetic Analysis Method VOL.3, NO.4, DECEMBER 1 Multiple Sound Sources Localization Using Energetic Analysis Method Hasan Khaddour, Jiří Schimmel Department of Telecommunications FEEC, Brno University of Technology Purkyňova

More information

A White Paper on Danley Sound Labs Tapped Horn and Synergy Horn Technologies

A White Paper on Danley Sound Labs Tapped Horn and Synergy Horn Technologies Tapped Horn (patent pending) Horns have been used for decades in sound reinforcement to increase the loading on the loudspeaker driver. This is done to increase the power transfer from the driver to the

More information

Convention Paper Presented at the 126th Convention 2009 May 7 10 Munich, Germany

Convention Paper Presented at the 126th Convention 2009 May 7 10 Munich, Germany Audio Engineering Society Convention Paper Presented at the 16th Convention 9 May 7 Munich, Germany The papers at this Convention have been selected on the basis of a submitted abstract and extended precis

More information
