Analysis and Design of Multichannel Systems for Perceptual Sound Field Reconstruction


IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 21, NO. 8, AUGUST 2013

Enzo De Sena, Student Member, IEEE, Hüseyin Hacıhabiboğlu, Senior Member, IEEE, and Zoran Cvetković, Senior Member, IEEE

Abstract: This paper presents a systematic framework for the analysis and design of circular multichannel surround sound systems. Objective analysis based on the concept of active intensity fields shows that for stable rendition of monochromatic plane waves it is beneficial to render each such wave by no more than two channels. Based on that finding, we propose a methodology for the design of circular microphone arrays, in the same configuration as the corresponding loudspeaker system, which aims to capture inter-channel time and intensity differences that ensure accurate rendition of the auditory perspective. The methodology is applicable to regular and irregular microphone/speaker layouts, and a wide range of microphone array radii, including the special case of coincident arrays which corresponds to intensity-based systems. Several design examples, involving first and higher-order microphones are presented. Results of formal listening tests suggest that the proposed design methodology achieves a performance comparable to prior art in the center of the loudspeaker array and a more graceful degradation away from the center.

Index Terms: Active intensity, microphone array, microphone directivity, multichannel audio, spatial hearing, surround sound recording, tangent panning law, time-intensity trading.

Manuscript received August 02, 2012; revised December 30, 2012 and April 02, 2013; accepted April 10; date of publication April 25, 2013; date of current version May 08. This work was supported by the EPSRC under Grant EP/F001142/1. This work was done while H. Hacıhabiboğlu was with the Department of Informatics, King's College London. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Boaz Rafaely. E. De Sena and Z. Cvetković are with the Institute of Telecommunications, King's College London, Strand, London, WC2R 2LS, U.K. (enzo.desena@kcl.ac.uk; zoran.cvetkovic@kcl.ac.uk). H. Hacıhabiboğlu is with the Informatics Institute, Middle East Technical University, Ankara, 06531, Turkey (huseyin@ii.metu.edu.tr). Color versions of one or more of the figures in this paper are available online at Digital Object Identifier /TASL

I. INTRODUCTION AND MOTIVATION

Generating the experience of spatial sound can be achieved in a number of ways. From a practical standpoint one aims to provide the most convincing experience with the minimum equipment and channels. Comparing different methods, at one extreme there are binaural techniques, which deliver a convincing experience over two channels by presenting necessary binaural cues [4]. Binaural presentation works best over headphones, but the perception of the reproduced field is severely compromised by head movements and by the mismatch between the individual HRTFs of the listener and the presented binaural cues. Alternatively, two loudspeakers with cross-talk cancellation can be used [5], but the sweet listening spot is very narrow. At the other extreme there is wave-field synthesis (WFS), which typically uses a large number of channels to accurately reproduce the wave front generated by a virtual source [6], thus providing a wide sweet listening area. Higher-order Ambisonics (HOA) [7]-[9] is capable of achieving similar effects by reconstructing spherical harmonics of the pressure field in the center of the listening area. In between these extremes there are systems with five or more channels.
Such systems do not possess a sufficient number of channels to simulate physical wave fronts without spatial aliasing, or to reconstruct ear signals accurately for listeners at multiple locations. Instead these systems must rely to a large degree on perceptual effects, most notably summing localization, to generate a spatially stable experience of the desired auditory scene. Summing localization describes the effect by which two loudspeakers radiating identical signals with given inter-channel level and time differences (ICLD and ICTD) result in a single, fused, auditory event [4]. The perceived location of this auditory event depends on both the ICLD and ICTD applied.

Current commercial multichannel recordings rely to a great extent on summing localization. However, they are yet to achieve the spatial realism that is possible with the available number of channels. That is in part due to the design legacy of sound production for the film industry, the main focus of which is creating attention-grabbing effects and providing a general ambiance feel. Auralization is usually achieved in a synthetic manner through intensity panning between pairs of channels. Ambient information is then usually presented from the rear surround channels and directional localization information from the front channels [10]. This makes perfect sense when one is expected to be viewing a frontally placed screen, since localizing away from the screen breaks the audio-visual illusion. While a pleasing listening experience can be achieved most of the time using such systems, they do not necessarily yield presentations which are coherent with the acoustics of the performance venue or the desired virtual space. Due to this incoherence, the spatial auditory experience lacks realism and fidelity.

A number of multichannel recording techniques which aim to overcome these limitations and provide a coherent auditory perspective in a wider area have been proposed [11]-[18].
Theile proposes a class of non-coincident microphone arrays for recording frontal scenes in 3/2-stereo ITU-R standard format, which is known as optimized cardioid triangle (OCT) [12, p. 255]. Williams describes more general guidelines for appropriately arranging standard first-order studio microphones [12] given a loudspeaker layout and a desired coverage angle [13]. Johnston et al. propose the perceptual sound field reconstruction (PSR) [14]-[16] scheme, which aims to render convincing auditory perspective by capturing interaural time and level differences. All these methods are still mainly a result of empirical observation and hands-on tuning.

This work develops further insight into the underlying physical and perceptual phenomena, and based on that refines Johnston's PSR approach and extends it to a more general and systematic framework. We depart from the original PSR idea of capturing interaural time and level differences, and instead aim to capture ICLDs and ICTDs which, when reproduced by the given speaker system, accurately render the auditory perspective. Furthermore, we leverage recent advances in the field of higher-order microphones [19]-[22], which greatly extend the design space of directivity patterns, enabling the implementation of more sophisticated and powerful recording concepts than possible with standard first-order microphones.

Sound-field sampling performed by low channel-count systems as considered in this paper is too sparse to allow for reconstruction with any reasonable physical accuracy. However, such low channel-count systems should at least be capable of rendering meaningful approximations of some very simple sound fields, and their performance for such sound fields can provide insight into the effects of various design choices, guiding some high-level design decisions, and narrowing down the design space. Therefore, in Section II we study reproduction capabilities of circular multichannel systems for monochromatic plane waves based on the concept of active intensity [8], [23]. One of the conclusions of this analysis is that cross-channel terms have an adverse effect on the size and stability of the sweet-spot.
Guided by that observation, in Section III, the microphone directivity design problem is formulated as a psychoacoustic curve fitting problem aimed at capturing sound field cues which allow satisfactory rendition of the auditory perspective while suppressing undesirable cross-channel terms. Design examples for the case of pentagonal systems are then provided in Section IV. Results of formal subjective experiments comparing the performance of the proposed designs with prior art are reported in Section V and Section VI. Conclusions are drawn in Section VII.

II. ANALYTICAL CONSIDERATIONS

This section is concerned with the analysis of circular microphone and loudspeaker arrays for recording and reproducing monochromatic plane waves. Consider a reproduction system which comprises loudspeakers, situated on a circle, at angles, as shown in Fig. 1(a). Assume that the loudspeaker array is centered at the origin, and that its radius is large enough so that within the listening area loudspeakers can be well approximated by plane-wave sources. The pressure component of the sound field at a position within the listening area, due to a monochromatic plane wave played by loudspeaker is given by (1), where is the complex gain of the -th loudspeaker, is the wave number, and is the speed of sound. The complex pressure and velocity of the sound field are sums of individual loudspeaker components:, and, where is the air density, and is the unit vector co-directional with the acoustic axis of the -th loudspeaker (see Fig. 1(a)). The product of pressure and complex conjugate velocity is known as the complex intensity [23]. The real part of complex intensity, referred to as active intensity, is co-directional with the wave propagation [8].

Fig. 1. Considered multichannel (a) reproduction and (b) recording system, showing two elements of the loudspeaker and microphone arrays.

The complex intensity of the considered system due to the monochromatic plane wave is given by (2), where can be expressed as (3). Each component where and, with (4) and (5). Hence, each component, where, contributes a complex intensity field which fluctuates in amplitude across the space with frequency (6) and propagates in the direction, which is perpendicular to the median line between channels and. Components, on the other hand, contribute a spatially uniform field in the direction of the -th loudspeaker. The active intensity field can be expressed as (7)

where and, and where we used the identities and. The first term in (7) corresponds to a spatially-uniform active intensity field, while the second term corresponds to active intensity components that fluctuate in space in different directions and with frequencies which depend both on (see (6)) and on the temporal frequency of the sound wave. The active intensity field due to a plane wave incident from the direction, which we aim to reconstruct, is uniform in space, i.e. it has no spatial fluctuations. In order to reproduce an active intensity field largely uniform in space, the cross-channel components in the second term of (7) should ideally be eliminated or at least suppressed as much as possible. Note that cross-terms in (7) cannot be eliminated completely, because that would imply only one active channel for each incident direction, prohibiting reproduction of wave directions other than the acoustic axes of the loudspeakers. Two active channels are therefore a minimum needed for continuous panoramic reproduction.

Consider the optimization problem minimizing the energy of the cross terms subject to the constraint that the uniform component is in the correct direction (8). Solving (8) with numerical optimization methods (e.g. the downhill simplex method [24]) reveals that solutions have only two active channels, and in particular channels and such that. Based on this observation, we focus on systems such that each plane wave direction is rendered by only the pair of adjacent loudspeakers such that. The above analysis justifies physically the design of several surround methods which minimize cross-talk between non-adjacent channels [13], [17], [18]. Perceptual studies of Lee and Rumsey [25] support this design paradigm too.
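The split of the active intensity field into a spatially uniform part and fluctuating cross-channel parts can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes ideal plane-wave loudspeakers, an e^{jωt} time convention, and illustrative values for the air density, speed of sound, frequency and loudspeaker angles. The function name `active_intensity` is hypothetical.

```python
import cmath
import math

RHO = 1.2    # air density in kg/m^3 (assumed value)
C = 343.0    # speed of sound in m/s (assumed value)


def active_intensity(gains, speaker_angles, freq_hz, x, y):
    """Active intensity (Ix, Iy) at point (x, y) for far-field loudspeakers.

    Each loudspeaker at angle phi is modelled as a plane wave travelling
    towards the centre, i.e. along n = -(cos phi, sin phi).
    """
    k = 2.0 * math.pi * freq_hz / C          # wave number
    p = 0j                                   # total complex pressure
    vx = 0j                                  # total complex velocity, x part
    vy = 0j                                  # total complex velocity, y part
    for g, phi in zip(gains, speaker_angles):
        nx, ny = -math.cos(phi), -math.sin(phi)
        p_l = g * cmath.exp(-1j * k * (x * nx + y * ny))
        p += p_l
        vx += p_l * nx / (RHO * C)
        vy += p_l * ny / (RHO * C)
    # Active intensity: real part of pressure times conjugate velocity.
    return (p * vx.conjugate()).real, (p * vy.conjugate()).real


angles = [math.radians(72 * l) for l in range(5)]   # pentagonal layout
one_channel = [1.0, 0.0, 0.0, 0.0, 0.0]
two_channels = [1.0, 1.0, 0.0, 0.0, 0.0]

for name, gains in (("one channel", one_channel), ("two channels", two_channels)):
    centre = active_intensity(gains, angles, 1000.0, 0.0, 0.0)
    offset = active_intensity(gains, angles, 1000.0, 0.10, 0.0)
    print(name, "centre:", centre, "10 cm off-centre:", offset)
```

With a single active channel the two evaluation points give the same vector, which is the spatially uniform case. With two equal adjacent channels the vector changes with position, and at the centre it points along the propagation direction of a wave arriving from the midline between the two loudspeakers.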
Auralization of sound sources in multichannel systems also employs two channels only [26], [27] for its good stability and locatedness properties. Note, however, that the technology proposed in this paper is fundamentally different from multichannel systems which employ pairwise panning, as it aims to design the system so that it records sound field cues which enable rendering sound sources and all their reflections in a manner which makes them perceptually consistent with their original directions, as well as with the acoustics of the original venue.

Assume now that the signals played back by the considered loudspeaker system are recorded by an array comprised of microphones positioned on a circle of radius, also at angles as shown in Fig. 1(b). Assume further that each loudspeaker plays back the signal recorded by the corresponding microphone without mixing, as in Johnston's original PSR scheme [14]-[16]. In the considered recording-playback setup, the gain of the -th loudspeaker for a plane wave incident from direction is given by (9), where is the wave amplitude, and is the directivity pattern of the -th microphone, which will be assumed to be real. It is also assumed that the wave phase is zero in the center of the microphone array.

Fig. 2. Active intensity vector plots around the center of the loudspeaker array for five loudspeakers located in the far-field at angles, and source direction, i.e. the midline between two loudspeakers. Vector lengths are proportional to the amplitude of active intensity, while the grey level represents the angular error between the active intensity vectors and the source direction. The microphone array is coincident, and the directivity patterns are: omnidirectional, I-order cardioid, and III-order cardioid.
The cross-terms thus have the form, where, and (10). In the above it was concluded that a wave with an angle of incidence will be captured and reproduced by two channels only, and, while the contribution of other channels will be negligibly small. This requires minimizing outside of the sector. The impact of the suppression of the cross terms, achieved by making microphones progressively more selective, as well as the effect of the sound frequency on the uniformity of the reproduced sound field is illustrated in Fig. 2. Observe that even without a careful microphone directivity design, the active intensity field reproduced using the third-order cardioid is largely uniform.

When only two adjacent channels, and, are active for source angles, the active intensity field has the form (11). While the first term in (11) is uniform in space, the second one is a vector field in the direction of, i.e. the median angle between the two loudspeakers, the intensity of which fluctuates in space with frequency equal to. Recall that this fluctuating component is unavoidable if one aims to render directions other than the acoustic axes of the loudspeakers. It follows from (6) that the frequency of these fluctuations increases with the angular spacing between microphones and, and also increases linearly with the temporal frequency of the sound wave. For circularly symmetric arrays, which are studied later in the paper, the spatial frequency of the fluctuating term is. This quantifies the impact of the number of channels on the size of the area with approximately accurate sound field rendition. The radius of the microphone array has an effect only on term in (11), and hence a variation in causes only translation of the reproduced sound field in the direction of.

Having concluded that for physical accuracy of reconstructed monochromatic plane waves it is beneficial to minimize outside the region, in the next section we focus on the design of directivity patterns within their sectors. To this end, we turn to psychoacoustic criteria and design so that perceived directions of rendered sound sources agree with their actual directions. This is the focus of the next section. The reader interested more in general in acoustic scene analysis using circular microphone arrays can refer to [28].

III. MICROPHONE DIRECTIVITY DESIGN

The design should ensure first that the sound power in the center of the loudspeaker array is constant for all directions. If only two adjacent channels are active for each angle, the power constraint becomes for all,, [27], [29]. Under this constraint, directivity patterns are completely specified by corresponding inter-channel level ratios as (12). In the following we will conduct our analysis mainly in terms of, as it makes some derivations more intuitive. Directivity patterns will be further constrained to be positive for all, as reproduction of out-of-phase signals may cause undesirable inside-the-head locatedness effect [4, p. 136].
Within these constraints, in the following we describe two methods for specific design of the directivity patterns. The first method is applicable to coincident arrays and it aims at rendering active intensity fields co-directional with the corresponding sources in the center of the listening area. This method will be derived based on physical considerations, but we show also that it is closely related to the tangent panning law used in intensity stereophony, where it is motivated by psychoacoustic criteria. The second method is a generalization of the first approach to non-coincident arrays, and it aims at shaping the directivity patterns so that the system operates along corresponding time-intensity psychoacoustic curves.

A. Design of Coincident Arrays

It follows from equation (11) in the case of coincident microphone arrays, that the active intensity vector in the center of the loudspeaker array,, is co-directional with the active intensity vector of a plane wave incident from direction if and only if (13), which can be expressed in terms of the inter-channel level ratio as (14). This last expression then completely specifies corresponding directivity patterns according to (13). An interesting result is obtained by expressing (13) as (15). For the stereophonic case with loudspeakers at, (15) reduces to (16), which is equivalent to the well-known tangent panning law [30] used in intensity stereophony. This panning law was originally derived on the basis of perceptual considerations, and for low frequencies. On the other hand, the above result shows that the tangent panning law and its periphonic extension, vector base amplitude panning (VBAP) [27], are also based on physical aspects of the reproduced sound field. Microphone directivity patterns implementing (14) will be referred to in the following as intensity directivity (ID).
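The link between the co-directionality condition and the tangent panning law can be illustrated with a short numerical sketch. The expressions (13)-(16) did not survive in this transcription, so the code assumes a standard pairwise formulation: the gains of two adjacent loudspeakers are chosen so that the gain-weighted sum of the loudspeaker direction vectors points at the source, and are then normalised to unit power. The function name `pair_gains` and the test angles are illustrative assumptions.

```python
import math


def pair_gains(theta, phi_a, phi_b):
    """Unit-power gains (g_a, g_b) for loudspeakers at angles phi_a and phi_b.

    Assumes phi_a <= theta <= phi_b with phi_b - phi_a < pi (radians), so
    both gains are non-negative.
    """
    g_a = math.sin(phi_b - theta)
    g_b = math.sin(theta - phi_a)
    norm = math.hypot(g_a, g_b)      # enforce g_a^2 + g_b^2 = 1
    return g_a / norm, g_b / norm


# Stereophonic check: loudspeakers at -30 and +30 degrees, source at 10 degrees.
theta0 = math.radians(30.0)
theta = math.radians(10.0)
g_right, g_left = pair_gains(theta, -theta0, theta0)

# Two sides of the tangent panning law.
lhs = (g_left - g_right) / (g_left + g_right)
rhs = math.tan(theta) / math.tan(theta0)
print(lhs, rhs)

# Direction of the gain-weighted sum of loudspeaker direction vectors.
vx = g_right * math.cos(-theta0) + g_left * math.cos(theta0)
vy = g_right * math.sin(-theta0) + g_left * math.sin(theta0)
print(math.degrees(math.atan2(vy, vx)))
```

Under these assumptions the two sides of the tangent law agree and the weighted direction vector points at the source angle, which is consistent with the statement that the tangent law also follows from physical properties of the reproduced field.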
Recording based solely on inter-channel level differences has been applied in a number of fields, and is usually deemed to yield sharper phantom images [11]. On the other hand, methods employing both time and level differences have regularly proven to be among the most natural-sounding and realistic of spatial microphone techniques [11]. Another advantage of time-intensity systems, as will become evident in the following, is that they require microphones with lower spatial selectivity, which are less challenging to design and build [19], [20], [22].

B. Design Based on Time-Intensity Psychoacoustic Laws

When there is a small time delay between two channels, the perceived direction of the auditory image shifts towards the leading loudspeaker [4]. If the delay does not exceed the summing localization threshold [4], it has a significant influence on the perceived direction of the auditory event. Fig. 3 shows the stereophonic psychoacoustic curves derived by Franssen [31], that map combinations of inter-channel level difference, denoted as, and inter-channel time difference, denoted as, to the perceived directions of corresponding auditory images. The upper curve in this figure,, represents all the pairs for which the auditory event is localized at the right loudspeaker, while the lower curve,, represents the other limit at which the auditory event is localized at the left loudspeaker. Hence, in a system with maximum ICTD of, inter-channel time and level difference pairs which traverse a

curve connecting with will create auditory events which move gradually from the right to the left loudspeaker. It should be noted that Franssen's curves were obtained using the standard stereophonic arrangement, that is, two loudspeakers separated by a base angle of 60°. Williams reports similar psychoacoustic curves which can be also used in this context [13].

Fig. 3. Time-intensity psychoacoustic curves, adapted from Franssen [31]. The curves,, and represent the (ICTD, ICLD) pairs for which the auditory event is localized at the right loudspeaker, left loudspeaker, and on the midline between the loudspeakers, respectively. For a system with maximum ICTD ms, subjects localize the auditory event at the left and right loudspeaker for (ICTD, ICLD) pairs at points and, respectively. Two possible curves achieving gradual source displacement between these extremal points are plotted as solid and dashed lines.

In the considered system, loudspeaker gains are determined by the directivity functions of the corresponding microphones:, where is the direction of the sound source which creates an ICTD. A simple geometric argument (see Fig. 1(b)) shows that the time delay between microphones and, for a sound wave incident from direction, is (17). Maximal ICTDs are obtained for sources in the directions of the two microphones, i.e. for delays and. Directivity patterns which provide corresponding ICLDs, as given by curves and, must satisfy (18). In the symmetric case these constraints become (19).

The constraints in (19) only specify end-points of the directivity pattern. One possible way of achieving gradual source displacement between these end points is to modify the tangent law in (14) according to (20), where has the role of reducing intensity differences to account for the presence of time differences. A particular value of is obtained by equating with, which yields (21). The directivity pattern defined in (20) will be referred to as time-intensity directivity (TID).

It is instructive to consider in this context also the particular case of coincident arrays,, which corresponds to. One would expect that in this case, the directivity pattern in (20) would become equal to the tangent law in (14), that is also perceptually motivated. However, in the general formula (20), parameter is non-zero, and its particular value is given by (21) with, while in the case of the tangent law, which is also obtained from (21) with. This dichotomy is reconciled by noting that while Franssen's curves give minimal level difference needed to create an auditory event in the direction of one of the speakers, the tangent law uses the maximal level difference that achieves the same effect. Gradual displacement of the auditory event between the two speakers can be achieved by employing a monotonic function with either the minimal or maximal level differences needed for source auralization in loudspeaker directions. The difference lies in that the slope of at its extreme points and would need to be much higher in the case of the maximal level differences compared to the case of minimal level differences.

Note that while the function in (20) provides a unifying framework for intensity and time-intensity systems, there are no findings which would make it psychoacoustically more legitimate than many other monotonic curves connecting the corresponding end time-intensity pairs. The two end points in (19) can be connected also by straight lines, and that would give (22). Owing to its shape, the directivity pattern in (22) will be referred to as time-intensity linear directivity (TILD). The two types of time-intensity curves defined by (20) and (22) are illustrated in Fig. 3 for a system with a maximum ICTD of ms.

IV.
DESIGN EXAMPLES

Consider now circularly symmetric systems where, and, and such that all directivity functions are rotated versions of a single prototype, as would arise in systems where there is no preferred seating orientation. Further, let us focus on systems with channels, as that seems to be the minimum needed for satisfactory rendition of the auditory perspective [14]-[16], [32], [33] and the envelopment experience [34], and also for direct comparison with the original PSR technology and second-order ambisonics.

The methodology of Section III-B can be used to design microphone directivity patterns for any array radius, as long as the ICTD is within summing localization limits. What the optimal radius for the considered system is, if there is one, is an open question. The analysis in Section III shows that varying array radius causes only translation of the reconstructed active intensity field (see (11)). However, Johnston et al. propose 15.5 cm radius and support this design choice by anthropomorphic arguments [14]-[16]. Our preliminary investigation of the naturalness of combinations of inter-aural time and level differences that a listener would experience in this same setup, based on Gaik's psychoacoustic studies [35], also supports the radius proposed by Johnston et al. [36]. Hence, in this study we focus on microphone arrays with a radius of 15.5 cm. This array radius corresponds to a maximum ICTD of 0.31 ms. Note that Williams' and Franssen's curves intersect at this ICTD [13], [31], and that therefore the resulting directivity patterns would be identical regardless of which psychoacoustic curve is used.

Solutions for will be sought in the form, which is a general expression of an -order axisymmetric directivity pattern [20] that can be realized through various microphone beamforming methods [19]-[21]. Coefficients which yield a desired directivity pattern, as given in (14), (20) or (22), or an approximation of it, can be found using numerical optimization.
Corresponding optimization criteria include: (i) constrain to match the desired directivity pattern at a number of equidistant angles in the region, and to have zeros at a number of equidistant angles within; (ii) constrain to match the desired directivity pattern at a number of equidistant angles in the region, and constrain to be below a given threshold for; (iii) jointly minimize the -distance from the desired pattern and the -norm in the rejection region: for some small and.

Fig. 4(a) shows a fifth-order approximation of the ID directivity specified by (14), obtained using method (i). A sixth-order approximation of the TILD directivity in (22), obtained using method (ii) for cm, is shown in Fig. 4(b). Although high-order microphones are or may soon become feasible owing to recent advances in the field [19], [20], [22], it is of practical interest to consider low-order approximations of desired directivity patterns. The time-intensity TID directivity (20) for cm can be well approximated by the second-order pattern with coefficients. This approximation is obtained using optimization method (iii) with and, and is shown in the same plot with the sixth-order TILD pattern. The above optimization criteria do not explicitly restrict to be positive for all; still, the amplitudes of negative lobes are negligible.

Fig. 4. Polar plots of proposed directivity patterns as described in Sections III and IV, and equivalent patterns of matching and in-phase higher-order Ambisonics (HOA). Patterns shown in (a) and (b) are designed for coincident and non-coincident arrays, respectively. Parameter denotes pattern order, with indicating the exact desired design.

Fig. 4 shows also other directivity patterns used in the subjective listening tests reported in the paper. These include the original Johnston's PSR directivity [15], the specifications of which can be satisfied exactly by the second-order pattern with coefficients.
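As a small illustration of how a candidate pattern can be evaluated and checked for negative lobes, the sketch below assumes the axisymmetric pattern is written as a power series in cos(θ) with real coefficients. The coefficient values in this transcription are lost, so the example uses a third-order cardioid (the "III-order cardioid" named in the Fig. 2 caption), not any of the designs reported above. The function name `pattern` and the one-degree grid are assumptions.

```python
import math


def pattern(coeffs, theta):
    """Evaluate sum_i coeffs[i] * cos(theta)**i for a pattern steered to 0 rad."""
    x = math.cos(theta)
    return sum(a * x ** i for i, a in enumerate(coeffs))


# Third-order cardioid (0.5 + 0.5*cos(theta))**3 expanded in powers of cos(theta).
cardioid3 = [0.125, 0.375, 0.375, 0.125]

# Sample the pattern on a one-degree grid and inspect its extremes.
grid = [math.radians(d) for d in range(0, 360)]
values = [pattern(cardioid3, t) for t in grid]
print("on-axis gain:", values[0])
print("smallest value on the grid:", min(values))
```

A smallest value at or above zero indicates that the sampled pattern has no out-of-phase lobes. The same check could be applied to coefficients returned by any of the optimization criteria listed above.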
Horizontal higher order ambisonics (HOA) [8] is also included in the subjective tests. The pentagonal loudspeaker array is optimal for reconstruction of first and second-order circular harmonics [8]. This decoding criterion is commonly referred to as mode matching and is equivalent to a coincident array of second-order microphones, with directivity patterns specified by coefficients. In-phase decoding is an alternative ambisonics implementation that provides a larger sweet spot at the expense of poorer localization accuracy [7]. In the second-order case, this trade-off is achieved by the directivity pattern with coefficients.

It is instructive to briefly consider these directivity patterns in the context of reproduction of active intensity fields corresponding to monochromatic plane waves, as discussed in Section II. Fig. 5 shows active intensity fields rendered by the considered system with five channels, and recorded using microphone arrays with PSR, TID, ID and HOA-matching directivity patterns. One can observe the benefit of higher order patterns, in particular the exact specifications (infinite order) of ID and TID patterns in terms of rendering spatially more uniform active intensity fields. It is interesting also to observe that among the second-order patterns, the TID directivity seems to yield more uniform active intensity fields than the original PSR and the HOA-matching patterns. In the next two sections, the performance of the above design examples is evaluated by means of formal listening tests. Section V is focused on the angular accuracy of rendered auditory events, while Section VI studies their perceived locatedness.

Fig. 5. Active intensity vector plots around the center of a loudspeaker array as described in Fig. 2. The different multichannel systems aim at reproducing a monochromatic plane wave of frequency, incident from direction.

V. PERCEPTUAL EVALUATION: LOCALIZATION

The tests reported in this section aim to assess the error between the actual and perceived directions of sound sources as rendered by different systems.

1) Experimental Setup: The subjects were seated in an acoustically isolated sound booth ms of size m m m. The test setup, illustrated in Fig. 6, had two components. The first consisted of five MACKIE HR824 active monitor loudspeakers equally spaced on a circle of radius 2 m. This system was used to play back stimuli synthesized for the tested multichannel systems. The second component consisted of eight Genelec 6010 loudspeakers positioned between two adjacent channels of the five-channel system with 8° separation. These loudspeakers were used as acoustic pointers. All loudspeakers were calibrated to a nominal level of dBA [10], [37]. This nominal level is known to be somewhat uncomfortable for subjects [37, p. 287], and was therefore reduced by 3 dB. All the loudspeakers were positioned at the ear level facing the subject.
The subjects' responses were recorded using a specially designed graphical user interface, which was displayed on a monitor placed in front of them. The subjects were instructed to face the monitor, but their heads were not physically restrained.

Fig. 6. The test setup for the localization and locatedness tests. The large white loudspeakers constitute the five-channel reproduction system. The square markers indicate the positions of the 8 acoustic pointers used in the localization test. The round hollow markers represent the 3 acoustic pointers used in the locatedness test and positioned in directions, which are playing the unprocessed signal as described in VI-A-2. Three considered seating orientations in the center are denoted by arrows,, and. The off-center position is located at m.

2) Methodology and Stimuli: Gains and delays of each microphone, corresponding to simulated sources in the far field, were calculated for eight directions corresponding to the actual directions of the acoustic pointers. The subjects' task was to listen to the simulated free-field recording over the five-channel system and respond by listening to and selecting the acoustic pointer closest to the perceived direction of the auditory image. This methodology is equivalent to the source identification method [38]. Three seating orientations in the center,,, and, and one position 30 cm off-center, as shown in Fig. 6, were used to test localization performance. Windowed white Gaussian noise of 0.1 s duration was used as a stimulus, and it was generated anew at each trial in order to eliminate bias due to a fixed stimulus spectrum. The noise stimulus was selected for its wide frequency content, providing strong ILD and ITD cues [39]. A Tukey window with a 30% taper ratio was used to reduce the effect of the loudspeakers' transient response. The sampling frequency was 44.1 kHz.

A. Preliminary Test

1) Subjects: A preliminary experiment was carried out with six subjects (5 male and 1 female) with no reported hearing impairments. Three of the subjects were the authors of this paper.

2) Multichannel Systems Considered: In the preliminary tests we studied microphone arrays of radius 15.5 cm with the PSR, TILD, and ID directivity patterns. HOA-in-phase was also included in this test. It was the only methodology that did not make use of ICTDs.
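The stimulus described under Methodology and Stimuli (0.1 s of white Gaussian noise, Tukey window with 30% taper ratio, 44.1 kHz sampling, regenerated at each trial) can be sketched as follows. This is only an illustration under stated assumptions: the unit noise variance, the function names, and the use of the standard tapered-cosine definition of the Tukey window are choices made here and are not taken from the paper.

```python
import math
import random

FS = 44100          # sampling frequency in Hz (stated in the text)
DURATION_S = 0.1    # stimulus duration in seconds (stated in the text)
TAPER_RATIO = 0.3   # Tukey taper ratio, 30% (stated in the text)


def tukey_window(n, alpha):
    """Return an n-point Tukey (tapered-cosine) window with taper ratio alpha."""
    if n == 1 or alpha <= 0:
        return [1.0] * n
    edge = alpha * (n - 1) / 2.0  # length of each cosine taper, in samples
    window = []
    for i in range(n):
        if i < edge:
            # rising half-cosine taper
            w = 0.5 * (1.0 - math.cos(math.pi * i / edge))
        elif i > (n - 1) - edge:
            # falling half-cosine taper, mirror image of the rising one
            w = 0.5 * (1.0 - math.cos(math.pi * (n - 1 - i) / edge))
        else:
            w = 1.0  # flat section
        window.append(w)
    return window


def make_stimulus(seed=None):
    """Generate one windowed white Gaussian noise burst (unit variance assumed)."""
    rng = random.Random(seed)
    n = round(FS * DURATION_S)
    window = tukey_window(n, TAPER_RATIO)
    return [w * rng.gauss(0.0, 1.0) for w in window]


# A fresh realisation per trial: call make_stimulus() without a fixed seed.
stimulus = make_stimulus()
```

Drawing a new realisation on every call mirrors the per-trial regeneration mentioned in the text; a seed is only useful when a particular burst has to be reproduced.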

Note that the ID directivity is derived assuming a coincident microphone array, and is therefore overly selective for the near-coincident 15.5 cm array. We used it in this preliminary test with the 15.5 cm array to assess the merits of employing what, according to psychoacoustics, would be correct time and level differences, as well as the sensitivity of the system to using their precise values. Considering one additional system, ID, which deviates from the proposed TILD directivity in the opposite direction to the PSR design, enables more credible conclusions about the investigated issues.

Fig. 7. Results of the preliminary localization test: mean response angle as a function of stimulus angle, averaged across all subjects, for the front-seating direction. Stimulus and response angles are relative to the listener's facing direction. The error bars show the 95% confidence intervals.

3) Results: Altogether, 90 responses were collected for each investigated system and for each stimulus direction. In Fig. 7 the mean responses are shown as a function of the stimulus direction. It can be observed that the localization accuracy in the frontal direction obtained with the TILD directivity is not only the highest of all tested systems on average, but is also the highest for each stimulus direction. It can be further observed that in the case of the system with the ID directivity pattern, the auditory images are not rendered uniformly across the span, but tend to concentrate at the loudspeaker closer to the stimulus direction. The opposite effect can be observed with the HOA-in-phase and PSR systems, i.e., auditory images concentrate around the middle of the range.

4) Discussion: These results can be explained based on the considerations in Section III-B. It follows from (17) that for the five-channel system with cm, the ICTD of a stimulus in the direction of one of the speakers is ms.
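As a rough illustration of how an inter-channel time difference arises from array geometry, the sketch below computes the far-field arrival-time difference between two microphones on a circle. It is a minimal sketch under stated assumptions and not a reproduction of (17), which does not appear in this section: a plane wave, microphones located on the circle at their look directions, and a speed of sound of 343 m/s are all assumed here, and the function name is hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value (not given in this section)


def ictd_seconds(radius_m, mic_a_deg, mic_b_deg, source_deg, c=SPEED_OF_SOUND):
    """Far-field arrival-time difference between microphones A and B.

    A plane wave from direction source_deg reaches a microphone at angle phi
    on a circle of radius r earlier than the array center by
    (r / c) * cos(source - phi). The returned value is positive when the
    wave reaches microphone A before microphone B.
    """
    a = math.radians(source_deg - mic_a_deg)
    b = math.radians(source_deg - mic_b_deg)
    return radius_m / c * (math.cos(a) - math.cos(b))


# Adjacent microphones of a regular five-channel array are 72 degrees apart.
# Source in the direction of microphone A, array radius 15.5 cm:
tau = ictd_seconds(0.155, 0.0, 72.0, 0.0)
print(f"{tau * 1e3:.3f} ms")

# Source on the midline between the two microphones: no time difference.
tau_mid = ictd_seconds(0.155, 0.0, 72.0, 36.0)
```

Under these assumptions the delay for a source in the direction of one microphone comes out at roughly 0.31 ms for the 15.5 cm array. That figure is the output of this sketch and should not be read as the value quoted in the paper.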
Franssen's curves in Fig. 3 indicate that the level difference between the two speakers needed in combination with this time difference to create an auditory event in the direction of one of the speakers is 9.60 dB, and that is what the TILD design aims to achieve. The corresponding level difference of the PSR pattern is 3 dB, which is insufficient to achieve the desired source displacement from the center. On the other hand, the level difference of the ID design for the same direction is dB, which is in excess of what is needed for the desired displacement, and is a consequence of the fact that the ID pattern is designed to provide the desired source auralization without inter-channel time differences. Notice from Fig. 4 that HOA-in-phase provides higher level differences than the PSR pattern. Even so, due to the absence of time differences, it renders auditory images closer to the center. Finally, from Fig. 3 it can be observed that with an ICLD of 3 dB, the ICTD needed to create an auditory image in the direction of one of the loudspeakers is approximately 2 ms, which corresponds to m. This explains the observation made in [40] that PSR arrays with larger radii resulted in higher subjective ratings. The same observation was also shared by J. D. Johnston in a personal communication.

The results shown in Fig. 7 demonstrate that fine tuning of time and level differences does matter for rendering an accurate auditory perspective, but the graceful degradation of the performance as the directivity departs from the TILD towards the ID and PSR patterns suggests that the technology is not very sensitive to deviations of its parameters from the ideal ones. In the next subsection we present the results of our main listening test, which includes HOA and the coincident version of the ID array, both of which achieve high accuracy for a listener in the center of the loudspeaker array.

B. Main Test

1) Subjects: Sixteen naïve [41] subjects (13 male and 3 female) with no reported hearing impairments participated in the main test. All subjects but one were students aged years old.

2) Multichannel Systems Considered: Four multichannel methodologies were included in this localization test. One of them is the second-order TID approximation presented in Section IV. Two versions of second-order ambisonics, i.e., HOA-in-phase and a state-of-the-art HOA implementation, were also considered. Although a HOA standard is yet to be defined, the literature suggests that there is a general understanding of what its state-of-the-art implementation should involve [8], [9], [42], [43]. First, mode-matching decoding is applied at low frequencies [8]. At high frequencies, physical reconstruction becomes infeasible, and different criteria must therefore be employed. Poletti suggests weighting circular harmonic components using a Kaiser window [8]. Daniel et al. propose alternative formulas which maximize the so-called energy vector [9]. This latter approach is preferred here for its compliance with the first-order version of Ambisonics originally proposed by Gerzon [44]. According to [9], the cross-fading frequency between mode-matching and maximum energy decoding is set to 1.2 kHz. The cross-fading filters are designed as phase-matched second-order IIR filters as described in [42]. To compensate for the finite distance of the loudspeakers, the near-field correction described in [43] is also implemented.

The above three multichannel methodologies all use second-order directivity patterns. The ID directivity is too selective to be well approximated by low-order patterns. Hence, it is implemented in its exact formulation (14), and serves as an additional benchmark illustrating the performance of intensity-based systems, i.e., coincident arrays, and the equivalent tangent panning law.
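To make the reference to the tangent panning law concrete, the sketch below evaluates the textbook form of the law, tan(θ)/tan(θ0) = (gL − gR)/(gL + gR), for a loudspeaker pair 72° apart (θ0 = 36°). This is an assumption-laden illustration: equation (14) is not reproduced in this section, so whether it matches this form term by term cannot be checked here, and the power normalization, sign convention, and function names are choices made for the example.

```python
import math


def tangent_law_gains(source_deg, half_base_deg=36.0):
    """Power-normalised gains (g_left, g_right) from the textbook tangent law.

    source_deg is measured from the midline of the loudspeaker pair and is
    positive towards the left loudspeaker (a convention assumed here).
    """
    if abs(source_deg) > half_base_deg:
        raise ValueError("source must lie between the two loudspeakers")
    ratio = (math.tan(math.radians(source_deg))
             / math.tan(math.radians(half_base_deg)))
    # (gL - gR) / (gL + gR) = ratio  =>  gL ~ 1 + ratio, gR ~ 1 - ratio
    g_left, g_right = 1.0 + ratio, 1.0 - ratio
    norm = math.hypot(g_left, g_right)  # enforce gL^2 + gR^2 = 1
    return g_left / norm, g_right / norm


def level_difference_db(g_left, g_right):
    """Inter-channel level difference in dB (left relative to right)."""
    if g_right == 0.0:
        return math.inf
    if g_left == 0.0:
        return -math.inf
    return 20.0 * math.log10(g_left / g_right)


for angle in (0.0, 12.0, 24.0, 36.0):
    gl, gr = tangent_law_gains(angle)
    print(angle, round(gl, 3), round(gr, 3), level_difference_db(gl, gr))
```

In this form the level difference grows without bound as the source approaches a loudspeaker direction, because only one channel remains active there.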
Note, however, that high-order microphones are more challenging to design and build [19], [20], [22], and the results should be interpreted accordingly.

3) Methodology: The test methodology used in the preliminary test was amended in two ways to elicit more information from subjects. Firstly, subjects could also report whether an event was perceived to the left or to the right of the pointer array. Specifically, they were instructed to choose the leftward button when the event was beyond the midline between the last pointer on their left and speaker (see Fig. 6), and the angle associated with these responses was the direction of speaker. The case of auditory events perceived to the right of the pointer array was treated analogously. Secondly, subjects were asked to make open comments at the end of each test block.

Fig. 8. Results of the main localization test: mean responses with 95% confidence intervals as a function of stimulus angle, for (a) the front-seating direction, (b) the side-seating direction, and (c) the back-seating direction. The bar plots indicate the percentage of times subjects chose the leftward and rightward buttons, as described in Section V-B-3.

4) Results and Discussion: Altogether, 64 responses were collected for each investigated system and for each stimulus direction. The mean angular responses are shown in Fig. 8(a), along with the percentage of leftward and rightward decisions. HOA-in-phase has a high localization error. This is consistent with the findings of the preliminary test. All other systems give very low errors, with that of HOA being the smallest. Considering that the localization blur of the auditory system is in the 1° to 4° range [4], the differences between TID, ID and HOA are not perceptually significant in the frontal seating position. All subjects reported that whenever they used the extreme buttons the auditory event was perceived at the directions of speakers or, or in their close proximity. When asked for open comments, one reported some front-back confusions [4], while two reported that it was harder to judge events in directions close to the midline between and, which was also observed in [39]. Observe that the TID system exhibited a slight bias toward the midline.
This is likely due to the insufficiency of the employed level differences, since in the absence of psychoacoustic laws governing source auralization over the 72° range, we designed all considered time-intensity systems using Franssen's curves, which were obtained in measurements using the 60° set-up. On the other hand, the ID system tends to pull auditory events closer to the loudspeakers (the so-called detent effect [29], [45]), and yields the highest number of extreme decisions. This effect is in agreement with the fact that the tangent law in (14) uses extreme level differences, and it can probably be corrected either by using an intensity law which exhibits a faster change at end points and, or by amending it according to (20) with selected so that smaller or minimal level differences are used at the end points.

In the side and back seating directions, only TID and HOA were tested in order to maintain a moderate test duration. TID and HOA were chosen because they yield low angular error in the frontal direction with second-order designs. Results for the side seating direction show higher error and variability of responses, reflecting the lower localization accuracy of the auditory system in this region [4]. Fig. 8(b) shows that responses are biased towards the loudspeaker. Most subjects reported using the leftward button when the auditory event was located at or in its close vicinity, while two subjects perceived some events somewhere between and. In the seating position the localization accuracy of both HOA and TID improves as compared with, as shown in Fig. 8(c). Three subjects reported some front/back confusions. HOA achieves lower error than TID for both and orientations, the difference being comparable to the localization blur in the corresponding sectors [4].
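The mean responses and 95% confidence intervals reported in Fig. 8 are the kind of summary that can be computed as in the sketch below. This is a generic illustration under stated assumptions, not the authors' analysis: the normal-approximation factor 1.96 is assumed (the text does not say how the intervals were obtained), and the response angles in the example are made-up values, not data from the test.

```python
import math
import statistics


def mean_with_ci95(responses_deg):
    """Return (mean, lower, upper) for a list of response angles in degrees.

    Uses the normal approximation (1.96 standard errors) for the 95%
    confidence interval; this is an assumption, reasonable for sample
    sizes such as the 64 responses per condition mentioned in the text.
    """
    n = len(responses_deg)
    mean = statistics.mean(responses_deg)
    sem = statistics.stdev(responses_deg) / math.sqrt(n)  # standard error
    half_width = 1.96 * sem
    return mean, mean - half_width, mean + half_width


# Hypothetical responses (degrees) for one system and one stimulus direction.
example_responses = [4.0, 12.0, 4.0, -4.0, 4.0, 12.0, 4.0, 4.0]
mean, lower, upper = mean_with_ci95(example_responses)
print(f"mean = {mean:.1f}, 95% CI = [{lower:.1f}, {upper:.1f}]")
```

Responses given with the leftward or rightward buttons would first have to be mapped to the loudspeaker direction associated with them, as described in the Methodology paragraph, before entering such an average.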
The TID directivity is designed based on frontal time-intensity trading curves; hence, as could be expected, the system performs well for frontal sources, while being inferior to HOA for sound sources coming from the sides and the back. In applications with no preferred orientation, localization accuracy in the frontal direction will be more important, given also that the spatial resolution of the auditory system is highest in the frontal sector. If, on the other hand, the system is designed for reproduction in a situation where the listener's orientation is fixed and known, the localization accuracy of time-intensity systems on the sides and in the back can be improved by designing the directivity patterns of side and back microphones according to the corresponding time-intensity curves [39].

Fig. 9. Results of the main localization test: mean responses with 95% confidence intervals as a function of stimulus angle, for the position.

Fig. 9 shows results obtained for the off-center seating position. A strong shift towards the direction of the closest active loudspeaker is present, which could be expected [46]. Most of the responses for stimulus angles between and were at or beyond the last pointer. HOA-in-phase is more accurate in this region, possibly due to its lower inter-channel level differences. As the stimulus angle shifts to the right, the other methodologies outperform HOA-in-phase, particularly the ID system. Subjects made a significant number of leftward decisions for HOA, even for stimulus directions close to the midline. Most subjects reported using the leftward button when the auditory event was perceived in directions between the loudspeaker and the midline between and. Four subjects reported that auditory events perceived in directions near the midline between and were blurry or hard to localize. This is in agreement with the results of the locatedness test presented in Section VI.

VI. PERCEPTUAL EVALUATION: LOCATEDNESS

An important perceptual attribute of multichannel reproduction is the locatedness of phantom sources, defined as the degree of certainty about the location of auditory events [39]. In this section we present the results of a listening test which evaluates the considered systems in terms of the locatedness of the auditory events they produce.

1) Subjects: Twelve of the sixteen participants of the test described in Section V-B also took the locatedness test around a month after the first test. An additional seven naïve [41] subjects were recruited for this test. The subject group thus consisted of nineteen listeners (17 male and 2 female).
2) Methodology and Stimuli: Subjects responded to the questions "How well can you assign a particular direction to the perceived source?" [47] and "How certain are you of the direction of the source?" [39] with a score on a continuous scale from 0 to 100. The scale was divided into five equal intervals labeled "I am certain," "I have a slight doubt," "I have a doubt," "I am really not sure," and "I have no idea," as suggested in [39]. Subjects were instructed to ignore any other audible attribute, such as pitch, tonal coloration and, more importantly, the specific perceived source direction. The studied surround systems were compared directly, in a manner similar to MUSHRA tests [48]. Two additional sounds with known characteristics were also included among the systems to grade. The first was the unprocessed signal played by a MACKIE HR824 loudspeaker positioned in the direction of the emulated plane wave, which will be referred to as "real." The second, which will be referred to as the "diffuse anchor," was an approximation of a diffuse sound field obtained by playing over all 5 channels the unprocessed signal convolved with uncorrelated 10 ms long sequences [49]. The subjects did not know which system they were grading at any time, and the presentation order was randomized at each iteration. The same experiment was repeated under different conditions by varying the seating position, plane wave incident direction, and sound excerpt. The central seating position and off-center position were used. The surround systems were simulated to reproduce virtual sources in the far field at three different directions, as depicted in Fig. 6. In the central position, only the frontal and right directions were investigated. Three anechoic excerpts from the Bang & Olufsen Music for Archimedes CD were used as representatives of common program material: female speech, African bongo and cello [39]. The excerpts were faded off after around 5 s.
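The construction of the diffuse anchor (the unprocessed signal convolved with uncorrelated 10 ms sequences, one per channel) can be sketched as follows. This is a minimal sketch under stated assumptions: the text does not specify how the sequences were generated, so independent white Gaussian noise normalized to unit energy is assumed here, and all function names are hypothetical. The 44.1 kHz rate is the one stated for the localization test and is assumed to apply here as well.

```python
import math
import random

FS = 44100              # Hz, assumed to match the localization test
SEQ_DURATION_S = 0.010  # 10 ms sequences (stated in the text)
NUM_CHANNELS = 5        # five-channel reproduction system (stated in the text)


def noise_sequence(length, rng):
    """Unit-energy white Gaussian noise sequence (an assumed choice)."""
    seq = [rng.gauss(0.0, 1.0) for _ in range(length)]
    norm = math.sqrt(sum(x * x for x in seq))
    return [x / norm for x in seq]


def convolve(signal, kernel):
    """Direct-form linear convolution of two sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def diffuse_anchor(signal, seed=None):
    """Return one loudspeaker feed per channel, each decorrelated differently."""
    rng = random.Random(seed)
    length = round(FS * SEQ_DURATION_S)
    return [convolve(signal, noise_sequence(length, rng))
            for _ in range(NUM_CHANNELS)]


# Usage with a short hypothetical test signal (a single click):
feeds = diffuse_anchor([1.0, 0.0, 0.0], seed=1)
```

The direct-form convolution keeps the sketch free of dependencies; for excerpts of several seconds an FFT-based convolution from a numerical library would be the practical choice.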
To avoid bias due to persistence effects [4], listeners could not switch between presentations before the whole excerpt was played. Each subject ran a training session before the actual test, which allowed them to listen to all possible sound excerpts and to familiarize themselves with the grading system.

3) Results and Discussion: A pilot experiment showed that a higher uncertainty was associated with responses in the off-center position in the case of right and frontal incident directions. Therefore, a higher number of repetitions was used for these conditions. Altogether, each system was graded 36 times for the right direction and 30 times for the frontal direction in the off-center position. In the remaining three cases, i.e., the left direction in the off-center position and the two directions in the central seating position, each system was graded 27 times in total.

Fig. 10. Subjective assessment of locatedness, showing mean response scores and 95% confidence intervals of experiments for seating positions or (refer to Fig. 6) and three incident directions.

Results of the locatedness test are shown in Fig. 10. As expected, the real and diffuse anchors have under all conditions the highest and lowest mean scores, respectively. In the central position all surround methodologies have high mean scores in both the frontal and right directions. In the off-center position, the following observations can be made. For subjects localize most auditory events at or near (see Section V), and locatedness scores are still high. Paired Student t-tests [50] reveal that HOA-in-phase has a significantly lower mean locatedness score than both TID and ID. For the middle and right directions in the off-center position, HOA has significantly lower mean scores than TID ( and, respectively) and the other two systems at the 5% significance level. Interestingly, for, TID has a significantly higher mean score than ID ( ).

Note that high locatedness scores are not necessarily preferable to low ones. In fact, depending on the intended application, a sound engineer could favor systems which blur the acoustic perspective [39]. However, the variability of the mean scores of HOA indicates that the acoustic perspective changes significantly when moving 30 cm from the center of the loudspeaker array. For instance, when reproducing a plane wave incident from the right direction, the mean locatedness score drops from 83.7 to 51.6 (unpaired t-test is significant with ). A similar difference in mean locatedness scores is observed for a listener in the off-center position between left and middle incident directions, and right and middle incident directions. Results in Fig. 10 show that TID is the system with the most uniform mean scores across incidence directions and seating positions.

Note that the results of our tests disagree with general observations made in earlier studies by Lipshitz and Linkwitz [51], [52], which recommend against non-coincident methods (especially those with very large inter-microphone distances) because they are considered to produce a higher phasiness and imaging blur, or, equivalently, inferior locatedness. This disagreement could be a result of differences between the time-intensity systems considered in this work and those in the previous studies. Reconciling these differences requires further research.

VII. CONCLUSIONS

We presented an analysis of circular microphone arrays in the context of panoramic audio recording and reproduction.
The analysis was first based on the concept of active intensity and was focused on the performance of arrays in recording and reproduction of monochromatic plane waves. The analysis showed that cross-channel terms have a detrimental effect on the direction and the uniformity of reproduced sound fields, leading to the conclusion that using not more than two active channels for rendition of plane waves reduces spatial fluctuations and error of the reproduced sound fields. This analysis was then refined to include psychoacoustic phenomena, leading to a methodology for the design of circular microphone arrays for panoramic recording of acoustic events. As a result, an intensity-based design named ID was first proposed. This approach was then generalized to a broader class of arrays named TID that capture the inter-channel time and level differences needed for accurate directional reproduction of all sources and reflections of a recorded event. Design examples were given for the case of pentagonal microphone and loudspeaker arrays. Subjective listening tests were carried out in order to evaluate the performance of the proposed designs in comparison with several other multichannel systems. These systems included the PSR array proposed in [15], and two second-order ambisonics versions, HOA and HOA-in-phase. Results of a localization experiment showed that angular error in front of the listener is high for PSR and HOA-in-phase, while HOA, ID and TID all achieve high accuracy. HOA is more accurate than TID for sources at the side and back of the listener, the difference being comparable to the localization blurs in these directions [4]. The localization accuracy of the ID and TID methods degrades more gracefully in an off-center position than that of HOA, with the ID system degrading the least. However, as opposed to the other considered systems, ID cannot be approximated effectively with second-order patterns, and therefore requires more sophisticated microphones.
Results of a locatedness experiment have shown that HOA degrades the most at a position 30 cm off-center, while TID yields the most consistent responses across seating positions and source incidence angles. The present study did not address the issue of optimal array radius, but rather proposed a methodology for shaping microphone directivity, given the array radius and the number of channels, to render a satisfactory auditory perspective. This leaves the array radius as a free parameter that can be used to optimize other perceptual criteria or possibly to meet some practical implementation requirements. For play-back systems with an a priori known, fixed listener orientation, the localization performance of time-intensity systems on the sides and to the back can probably be improved by making use of psychoacoustic curves corresponding to those directions. However, such further optimization of the time-intensity approach, along with optimization of the array radius, is a matter of future research.

ACKNOWLEDGMENT

The authors would like to thank Yaqub Alwan, James Hall, James Johnston, Francis Rumsey, and Peter Sollich for insightful discussions on the topic and all the volunteers for participating in the listening tests.

REFERENCES

[1] H. Hacıhabiboğlu and Z. Cvetković, "Panoramic recording and reproduction of multichannel audio using a circular microphone array," in Proc. IEEE Workshop Appl. Signal Process. Audio and Acoust. (WASPAA'09), New Paltz, NY, USA, Oct. 2009, pp
[2] E. De Sena, H. Hacıhabiboğlu, and Z. Cvetković, "Perceptual evaluation of a circularly symmetric microphone array for panoramic recording of audio," in Proc. 2nd Int. Symp. Ambison., Spher. Acoust., Paris, France,
[3] H. Hacıhabiboğlu, E. De Sena, and Z. Cvetković, "Design of a circular microphone array for panoramic audio recording and reproduction: Microphone directivity," in AES 128th Conv., London, U.K., May 2010, Preprint #8063.
[4] J. Blauert, Spatial Hearing: The Psychophysics of Human Sound Localization. Cambridge, MA, USA: MIT Press,
[5] W. G. Gardner, 3-D Audio Using Loudspeakers. Norwell, MA, USA: Kluwer,
[6] M. M. Boone, U. Horbach, and W. P. J. Bruijn, "Spatial sound-field reproduction by wave-field synthesis," J. Audio Eng. Soc., vol. 43, no. 12, pp , Dec
[7] J. Daniel, "Représentation de champs acoustiques, application à la transmission et à la reproduction de scènes sonores complexes dans un contexte multimédia," Ph.D. dissertation, Univ. of Paris VI, Paris, France,
[8] M. A. Poletti, "A unified theory of horizontal holographic sound systems," J. Audio Eng. Soc., vol. 48, no. 12, pp , Dec
[9] J. Daniel, J. Rault, and J. Polack, "Ambisonics encoding of other audio formats for multiple listening conditions," in AES 105th Conv., San Francisco, CA, USA, Sep. 1998, Preprint #4795.
[10] ITU-R, Rec. BS
[11] F. Rumsey, Spatial Audio. Oxford, U.K.: Focal Press,
[12] J. Eargle, The Microphone Book. Oxford, U.K.: Focal Press,
[13] M. Williams and G. Le Du, "Microphone array analysis for multichannel sound recording," in AES 107th Conv., New York, NY, USA, Sep. 1999, Preprint #4997.
[14] J. D. Johnston and Y. H. Lam, "Perceptual soundfield reconstruction," in AES 109th Conv., Los Angeles, CA, USA, Sep. 2000, Preprint #2399.
[15] J. D. Johnston and E. R. Wagner, "Microphone array for preserving soundfield perceptual cues," U.S. patent, US 6,845,163 B1, Jan
[16] G. L. Rosen and J. D. Johnston, "Automatic speaker directivity control for sound field," in Proc. AES 19th Int. Conf., Schloss Elmau, Germany, Jun
[17] G. Theile, "Multichannel natural music recording based on psychoacoustic principles," in Proc. AES 19th Int. Conf., Schloss Elmau, Germany, Jun
[18] A. Laborie, R. Bruno, and S. Montoya, "High spatial resolution multichannel recording," in Proc. AES 116th Conv., Berlin, Germany, 2004, Preprint #6166.
[19] J. Meyer and G. W. Elko, "A highly scalable spherical microphone array based on an orthonormal decomposition of the soundfield," in Proc. IEEE Int. Conf. Acoust. Speech, Signal Process. (ICASSP'93), Minneapolis, MN, USA, April
[20] E. De Sena, H. Hacıhabiboğlu, and Z. Cvetković, "On the design and implementation of higher order differential microphones," IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 1, pp , Jan
[21] B. Rafaely, "Design of a second-order soundfield microphone," in Proc. AES 118th Conv., Barcelona, Spain, May 2005, Preprint #6405.
[22] S. Doclo and M. Moonen, "Design of broadband beamformers robust against gain and phase errors in the microphone array characteristics," IEEE Trans. Signal Process., vol. 51, no. 10, pp , Oct
[23] R. D. Heyser, "Instantaneous intensity," in Proc. AES 81st Conv., Los Angeles, CA, USA, Nov. 1986, Preprint #2399.
[24] J. Nelder and R. Mead, "A simplex method for function minimization," Comput. J., vol. 7, no. 4, pp , Jan
[25] H.-K. Lee and F. Rumsey, "Investigation into the effect of interchannel crosstalk in multichannel microphone technique," in Proc. AES 118th Conv., Barcelona, Spain, May 2005, Preprint #6405.
[26] J. Borenius, "Moving sound image in the theaters," J. Audio Eng. Soc., vol. 25, no. 4, pp ,
[27] V. Pulkki, "Virtual sound source positioning using vector-base amplitude panning," J. Audio Eng. Soc., vol. 45, no. 6, pp , Jun
[28] H. Teutsch, Modal Array Signal Processing: Principles and Applications of Acoustic Wavefield Decomposition. New York, NY, USA: Springer,
[29] S.-L. Lee, K.-Y. Han, S.-R. Lee, and K.-M. Sung, "Reduction of sound localization error for surround sound system using enhanced constant power panning law," IEEE Trans. Consum. Electron., vol. 50, no. 3, pp , Aug
[30] B. Bernfeld, "Attempts for better understanding of the directional stereophonic listening mechanism," in Proc. AES 44th Conv., Rotterdam, The Netherlands, Mar. 1973, Preprint #C-4.
[31] N. V. Franssen, Stereophony. Eindhoven, The Netherlands: Philips Research Laboratories,
[32] H. Fletcher, Speech and Hearing in Communication. New York, NY, USA: van Nostrand,
[33] P. A. Ratliff, "Properties of hearing related to quadraphonic reproduction," Research Dept., BBC, 1974, Tech. Rep..
[34] Y. Ando and K. Kurihara, "Nonlinear response in evaluating the subjective diffuseness of sound fields," J. Acoust. Soc. Amer., vol. 80, no. 3, pp , Sep
[35] W. Gaik, "Combined evaluation of interaural time and intensity differences: Psychoacoustic results and computer modeling," J. Acoust. Soc. Amer., vol. 94, no. 1, pp , Jul
[36] E. De Sena, H. Hacıhabiboğlu, and Z. Cvetković, "Design of a circular microphone array for panoramic audio recording and reproduction: Array radius," in Proc. AES 128th Conv., London, U.K., May 2010, Preprint #8064.
[37] S. Bech and N. Zacharov, Perceptual Audio Evaluation: Theory, Method and Application. New York, NY, USA: Wiley,
[38] W. Hartmann, B. Rakerd, and J. Gaalaas, "On the source-identification method," J. Acoust. Soc. Amer., vol. 104, pp , Dec
[39] L. S. R. Simon and R. Mason, "Time and level localization curves for a regularly-spaced octagon loudspeaker array," in Proc. AES 128th Conv., London, U.K., May 2010, Preprint #8079.
[40] J. Hall and Z. Cvetković, "Coherent multichannel emulation of acoustic spaces," in Proc. AES 28th Int. Conf., Piteå, Sweden, Jun
[41] ISO, , Sensory Analysis General Guidance for the Selection, Training and Monitoring of Assessors Part 2: Experts,
[42] A. Heller, R. Lee, and E. Benjamin, "Is my decoder ambisonic," in Proc. AES 125th Conv., London, U.K., Oct. 2008, Preprint #7553.
[43] J. Daniel, "Spatial sound encoding including near field effect: Introducing distance coding filters and a viable, new ambisonic format," in Proc. AES 23rd Int. Conf., Copenhagen, Denmark, May
[44] M. Gerzon, "General metatheory of auditory localisation," in Proc. AES 92nd Conv., Vienna, Austria, Mar. 1992, Preprint #3306.
[45] S. Jeon, Y. Park, S. Lee, and D. Youn, "Virtual source panning using multiple-wise vector base in the multispeaker stereo format," in Proc. 19th Eur. Signal Process. Conf. (EUSIPCO'11), Barcelona, Spain, Sep. 2011, pp
[46] J. Ródenas, R. Aarts, and A. Janssen, "Derivation of an optimal directivity pattern for sweet spot widening in stereo sound reproduction," J. Acoust. Soc. Amer., vol. 113, pp ,
[47] H. Wittek, F. Rumsey, and G. Theile, "Perceptual enhancement of wavefield synthesis by stereophonic means," J. Audio Eng. Soc., vol. 55, no. 9, pp , Sep
[48] ITU-R, Rec. BS ,
[49] G. Kendall, "The decorrelation of audio signals and its impact on spatial imagery," Comput. Music J., vol. 19, no. 4, pp ,
[50] T. Sporer, J. Liebetrau, and S. Schneider, "Statistics of MUSHRA revisited," in Proc. AES 127th Conv., New York, NY, USA, 2009, Preprint #7825.
[51] S. Lipshitz, "Stereo microphone techniques: Are the purists wrong?," J. Audio Eng. Soc., vol. 34, no. 9, pp ,
[52] S. Linkwitz, "A model for rendering stereo signals in the ITD-range of hearing," in Proc. AES 133rd Conv., San Francisco, CA, USA, 2012, Preprint #8713.

Enzo De Sena (S'11) was born in Napoli, Italy, in He received the B.Sc. degree in 2007 and the M.Sc. degree (cum laude) in 2009, both from the Università degli Studi di Napoli Federico II, Napoli, Italy, in telecommunications engineering. He is currently pursuing the Ph.D. degree in electronic engineering at King's College London, London, U.K., and is also serving as a Teaching Fellow at the same university. His research interests include multichannel audio, spatial hearing, room acoustics simulation, and microphone array processing.

Hüseyin Hacıhabiboğlu (S'96, M'00, SM'12) received the B.Sc. (honors) degree from the Middle East Technical University (METU), Ankara, Turkey, in 2000, the M.Sc. degree from the University of Bristol, Bristol, U.K., in 2001, both in electrical and electronic engineering, and the Ph.D. degree in computer science from Queen's University Belfast, Belfast, U.K., in He held research positions at the University of Surrey, Guildford, U.K. ( ) and King's College London, London, U.K. ( ). Currently, he is an Assistant Professor and Head of the Department of Modelling and Simulation at the Graduate School of Informatics, Middle East Technical University, Ankara, Turkey. His research interests include audio signal processing, room acoustics, multichannel audio systems, psychoacoustics of spatial hearing, microphone arrays, and game audio. Dr. Hacıhabiboğlu is a member of the IEEE Signal Processing Society, the Audio Engineering Society (AES), the Turkish Acoustics Society (TAD), and the European Acoustics Association (EAA).

Zoran Cvetković (SM'04) received the Dipl.Ing.El. and Mag.El. degrees from the University of Belgrade, Belgrade, Yugoslavia, in 1989 and 1992, respectively, the M.Phil. degree from Columbia University, New York, in 1993, and the Ph.D. degree in electrical engineering from the University of California, Berkeley, in He held research positions at EPFL, Lausanne, Switzerland (1996), and at Harvard University, Cambridge, MA ( ). From 1997 to 2002, he was a Member of Technical Staff at AT&T Shannon Laboratory. He is now a Professor in Signal Processing at King's College London, London, U.K. His research interests are in the broad area of signal processing, ranging from theoretical aspects of signal analysis to applications in telecommunications, audio and speech technologies, and biomedical engineering.


More information

Binaural auralization based on spherical-harmonics beamforming

Binaural auralization based on spherical-harmonics beamforming Binaural auralization based on spherical-harmonics beamforming W. Song a, W. Ellermeier b and J. Hald a a Brüel & Kjær Sound & Vibration Measurement A/S, Skodsborgvej 7, DK-28 Nærum, Denmark b Institut

More information

Analysis of Frontal Localization in Double Layered Loudspeaker Array System

Analysis of Frontal Localization in Double Layered Loudspeaker Array System Proceedings of 20th International Congress on Acoustics, ICA 2010 23 27 August 2010, Sydney, Australia Analysis of Frontal Localization in Double Layered Loudspeaker Array System Hyunjoo Chung (1), Sang

More information

Virtual Sound Source Positioning and Mixing in 5.1 Implementation on the Real-Time System Genesis

Virtual Sound Source Positioning and Mixing in 5.1 Implementation on the Real-Time System Genesis Virtual Sound Source Positioning and Mixing in 5 Implementation on the Real-Time System Genesis Jean-Marie Pernaux () Patrick Boussard () Jean-Marc Jot (3) () and () Steria/Digilog SA, Aix-en-Provence

More information

Sound source localization and its use in multimedia applications

Sound source localization and its use in multimedia applications Notes for lecture/ Zack Settel, McGill University Sound source localization and its use in multimedia applications Introduction With the arrival of real-time binaural or "3D" digital audio processing,

More information

University of Huddersfield Repository

University of Huddersfield Repository University of Huddersfield Repository Moore, David J. and Wakefield, Jonathan P. Surround Sound for Large Audiences: What are the Problems? Original Citation Moore, David J. and Wakefield, Jonathan P.

More information

A Comparative Study of the Performance of Spatialization Techniques for a Distributed Audience in a Concert Hall Environment

A Comparative Study of the Performance of Spatialization Techniques for a Distributed Audience in a Concert Hall Environment A Comparative Study of the Performance of Spatialization Techniques for a Distributed Audience in a Concert Hall Environment Gavin Kearney, Enda Bates, Frank Boland and Dermot Furlong 1 1 Department of

More information

Evaluation of a new stereophonic reproduction method with moving sweet spot using a binaural localization model

Evaluation of a new stereophonic reproduction method with moving sweet spot using a binaural localization model Evaluation of a new stereophonic reproduction method with moving sweet spot using a binaural localization model Sebastian Merchel and Stephan Groth Chair of Communication Acoustics, Dresden University

More information

INVESTIGATING BINAURAL LOCALISATION ABILITIES FOR PROPOSING A STANDARDISED TESTING ENVIRONMENT FOR BINAURAL SYSTEMS

INVESTIGATING BINAURAL LOCALISATION ABILITIES FOR PROPOSING A STANDARDISED TESTING ENVIRONMENT FOR BINAURAL SYSTEMS 20-21 September 2018, BULGARIA 1 Proceedings of the International Conference on Information Technologies (InfoTech-2018) 20-21 September 2018, Bulgaria INVESTIGATING BINAURAL LOCALISATION ABILITIES FOR

More information

Auditory Localization

Auditory Localization Auditory Localization CMPT 468: Sound Localization Tamara Smyth, tamaras@cs.sfu.ca School of Computing Science, Simon Fraser University November 15, 2013 Auditory locatlization is the human perception

More information

A CLOSER LOOK AT THE REPRESENTATION OF INTERAURAL DIFFERENCES IN A BINAURAL MODEL

A CLOSER LOOK AT THE REPRESENTATION OF INTERAURAL DIFFERENCES IN A BINAURAL MODEL 9th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, -7 SEPTEMBER 7 A CLOSER LOOK AT THE REPRESENTATION OF INTERAURAL DIFFERENCES IN A BINAURAL MODEL PACS: PACS:. Pn Nicolas Le Goff ; Armin Kohlrausch ; Jeroen

More information

Effect of the number of loudspeakers on sense of presence in 3D audio system based on multiple vertical panning

Effect of the number of loudspeakers on sense of presence in 3D audio system based on multiple vertical panning Effect of the number of loudspeakers on sense of presence in 3D audio system based on multiple vertical panning Toshiyuki Kimura and Hiroshi Ando Universal Communication Research Institute, National Institute

More information

DISTANCE CODING AND PERFORMANCE OF THE MARK 5 AND ST350 SOUNDFIELD MICROPHONES AND THEIR SUITABILITY FOR AMBISONIC REPRODUCTION

DISTANCE CODING AND PERFORMANCE OF THE MARK 5 AND ST350 SOUNDFIELD MICROPHONES AND THEIR SUITABILITY FOR AMBISONIC REPRODUCTION DISTANCE CODING AND PERFORMANCE OF THE MARK 5 AND ST350 SOUNDFIELD MICROPHONES AND THEIR SUITABILITY FOR AMBISONIC REPRODUCTION T Spenceley B Wiggins University of Derby, Derby, UK University of Derby,

More information

A triangulation method for determining the perceptual center of the head for auditory stimuli

A triangulation method for determining the perceptual center of the head for auditory stimuli A triangulation method for determining the perceptual center of the head for auditory stimuli PACS REFERENCE: 43.66.Qp Brungart, Douglas 1 ; Neelon, Michael 2 ; Kordik, Alexander 3 ; Simpson, Brian 4 1

More information

MULTICHANNEL REPRODUCTION OF LOW FREQUENCIES. Toni Hirvonen, Miikka Tikander, and Ville Pulkki

MULTICHANNEL REPRODUCTION OF LOW FREQUENCIES. Toni Hirvonen, Miikka Tikander, and Ville Pulkki MULTICHANNEL REPRODUCTION OF LOW FREQUENCIES Toni Hirvonen, Miikka Tikander, and Ville Pulkki Helsinki University of Technology Laboratory of Acoustics and Audio Signal Processing P.O. box 3, FIN-215 HUT,

More information

Audio Engineering Society. Convention Paper. Presented at the 129th Convention 2010 November 4 7 San Francisco, CA, USA. Why Ambisonics Does Work

Audio Engineering Society. Convention Paper. Presented at the 129th Convention 2010 November 4 7 San Francisco, CA, USA. Why Ambisonics Does Work Audio Engineering Society Convention Paper Presented at the 129th Convention 2010 November 4 7 San Francisco, CA, USA The papers at this Convention have been selected on the basis of a submitted abstract

More information

III. Publication III. c 2005 Toni Hirvonen.

III. Publication III. c 2005 Toni Hirvonen. III Publication III Hirvonen, T., Segregation of Two Simultaneously Arriving Narrowband Noise Signals as a Function of Spatial and Frequency Separation, in Proceedings of th International Conference on

More information

Nonuniform multi level crossing for signal reconstruction

Nonuniform multi level crossing for signal reconstruction 6 Nonuniform multi level crossing for signal reconstruction 6.1 Introduction In recent years, there has been considerable interest in level crossing algorithms for sampling continuous time signals. Driven

More information

Is My Decoder Ambisonic?

Is My Decoder Ambisonic? Is My Decoder Ambisonic? Aaron J. Heller SRI International, Menlo Park, CA, US Richard Lee Pandit Litoral, Cooktown, QLD, AU Eric M. Benjamin Dolby Labs, San Francisco, CA, US 125 th AES Convention, San

More information

Sound localization with multi-loudspeakers by usage of a coincident microphone array

Sound localization with multi-loudspeakers by usage of a coincident microphone array PAPER Sound localization with multi-loudspeakers by usage of a coincident microphone array Jun Aoki, Haruhide Hokari and Shoji Shimada Nagaoka University of Technology, 1603 1, Kamitomioka-machi, Nagaoka,

More information

SOPA version 2. Revised July SOPA project. September 21, Introduction 2. 2 Basic concept 3. 3 Capturing spatial audio 4

SOPA version 2. Revised July SOPA project. September 21, Introduction 2. 2 Basic concept 3. 3 Capturing spatial audio 4 SOPA version 2 Revised July 7 2014 SOPA project September 21, 2014 Contents 1 Introduction 2 2 Basic concept 3 3 Capturing spatial audio 4 4 Sphere around your head 5 5 Reproduction 7 5.1 Binaural reproduction......................

More information

Introduction. 1.1 Surround sound

Introduction. 1.1 Surround sound Introduction 1 This chapter introduces the project. First a brief description of surround sound is presented. A problem statement is defined which leads to the goal of the project. Finally the scope of

More information

The analysis of multi-channel sound reproduction algorithms using HRTF data

The analysis of multi-channel sound reproduction algorithms using HRTF data The analysis of multichannel sound reproduction algorithms using HRTF data B. Wiggins, I. PatersonStephens, P. Schillebeeckx Processing Applications Research Group University of Derby Derby, United Kingdom

More information

Discrimination of Virtual Haptic Textures Rendered with Different Update Rates

Discrimination of Virtual Haptic Textures Rendered with Different Update Rates Discrimination of Virtual Haptic Textures Rendered with Different Update Rates Seungmoon Choi and Hong Z. Tan Haptic Interface Research Laboratory Purdue University 465 Northwestern Avenue West Lafayette,

More information

Development and application of a stereophonic multichannel recording technique for 3D Audio and VR

Development and application of a stereophonic multichannel recording technique for 3D Audio and VR Development and application of a stereophonic multichannel recording technique for 3D Audio and VR Helmut Wittek 17.10.2017 Contents: Two main questions: For a 3D-Audio reproduction, how real does the

More information

M icroph one Re cording for 3D-Audio/VR

M icroph one Re cording for 3D-Audio/VR M icroph one Re cording /VR H e lm ut W itte k 17.11.2016 Contents: Two main questions: For a 3D-Audio reproduction, how real does the sound field have to be? When do we want to copy the sound field? How

More information

DESIGN OF ROOMS FOR MULTICHANNEL AUDIO MONITORING

DESIGN OF ROOMS FOR MULTICHANNEL AUDIO MONITORING DESIGN OF ROOMS FOR MULTICHANNEL AUDIO MONITORING A.VARLA, A. MÄKIVIRTA, I. MARTIKAINEN, M. PILCHNER 1, R. SCHOUSTAL 1, C. ANET Genelec OY, Finland genelec@genelec.com 1 Pilchner Schoustal Inc, Canada

More information

HARMONIC INSTABILITY OF DIGITAL SOFT CLIPPING ALGORITHMS

HARMONIC INSTABILITY OF DIGITAL SOFT CLIPPING ALGORITHMS HARMONIC INSTABILITY OF DIGITAL SOFT CLIPPING ALGORITHMS Sean Enderby and Zlatko Baracskai Department of Digital Media Technology Birmingham City University Birmingham, UK ABSTRACT In this paper several

More information

Accurate sound reproduction from two loudspeakers in a living room

Accurate sound reproduction from two loudspeakers in a living room Accurate sound reproduction from two loudspeakers in a living room Siegfried Linkwitz 13-Apr-08 (1) D M A B Visual Scene 13-Apr-08 (2) What object is this? 19-Apr-08 (3) Perception of sound 13-Apr-08 (4)

More information

Audio Engineering Society. Convention Paper. Presented at the 115th Convention 2003 October New York, New York

Audio Engineering Society. Convention Paper. Presented at the 115th Convention 2003 October New York, New York Audio Engineering Society Convention Paper Presented at the 115th Convention 2003 October 10 13 New York, New York This convention paper has been reproduced from the author's advance manuscript, without

More information

Convention Paper Presented at the 128th Convention 2010 May London, UK

Convention Paper Presented at the 128th Convention 2010 May London, UK Audio Engineering Society Convention Paper Presented at the 128th Convention 2010 May 22 25 London, UK The papers at this Convention have been selected on the basis of a submitted abstract and extended

More information

MEASURING DIRECTIVITIES OF NATURAL SOUND SOURCES WITH A SPHERICAL MICROPHONE ARRAY

MEASURING DIRECTIVITIES OF NATURAL SOUND SOURCES WITH A SPHERICAL MICROPHONE ARRAY AMBISONICS SYMPOSIUM 2009 June 25-27, Graz MEASURING DIRECTIVITIES OF NATURAL SOUND SOURCES WITH A SPHERICAL MICROPHONE ARRAY Martin Pollow, Gottfried Behler, Bruno Masiero Institute of Technical Acoustics,

More information

Convention Paper Presented at the 126th Convention 2009 May 7 10 Munich, Germany

Convention Paper Presented at the 126th Convention 2009 May 7 10 Munich, Germany Audio Engineering Society Convention Paper Presented at the 16th Convention 9 May 7 Munich, Germany The papers at this Convention have been selected on the basis of a submitted abstract and extended precis

More information

Envelopment and Small Room Acoustics

Envelopment and Small Room Acoustics Envelopment and Small Room Acoustics David Griesinger Lexicon 3 Oak Park Bedford, MA 01730 Copyright 9/21/00 by David Griesinger Preview of results Loudness isn t everything! At least two additional perceptions:

More information

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 VIRTUAL AUDIO REPRODUCED IN A HEADREST

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 VIRTUAL AUDIO REPRODUCED IN A HEADREST 19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 VIRTUAL AUDIO REPRODUCED IN A HEADREST PACS: 43.25.Lj M.Jones, S.J.Elliott, T.Takeuchi, J.Beer Institute of Sound and Vibration Research;

More information

EBU UER. european broadcasting union. Listening conditions for the assessment of sound programme material. Supplement 1.

EBU UER. european broadcasting union. Listening conditions for the assessment of sound programme material. Supplement 1. EBU Tech 3276-E Listening conditions for the assessment of sound programme material Revised May 2004 Multichannel sound EBU UER european broadcasting union Geneva EBU - Listening conditions for the assessment

More information

A spatial squeezing approach to ambisonic audio compression

A spatial squeezing approach to ambisonic audio compression University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 2008 A spatial squeezing approach to ambisonic audio compression Bin Cheng

More information

Surround: The Current Technological Situation. David Griesinger Lexicon 3 Oak Park Bedford, MA

Surround: The Current Technological Situation. David Griesinger Lexicon 3 Oak Park Bedford, MA Surround: The Current Technological Situation David Griesinger Lexicon 3 Oak Park Bedford, MA 01730 www.world.std.com/~griesngr There are many open questions 1. What is surround sound 2. Who will listen

More information

Convention Paper Presented at the 138th Convention 2015 May 7 10 Warsaw, Poland

Convention Paper Presented at the 138th Convention 2015 May 7 10 Warsaw, Poland Audio Engineering Society Convention Paper Presented at the 38th Convention 25 May 7 Warsaw, Poland This Convention paper was selected based on a submitted abstract and 75-word precis that have been peer

More information

Binaural Hearing. Reading: Yost Ch. 12

Binaural Hearing. Reading: Yost Ch. 12 Binaural Hearing Reading: Yost Ch. 12 Binaural Advantages Sounds in our environment are usually complex, and occur either simultaneously or close together in time. Studies have shown that the ability to

More information

Simulation of realistic background noise using multiple loudspeakers

Simulation of realistic background noise using multiple loudspeakers Simulation of realistic background noise using multiple loudspeakers W. Song 1, M. Marschall 2, J.D.G. Corrales 3 1 Brüel & Kjær Sound & Vibration Measurement A/S, Denmark, Email: woo-keun.song@bksv.com

More information

The Why and How of With-Height Surround Sound

The Why and How of With-Height Surround Sound The Why and How of With-Height Surround Sound Jörn Nettingsmeier freelance audio engineer Essen, Germany 1 Your next 45 minutes on the graveyard shift this lovely Saturday

More information

Blind source separation and directional audio synthesis for binaural auralization of multiple sound sources using microphone array recordings

Blind source separation and directional audio synthesis for binaural auralization of multiple sound sources using microphone array recordings Blind source separation and directional audio synthesis for binaural auralization of multiple sound sources using microphone array recordings Banu Gunel, Huseyin Hacihabiboglu and Ahmet Kondoz I-Lab Multimedia

More information

Audio Engineering Society Convention Paper Presented at the 110th Convention 2001 May Amsterdam, The Netherlands

Audio Engineering Society Convention Paper Presented at the 110th Convention 2001 May Amsterdam, The Netherlands Audio Engineering Society Convention Paper Presented at the 110th Convention 2001 May 12 15 Amsterdam, The Netherlands This convention paper has been reproduced from the author's advance manuscript, without

More information

Multi-Loudspeaker Reproduction: Surround Sound

Multi-Loudspeaker Reproduction: Surround Sound Multi-Loudspeaker Reproduction: urround ound Understanding Dialog? tereo film L R No Delay causes echolike disturbance Yes Experience with stereo sound for film revealed that the intelligibility of dialog

More information

Rec. ITU-R F RECOMMENDATION ITU-R F *

Rec. ITU-R F RECOMMENDATION ITU-R F * Rec. ITU-R F.162-3 1 RECOMMENDATION ITU-R F.162-3 * Rec. ITU-R F.162-3 USE OF DIRECTIONAL TRANSMITTING ANTENNAS IN THE FIXED SERVICE OPERATING IN BANDS BELOW ABOUT 30 MHz (Question 150/9) (1953-1956-1966-1970-1992)

More information

The relation between perceived apparent source width and interaural cross-correlation in sound reproduction spaces with low reverberation

The relation between perceived apparent source width and interaural cross-correlation in sound reproduction spaces with low reverberation Downloaded from orbit.dtu.dk on: Feb 05, 2018 The relation between perceived apparent source width and interaural cross-correlation in sound reproduction spaces with low reverberation Käsbach, Johannes;

More information

Development of multichannel single-unit microphone using shotgun microphone array

Development of multichannel single-unit microphone using shotgun microphone array PROCEEDINGS of the 22 nd International Congress on Acoustics Electroacoustics and Audio Engineering: Paper ICA2016-155 Development of multichannel single-unit microphone using shotgun microphone array

More information

SOUND COLOUR PROPERTIES OF WFS AND STEREO

SOUND COLOUR PROPERTIES OF WFS AND STEREO SOUND COLOUR PROPERTIES OF WFS AND STEREO Helmut Wittek Schoeps Mikrofone GmbH / Institut für Rundfunktechnik GmbH / University of Surrey, Guildford, UK Spitalstr.20, 76227 Karlsruhe-Durlach email: wittek@hauptmikrofon.de

More information

inter.noise 2000 The 29th International Congress and Exhibition on Noise Control Engineering August 2000, Nice, FRANCE

inter.noise 2000 The 29th International Congress and Exhibition on Noise Control Engineering August 2000, Nice, FRANCE Copyright SFA - InterNoise 2000 1 inter.noise 2000 The 29th International Congress and Exhibition on Noise Control Engineering 27-30 August 2000, Nice, FRANCE I-INCE Classification: 6.1 AUDIBILITY OF COMPLEX

More information

Recent Advances in Acoustic Signal Extraction and Dereverberation

Recent Advances in Acoustic Signal Extraction and Dereverberation Recent Advances in Acoustic Signal Extraction and Dereverberation Emanuël Habets Erlangen Colloquium 2016 Scenario Spatial Filtering Estimated Desired Signal Undesired sound components: Sensor noise Competing

More information

c 2014 Michael Friedman

c 2014 Michael Friedman c 2014 Michael Friedman CAPTURING SPATIAL AUDIO FROM ARBITRARY MICROPHONE ARRAYS FOR BINAURAL REPRODUCTION BY MICHAEL FRIEDMAN THESIS Submitted in partial fulfillment of the requirements for the degree

More information

PERSONAL 3D AUDIO SYSTEM WITH LOUDSPEAKERS

PERSONAL 3D AUDIO SYSTEM WITH LOUDSPEAKERS PERSONAL 3D AUDIO SYSTEM WITH LOUDSPEAKERS Myung-Suk Song #1, Cha Zhang 2, Dinei Florencio 3, and Hong-Goo Kang #4 # Department of Electrical and Electronic, Yonsei University Microsoft Research 1 earth112@dsp.yonsei.ac.kr,

More information

A binaural auditory model and applications to spatial sound evaluation

A binaural auditory model and applications to spatial sound evaluation A binaural auditory model and applications to spatial sound evaluation Ma r k o Ta k a n e n 1, Ga ë ta n Lo r h o 2, a n d Mat t i Ka r ja l a i n e n 1 1 Helsinki University of Technology, Dept. of Signal

More information

Combining Subjective and Objective Assessment of Loudspeaker Distortion Marian Liebig Wolfgang Klippel

Combining Subjective and Objective Assessment of Loudspeaker Distortion Marian Liebig Wolfgang Klippel Combining Subjective and Objective Assessment of Loudspeaker Distortion Marian Liebig (m.liebig@klippel.de) Wolfgang Klippel (wklippel@klippel.de) Abstract To reproduce an artist s performance, the loudspeakers

More information

Audio Engineering Society. Convention Paper. Presented at the 124th Convention 2008 May Amsterdam, The Netherlands

Audio Engineering Society. Convention Paper. Presented at the 124th Convention 2008 May Amsterdam, The Netherlands Audio Engineering Society Convention Paper Presented at the 124th Convention 2008 May 17 20 Amsterdam, The Netherlands The papers at this Convention have been selected on the basis of a submitted abstract

More information

Validation of lateral fraction results in room acoustic measurements

Validation of lateral fraction results in room acoustic measurements Validation of lateral fraction results in room acoustic measurements Daniel PROTHEROE 1 ; Christopher DAY 2 1, 2 Marshall Day Acoustics, New Zealand ABSTRACT The early lateral energy fraction (LF) is one

More information

Predicting localization accuracy for stereophonic downmixes in Wave Field Synthesis

Predicting localization accuracy for stereophonic downmixes in Wave Field Synthesis Predicting localization accuracy for stereophonic downmixes in Wave Field Synthesis Hagen Wierstorf Assessment of IP-based Applications, T-Labs, Technische Universität Berlin, Berlin, Germany. Sascha Spors

More information

396 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 19, NO. 2, FEBRUARY 2011

396 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 19, NO. 2, FEBRUARY 2011 396 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 19, NO. 2, FEBRUARY 2011 Obtaining Binaural Room Impulse Responses From B-Format Impulse Responses Using Frequency-Dependent Coherence

More information

Sound source localization accuracy of ambisonic microphone in anechoic conditions

Sound source localization accuracy of ambisonic microphone in anechoic conditions Sound source localization accuracy of ambisonic microphone in anechoic conditions Pawel MALECKI 1 ; 1 AGH University of Science and Technology in Krakow, Poland ABSTRACT The paper presents results of determination

More information

Josephson Engineering, Inc.

Josephson Engineering, Inc. C700 Users Guide Josephson Engineering, Inc. 329A Ingalls Street Santa Cruz, California +1 831 420 0888 www.josephson.com 2014 Josephson Engineering This Guide was previously published as the Series Seven

More information

VIRTUAL ACOUSTICS: OPPORTUNITIES AND LIMITS OF SPATIAL SOUND REPRODUCTION

VIRTUAL ACOUSTICS: OPPORTUNITIES AND LIMITS OF SPATIAL SOUND REPRODUCTION ARCHIVES OF ACOUSTICS 33, 4, 413 422 (2008) VIRTUAL ACOUSTICS: OPPORTUNITIES AND LIMITS OF SPATIAL SOUND REPRODUCTION Michael VORLÄNDER RWTH Aachen University Institute of Technical Acoustics 52056 Aachen,

More information

A study on sound source apparent shape and wideness

A study on sound source apparent shape and wideness University of Wollongong Research Online aculty of Informatics - Papers (Archive) aculty of Engineering and Information Sciences 2003 A study on sound source apparent shape and wideness Guillaume Potard

More information

ANALYSIS AND EVALUATION OF IRREGULARITY IN PITCH VIBRATO FOR STRING-INSTRUMENT TONES

ANALYSIS AND EVALUATION OF IRREGULARITY IN PITCH VIBRATO FOR STRING-INSTRUMENT TONES Abstract ANALYSIS AND EVALUATION OF IRREGULARITY IN PITCH VIBRATO FOR STRING-INSTRUMENT TONES William L. Martens Faculty of Architecture, Design and Planning University of Sydney, Sydney NSW 2006, Australia

More information

NTT DOCOMO Technical Journal. Method for Measuring Base Station Antenna Radiation Characteristics in Anechoic Chamber. 1.

NTT DOCOMO Technical Journal. Method for Measuring Base Station Antenna Radiation Characteristics in Anechoic Chamber. 1. Base Station Antenna Directivity Gain Method for Measuring Base Station Antenna Radiation Characteristics in Anechoic Chamber Base station antennas tend to be long compared to the wavelengths at which

More information

Laboratory 1: Uncertainty Analysis

Laboratory 1: Uncertainty Analysis University of Alabama Department of Physics and Astronomy PH101 / LeClair May 26, 2014 Laboratory 1: Uncertainty Analysis Hypothesis: A statistical analysis including both mean and standard deviation can

More information

The role of intrinsic masker fluctuations on the spectral spread of masking

The role of intrinsic masker fluctuations on the spectral spread of masking The role of intrinsic masker fluctuations on the spectral spread of masking Steven van de Par Philips Research, Prof. Holstlaan 4, 5656 AA Eindhoven, The Netherlands, Steven.van.de.Par@philips.com, Armin

More information

JOHANN CATTY CETIM, 52 Avenue Félix Louat, Senlis Cedex, France. What is the effect of operating conditions on the result of the testing?

JOHANN CATTY CETIM, 52 Avenue Félix Louat, Senlis Cedex, France. What is the effect of operating conditions on the result of the testing? ACOUSTIC EMISSION TESTING - DEFINING A NEW STANDARD OF ACOUSTIC EMISSION TESTING FOR PRESSURE VESSELS Part 2: Performance analysis of different configurations of real case testing and recommendations for

More information

Spatial Audio Reproduction: Towards Individualized Binaural Sound

Spatial Audio Reproduction: Towards Individualized Binaural Sound Spatial Audio Reproduction: Towards Individualized Binaural Sound WILLIAM G. GARDNER Wave Arts, Inc. Arlington, Massachusetts INTRODUCTION The compact disc (CD) format records audio with 16-bit resolution

More information

High-speed Noise Cancellation with Microphone Array

High-speed Noise Cancellation with Microphone Array Noise Cancellation a Posteriori Probability, Maximum Criteria Independent Component Analysis High-speed Noise Cancellation with Microphone Array We propose the use of a microphone array based on independent

More information

O P S I. ( Optimised Phantom Source Imaging of the high frequency content of virtual sources in Wave Field Synthesis )

O P S I. ( Optimised Phantom Source Imaging of the high frequency content of virtual sources in Wave Field Synthesis ) O P S I ( Optimised Phantom Source Imaging of the high frequency content of virtual sources in Wave Field Synthesis ) A Hybrid WFS / Phantom Source Solution to avoid Spatial aliasing (patentiert 2002)

More information

MONOPHONIC SOURCE LOCALIZATION FOR A DISTRIBUTED AUDIENCE IN A SMALL CONCERT HALL

MONOPHONIC SOURCE LOCALIZATION FOR A DISTRIBUTED AUDIENCE IN A SMALL CONCERT HALL MONOPHONIC SOURCE LOCALIZATION FOR A DISTRIBUTED AUDIENCE IN A SMALL CONCERT HALL Enda Bates, Gavin Kearney, Frank Boland and Dermot Furlong Department of Electronic and Electrical Engineering Trinity

More information

The psychoacoustics of reverberation

The psychoacoustics of reverberation The psychoacoustics of reverberation Steven van de Par Steven.van.de.Par@uni-oldenburg.de July 19, 2016 Thanks to Julian Grosse and Andreas Häußler 2016 AES International Conference on Sound Field Control

More information

THE DEVELOPMENT OF A DESIGN TOOL FOR 5-SPEAKER SURROUND SOUND DECODERS

THE DEVELOPMENT OF A DESIGN TOOL FOR 5-SPEAKER SURROUND SOUND DECODERS THE DEVELOPMENT OF A DESIGN TOOL FOR 5-SPEAKER SURROUND SOUND DECODERS by John David Moore A thesis submitted to the University of Huddersfield in partial fulfilment of the requirements for the degree

More information

3D AUDIO AR/VR CAPTURE AND REPRODUCTION SETUP FOR AURALIZATION OF SOUNDSCAPES
