Proceedings of Meetings on Acoustics
Volume 19, 2013
http://acousticalsociety.org/

ICA 2013 Montreal
Montreal, Canada
2-7 June 2013

Signal Processing in Acoustics
Session 4aSP: Sensor Array Beamforming and Its Applications

4aSP2. Spatial sound pick-up with a low number of microphones

Julian D. Palacino* and Rozenn Nicol

*Corresponding author's address: SVQ//TPS, Orange Labs, 2 Av Pierre Marzin, Lannion, 22307, Brittany, France, julian.palacino@orange.com

For several decades spatial audio has been only used by movies, music composers and researchers in laboratories. Because of their complexity, people have always been away from 3D audio techniques. Dedicated devices such as microphone and loudspeaker arrays are expensive and cannot be used without some expertise of audio capturing and reproduction. Nowadays the main barrier preventing a consumer solution from capturing spatial audio is the big number of transducers needed to get an accurate 3D sound image. In order to break down this barrier we propose a new 3D audio recording set-up which is composed of a three-microphone array able to get the full 3D audio information. A 2D version, consisting of a two-microphone array, is also available. The sound localization is based on the transducer directivities and additional information to solve the angular ambiguity. This paper will describe firstly the microphone set-up and its associated algorithm. Secondly the performances of sound localization will be assessed.

Published by the Acoustical Society of America through the American Institute of Physics

2013 Acoustical Society of America [DOI: 10.1121/1.4800844]
Received 22 Jan 2013; published 2 Jun 2013
Proceedings of Meetings on Acoustics, Vol. 19, 055078 (2013)
INTRODUCTION

For several decades spatial audio has been used only by movies, music composers and researchers in laboratories [1]. Because of their complexity, 3D audio techniques have always remained out of reach of the general public. Dedicated devices such as microphone and loudspeaker arrays are expensive and cannot be used without some expertise in audio capture and reproduction. Nowadays the main barrier preventing a consumer solution from capturing spatial audio is the large number of transducers needed to get an accurate 3D sound image [2]. In order to break down this barrier we propose a new 3D audio recording set-up composed of a three-microphone array able to capture the full 3D audio information. A 2D version, consisting of a two-microphone array, is also available. The sound localization is based on the transducer directivities, with additional information used to solve the angular ambiguity. This paper first describes the microphone set-up and its associated algorithm. The performance of the sound localization is then assessed.

SOURCE LOCALIZATION USING MICROPHONE DIRECTIVITY PATTERN

Microphone Array Layout

The microphone array is composed of 3 cardioid microphones (see Figure 1): the first one pointing to the x axis (right), the second one to the opposite direction (left) and the third one to the z axis (top).

FIGURE 1 Layout of the microphone device. a) 3D array. b) 2D array.

Spatial Information Achieved from Microphone Directivity

The method operates in the time/frequency domain. It is assumed that only one sound source is present at each moment for a single frequency bin. Equations will be presented for a fixed frequency. Before applying the FFT, time samples are weighted by a soft-edge window to avoid oscillations in the frequency domain. In terms of signal processing, the choice of the window type and its length is important: the slope and frequency bounds of the window affect the results at neighbouring frequencies. To avoid instability of the source localization, results can be smoothed frequency-wise and time-wise.
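The time/frequency analysis described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Hann window, the frame length of 1024 samples, the 50% overlap, the moving-average smoother and the function names `stft` and `smooth` are all assumptions chosen for the example.

```python
import numpy as np

def stft(x, frame_len=1024, hop=512):
    """One-sided spectra of overlapping, Hann-windowed frames of x."""
    window = np.hanning(frame_len)                 # soft-edge window (assumed type)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)             # shape (n_frames, frame_len // 2 + 1)

def smooth(values, n_time=3, n_freq=3):
    """Moving average along time (axis 0), then along frequency (axis 1)."""
    k_time = np.ones(n_time) / n_time
    k_freq = np.ones(n_freq) / n_freq
    out = np.apply_along_axis(lambda v: np.convolve(v, k_time, mode="same"), 0, values)
    return np.apply_along_axis(lambda v: np.convolve(v, k_freq, mode="same"), 1, out)

# Example: a tone centred on bin 22 (1031.25 Hz at 48 kHz with 1024-sample frames).
fs = 48000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * (22 * fs / 1024) * t)
spectra = stft(tone)
peak_bin = int(np.argmax(np.abs(spectra[0])))
```

Each row of `spectra` is one time frame and each column one frequency bin, so a per-bin localization result has the same two-dimensional layout and can be passed to `smooth`. The plain moving average ignores angle wrap-around at ±180 degrees and attenuates values at the edges of the map, both of which a real implementation would have to handle.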
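The directivity of the n-th cardioid microphone is represented by

$$M_n(\vec{s}) = \frac{1}{2}\left(1 + \cos\alpha_n\right) \tag{1}$$

where

$$\cos\alpha_n = \frac{\vec{s}\cdot\vec{p}_n}{\lVert\vec{s}\rVert\,\lVert\vec{p}_n\rVert} \tag{2}$$

The vector $\vec{s}$ defines the source direction and the vector $\vec{p}_n$ refers to the pointing direction of the n-th microphone. In this case, the pointing direction can be expressed in the Cartesian basis for each microphone by

$$\vec{p}_1 = (1, 0, 0)^T,\quad \vec{p}_2 = (-1, 0, 0)^T,\quad \vec{p}_3 = (0, 0, 1)^T \tag{3}$$

The source direction can be expressed in the Spherical or Cartesian coordinate basis, respectively $\vec{s}(r, \theta, \varphi)$ or $\vec{s}(x, y, z)$, by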
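$$\vec{s} = (x, y, z)^T = (r\cos\varphi\cos\theta,\ r\cos\varphi\sin\theta,\ r\sin\varphi)^T \tag{4}$$

where the Spherical coordinates are defined by the radius $r$, the azimuth angle $\theta$, and the elevation angle $\varphi$. The directivity functions of the three microphones are [3]:

$$M_1(\theta, \varphi) = \frac{1}{2}\left(1 + \cos\varphi\cos\theta\right) \tag{5a}$$

$$M_2(\theta, \varphi) = \frac{1}{2}\left(1 - \cos\varphi\cos\theta\right) \tag{5b}$$

$$M_3(\theta, \varphi) = \frac{1}{2}\left(1 + \sin\varphi\right) \tag{5c}$$

Since the direction is unchanged for any value of $r$, the radius is fixed to $r = 1$ in the following expressions. The sound source produces a signal $s_0(t)$ at the basis origin. Assuming that each microphone is located at this point, their output signals are:

$$m_n(t) = M_n(\theta, \varphi)\, s_0(t), \quad n = 1, 2, 3 \tag{6}$$

These signals give access to three quantities:

1) The monophonic signal of the sound source, by using relations (5a), (5b) and (6):

$$s_0(t) = m_1(t) + m_2(t) \tag{7}$$

2) The elevation angle of the sound source, by using Equations (5c) and (6):

$$\varphi = \arcsin\left(\frac{2\, m_3(t)}{s_0(t)} - 1\right) \tag{8}$$

3) The azimuth angle of the sound source, by using relations (5a) and (5b):

$$\theta = \arccos\left(\frac{m_1(t) - m_2(t)}{s_0(t)\cos\varphi}\right) \tag{9}$$

Alternatively it is possible to use microphones with other directivity patterns to reconstruct a cardioid directivity virtually (see Section: Synthesizing Virtual Cardioid Microphones). The same method can be used in a 2D version using only the two microphones corresponding to the horizontal plane. In this case an arbitrary elevation must be fixed. This introduces an error that increases with the angular mismatch between the chosen elevation and the real position.

In this first step, the source location is calculated using exclusively the microphone directivity pattern. However the azimuth is estimated with a sign ambiguity (front-rear) due to the cosine of Equation (9). This ambiguity can be solved by moving the microphones perpendicularly to their pointing direction, as illustrated in the next section.

Front-Rear Ambiguity Resolution Using Time Delay

If we consider now that the n-th microphone is located at the position defined by the vector

$$\vec{x}_n = (x_n, y_n, z_n)^T \tag{10}$$

its output signal, $m_n(t)$, becomes

$$m_n(t) = M_n(\theta, \varphi)\, s_0(t - \tau_n) \tag{11}$$

where $\tau_n$ is the delay induced by the distance between the n-th microphone and the basis origin, given by

$$\tau_n = -\frac{\vec{s}\cdot\vec{x}_n}{c} \tag{12}$$

with $c$ the speed of sound. From Equation (4) the delay becomes

$$\tau_n = -\frac{x_n\cos\varphi\cos\theta + y_n\cos\varphi\sin\theta + z_n\sin\varphi}{c} \tag{13}$$

In the frequency domain $m_n(t)$ becomes $\tilde{m}_n(\omega)$ (where $\omega = 2\pi f$ is the angular frequency, with $f$ the time frequency) by applying the Fourier Transform,

$$\tilde{m}_n(\omega) = M_n(\theta, \varphi)\, \mathrm{FT}\left[s_0(t - \tau_n)\right] \tag{14}$$

with

$$\mathrm{FT}\left[s_0(t - \tau_n)\right] = \tilde{s}_0(\omega)\, e^{-j\omega\tau_n} \tag{15}$$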
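Equation (14) becomes

$$\tilde{m}_n(\omega) = M_n(\theta, \varphi)\, \tilde{s}_0(\omega)\, e^{-j\omega\tau_n} \tag{16}$$

Since the directivity $M_n$ is real and non-negative, consequently

$$\angle\tilde{m}_1(\omega) = \angle\tilde{s}_0(\omega) - \omega\tau_1 \tag{17}$$

and

$$\angle\tilde{m}_2(\omega) = \angle\tilde{s}_0(\omega) - \omega\tau_2 \tag{18}$$

The time delay between the two microphones is expressed by:

$$\tau_{12} = \tau_1 - \tau_2 = \frac{\angle\tilde{m}_2(\omega) - \angle\tilde{m}_1(\omega)}{\omega} \tag{19}$$

Since this result is used to solve the ambiguity in relation (9), only its sign is needed. Assuming that the two horizontal microphones are displaced along the y axis only, with $y_2 > y_1$, Equation (13) gives $\tau_{12} = (y_2 - y_1)\cos\varphi\sin\theta / c$, so that $\tau_{12}$ has the sign of $\sin\theta$. It is inserted in relation (9) as:

$$\theta = \operatorname{sign}(\tau_{12})\, \arccos\left(\frac{m_1(t) - m_2(t)}{s_0(t)\cos\varphi}\right) \tag{20}$$

SOUND LOCALIZATION USING COINCIDENT BI-DIRECTIONAL MICROPHONES

FIGURE 2 Layout of the coincident microphone device. a) 3D array. b) 2D array.

It will now be shown how to use Equations (7), (8) and (9) with virtual cardioid microphones synthesized from bidirectional microphones. The method is described here for an array composed of two bidirectional microphones and a cardioid one.

Microphone Layout

The array is composed of two bidirectional microphones pointing to the x and y axes in the horizontal plane and a third cardioid microphone pointing to the z axis (see FIGURE 2). It should be noted that a Soundfield microphone [2] could be used alternatively, since the X and Y components of B-format are equivalent to the bidirectional signals described above.

Synthesizing Virtual Cardioid Microphones

The signals delivered by the three microphones are [3]:

a) $m_{bx}(t) = s_0(t)\cos\varphi\cos\theta$

b) $m_{by}(t) = s_0(t)\cos\varphi\sin\theta$

c) $m_{c3}(t) = \frac{1}{2}\, s_0(t)\left(1 + \sin\varphi\right)$

The virtual signals are obtained using the expressions:

$$m_1(t) = \frac{1}{2}\left[p_0(t) + m_{bx}(t)\right] \tag{21}$$

$$m_2(t) = \frac{1}{2}\left[p_0(t) - m_{bx}(t)\right] \tag{22}$$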
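where the pressure signal $p_0(t)$ is estimated by:

$$p_0(t) = \frac{m_{bx}^2(t) + m_{by}^2(t) + 4\, m_{c3}^2(t)}{4\, m_{c3}(t)} \tag{23}$$

Alternatively, if a Soundfield B-format signal is used (with the conventional weighting of the W component by $1/\sqrt{2}$),

$$p_0(t) = \sqrt{2}\, W(t) \tag{24}$$

and

$$m_3(t) = \frac{1}{2}\left[p_0(t) + Z(t)\right] \tag{25}$$

where the signal Z(t) refers to the Z component of the B-format.

Sound Localization Using Acoustic Intensity

The signals of the virtual cardioid microphones (Equations (21) and (22)) allow the elevation and azimuth angles of the sound source to be estimated using relations (8) and (9), but with a front-rear ambiguity which cannot be solved by introducing a time delay, since a coincident microphone array is used. Instead, spatial information is obtained from the acoustic intensity. The acoustic intensity vector is linked to the acoustic pressure and the acoustic velocity by the relation [5]:

$$\vec{I}(\omega) = \frac{1}{2}\,\mathrm{Re}\left[P^*(\omega)\begin{pmatrix} V_x(\omega) \\ V_y(\omega) \\ V_z(\omega) \end{pmatrix}\right] \tag{26}$$

where $P^*(\omega)$ is the complex conjugate of the acoustic pressure and $V_x$, $V_y$ and $V_z$ are the x, y, z components of the acoustic velocity [5]. In the case of a progressive plane wave the acoustic pressure is expressed by:

$$P(\omega, \vec{r}) = P_0(\omega)\, e^{-j\vec{k}\cdot\vec{r}} \tag{27}$$

where $\vec{k}$ is the wave vector. Euler's equation gives the acoustic velocity as a function of the acoustic pressure:

$$\vec{V}(\omega, \vec{r}) = \frac{P(\omega, \vec{r})}{\rho c}\,\frac{\vec{k}}{\lVert\vec{k}\rVert} \tag{28}$$

where $\rho$ is the medium density and $c$ the speed of sound. Therefore the acoustic intensity components are

$$I_i(\omega) = \frac{\lvert P(\omega)\rvert^2}{2\rho c}\,\frac{k_i}{\lVert\vec{k}\rVert} \tag{29}$$

where $i$ represents x, y or z. Thus it is observed that the acoustic intensity vector has the same direction as $\vec{k}$, and can therefore be used to estimate the direction of the sound source, which lies in the opposite direction, $-\vec{k}$.

Bidirectional coincident arrays deliver pressure-gradient information which leads directly to the acoustic velocity. For instance, the horizontal-plane projection of the velocity is given by:

$$\vec{V}_{xy}(\omega) = -\frac{1}{\rho c}\left[\tilde{m}_{bx}(\omega)\,\vec{e}_x + \tilde{m}_{by}(\omega)\,\vec{e}_y\right] \tag{30}$$

where $\tilde{m}_{bx}$ and $\tilde{m}_{by}$ are the spectra of the bidirectional signals. The minus sign reflects that the positive lobe of each microphone faces the source whereas the velocity points along the propagation direction. From Equation (26), with the vertical component taken from the cardioid as $2\tilde{m}_{c3} - P_0$, the acoustic intensity components are:

$$I_x = -\frac{1}{2\rho c}\,\mathrm{Re}\left[P_0^*\,\tilde{m}_{bx}\right],\quad I_y = -\frac{1}{2\rho c}\,\mathrm{Re}\left[P_0^*\,\tilde{m}_{by}\right],\quad I_z = -\frac{1}{2\rho c}\,\mathrm{Re}\left[P_0^*\left(2\tilde{m}_{c3} - P_0\right)\right] \tag{31}$$

Elevation and azimuth angles are then obtained from Equation (29) [6]:

$$\theta = \arctan\left(\frac{I_y}{I_x}\right),\qquad \varphi = \arctan\left(\frac{-I_z}{\sqrt{I_x^2 + I_y^2}}\right) \tag{32}$$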
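The coincident-array processing of this section can be illustrated for a single time/frequency sample as follows. This is a sketch under stated assumptions rather than the authors' program: the capsules are ideal, the source is a single plane wave, the azimuth is measured from the x axis, and all function names are hypothetical. A plain arctangent is used on purpose so that the ambiguity of the intensity method remains visible.

```python
import numpy as np

def simulate_coincident(s0, azimuth, elevation):
    """Ideal capsule outputs for one plane-wave source (angles in radians)."""
    m_bx = s0 * np.cos(elevation) * np.cos(azimuth)   # figure-of-eight along x
    m_by = s0 * np.cos(elevation) * np.sin(azimuth)   # figure-of-eight along y
    m_c3 = s0 * 0.5 * (1.0 + np.sin(elevation))       # cardioid along z
    return m_bx, m_by, m_c3

def estimate_pressure(m_bx, m_by, m_c3):
    """Omnidirectional pressure from the three capsules (undefined if m_c3 == 0)."""
    return (m_bx**2 + m_by**2 + 4.0 * m_c3**2) / (4.0 * m_c3)

def directivity_angles(p0, m_bx, m_c3):
    """Cardioid-directivity estimate: unsigned azimuth and elevation."""
    m1 = 0.5 * (p0 + m_bx)          # virtual cardioid pointing to +x
    m2 = 0.5 * (p0 - m_bx)          # virtual cardioid pointing to -x
    mono = m1 + m2                  # monophonic signal
    elevation = np.arcsin(np.clip(2.0 * m_c3 / mono - 1.0, -1.0, 1.0))
    cos_az = (m1 - m2) / (mono * np.cos(elevation))
    return np.arccos(np.clip(cos_az, -1.0, 1.0)), elevation

def intensity_angles(p0, m_bx, m_by, m_c3):
    """Intensity-based estimate: azimuth known only modulo 180 degrees."""
    # Vector pointing towards the source, i.e. opposite to the intensity vector
    # (the common factor 1 / (2 * rho * c) cancels in the ratios below).
    dx, dy, dz = p0 * m_bx, p0 * m_by, p0 * (2.0 * m_c3 - p0)
    azimuth = np.arctan(dy / dx)    # undefined for dx == 0 (source at +/-90 degrees)
    elevation = np.arctan(dz / np.hypot(dx, dy))
    return azimuth, elevation

# Source at azimuth 120 degrees, elevation 30 degrees, amplitude 0.8.
m_bx, m_by, m_c3 = simulate_coincident(0.8, np.radians(120.0), np.radians(30.0))
p0 = estimate_pressure(m_bx, m_by, m_c3)
az_dir, el_dir = directivity_angles(p0, m_bx, m_c3)
az_int, el_int = intensity_angles(p0, m_bx, m_by, m_c3)
print(np.degrees([az_dir, el_dir, az_int, el_int]))
```

For this source the directivity estimate returns an azimuth of about 120 degrees and the intensity estimate about -60 degrees, while both return an elevation of about 30 degrees. The 180-degree offset between the two azimuths is the ambiguity of the arctangent; the sign of the directivity azimuth is likewise unknown without further information.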
Solving Ambiguity Localization for Coincident Arrays

FIGURE 3 Angle estimation ambiguity: cardioid directivity method (continuous blue line; see Equation (9)), acoustic intensity method (dotted green line; see Equation (32)), theoretical position (red dashed line).

TABLE 1 Ambiguity resolution by cross-checking the angular estimation from the directivity (Equation (9)) and intensity (Equation (32)) methods. Columns: real azimuth; estimated azimuth (directivity, intensity); operation to solve the ambiguity (directivity, intensity).

Localization based on the cardioid directivity yields the azimuth angle with a front-rear ambiguity due to the cosine relation involved in its estimation (Equation (9)). This ambiguity can be solved for non-coincident arrays by inserting a delay. For coincident arrays it is possible to use the acoustic intensity to estimate the azimuth angle, but this time with a left-right ambiguity due to the inverse tangent in relation (32). As shown by FIGURE 3 and TABLE 1, the front-rear ambiguity is complementary to the left-right ambiguity. Four cases are pointed out, corresponding to the four combinations of the two ambiguous estimations. The real position can then be found with a conditional test on the two estimates. In theory, once the ambiguity is solved, both methods (i.e. Equation (9) or Equation (32)) give the same result. However they may differ slightly in practice. Depending on the sound scene, the sound stimulus or the noise level, one method can achieve better performance than the other.

LOCALIZATION ASSESSMENT

A computer program simulates the signals which would have been recorded by the two microphone array set-ups previously described. Various stimuli were used (music, random noise, noise band and harmonic tone).

Evaluation Criteria

The azimuth and elevation errors, $\epsilon_\theta$ and $\epsilon_\varphi$, are calculated here as the angular distance between the real location and the estimated one. The total error $\epsilon$ is the angular distance between the real and the estimated location on the sphere (see Equation (33)). The angular distance is calculated using the scalar product of the real and estimated positions.
$$\epsilon = \arccos\left(\frac{\vec{s}\cdot\hat{\vec{s}}}{\lVert\vec{s}\rVert\,\lVert\hat{\vec{s}}\rVert}\right) \tag{33}$$

where

$$\vec{s} = (\cos\varphi\cos\theta,\ \cos\varphi\sin\theta,\ \sin\varphi)^T,\qquad \hat{\vec{s}} = (\cos\hat{\varphi}\cos\hat{\theta},\ \cos\hat{\varphi}\sin\hat{\theta},\ \sin\hat{\varphi})^T \tag{34}$$

are the real and estimated positions. When $\epsilon_\theta$ or $\epsilon_\varphi$ is calculated, $\varphi$ and $\hat{\varphi}$, or $\theta$ and $\hat{\theta}$, are fixed to 0 respectively. The error is expressed in degrees, where 0 indicates the best estimation and 180 the worst one. In addition a new criterion is proposed, computed as the error level attained by at least 75% of the 1/3-octave spectrum. It will be referred to as the 75% criterion.

Results

FIGURE 4 Source localization of random noise moving around and from the bottom to the top of the 3-cardioid microphone array. Horizontal-plane microphones are separated by 2 cm. a) Source location at 1036 Hz, azimuth (blue) and elevation (green). b) 75% criterion, azimuth (blue), elevation (green), total (red). c) Elevation source location error. d) Azimuth localization error.

FIGURE 4 depicts the results obtained when localizing a random noise moving around and from the bottom to the top. Localization accuracy is affected by the microphone spacing because the reconstruction of the omnidirectional pressure is altered. As shown in FIGURE 4c, high frequencies are more affected, as the wavelength gets closer to the microphone distance. For high elevations, the variations of the cardioid pattern of the vertical microphone are slow, which results in poor angular discrimination. As the azimuth localization uses the elevation (cf. Equation (20)), the azimuth estimation is consequently degraded, as shown by FIGURE 4d.

FIGURE 5 Source localization of random noise in azimuth (blue) and elevation (green) picked up by the 2 bidirectional + 1 cardioid microphone array at 1036 Hz, found with a) directivity only, b) intensity only, c) directivity and intensity.

When coincident arrays are used, the estimation of the pressure signal is more accurate (see Equation (23)) and the results are not affected by the elevation at any frequency (see FIGURE 5). However, in practice perfectly coincident microphone arrays are impossible to build. As a consequence some artifacts will always occur at high frequencies.

As specified before, it is assumed that only one source is present at each moment and at each frequency. In practice, the acoustic field is complex and more than one source is present, which affects the localization accuracy (see FIGURE 6 and FIGURE 7).
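The cross-check described under "Solving Ambiguity Localization for Coincident Arrays", which leads to the combined estimate of FIGURE 5c, and the angular-distance criterion of Equation (33) can be sketched as follows. The sign rule is one possible conditional test derived here from the two ambiguities, not necessarily the operations listed in TABLE 1. It assumes an azimuth measured from the x axis, a directivity estimate in [0, 180] degrees and an intensity estimate in (-90, 90) degrees; the function names are hypothetical.

```python
import numpy as np

def resolve_azimuth(az_directivity, az_intensity):
    """Merge the two ambiguous azimuth estimates (radians) into one signed azimuth."""
    # az_directivity equals |azimuth|: the magnitude is known, the sign is not.
    # az_intensity shares tan(azimuth), so sin(azimuth) has the sign of
    # tan(az_intensity) * cos(az_directivity).
    sign = np.sign(np.tan(az_intensity) * np.cos(az_directivity))
    sign = np.where(sign == 0, 1.0, sign)   # sources on the x axis: sign is irrelevant
    return sign * az_directivity

def unit_vector(azimuth, elevation):
    """Cartesian unit vector of a direction given in spherical angles (radians)."""
    return np.array([np.cos(elevation) * np.cos(azimuth),
                     np.cos(elevation) * np.sin(azimuth),
                     np.sin(elevation)])

def angular_error_deg(az_real, el_real, az_est, el_est):
    """Angular distance in degrees from the scalar product of the two directions."""
    dot = np.dot(unit_vector(az_real, el_real), unit_vector(az_est, el_est))
    return np.degrees(np.arccos(np.clip(dot, -1.0, 1.0)))

# One source per quadrant: emulate what each method alone would return.
for true_deg in (30.0, 120.0, -30.0, -120.0):
    true_az = np.radians(true_deg)
    az_dir = np.arccos(np.cos(true_az))      # front-rear ambiguous estimate
    az_int = np.arctan(np.tan(true_az))      # estimate ambiguous by 180 degrees
    az_est = resolve_azimuth(az_dir, az_int)
    azimuth_error = angular_error_deg(true_az, 0.0, az_est, 0.0)  # elevations fixed to 0
    print(true_deg, float(np.degrees(az_est)), float(azimuth_error))
```

In each of the four quadrants the combined estimate recovers the true azimuth and the azimuth error is zero up to rounding. The rule becomes numerically fragile for sources close to ±90 degrees, where the cosine vanishes, so a practical implementation would need a tolerance or a fallback to one of the two methods there.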
When several sources overlap, low-energy signals close to high-energy signals are localized in the direction of the stronger one. When the level of the target sound source is more than 12 dB above the disturbing noise, the localization error is weak: some front-rear confusions are introduced below 500 Hz, since the phase information of the target source is altered by the disturbing second source. Obviously this is only observed in the case of the cardioid array. The influence of the disturbing source becomes insignificant for level differences higher than 20 dB.

FIGURE 6 Source localization evaluation of random noise: elevation error (1st row), azimuth error (2nd row) and 75% criterion (3rd row), picked up with the 3-cardioid microphone array in the presence of a disturbing second noise source at θ = 0 and φ = 0 and a level difference of a) 0 dB, b) -12 dB, c) -20 dB.

In contrast with FIGURE 6, it is observed for the bidirectional array that the error increases unexpectedly for low elevations (see FIGURE 7-3c). Indeed the target source level is disadvantaged by the array directivity, which is deaf in those directions. This phenomenon can be turned into an advantage if the disturbing noise is placed in the deaf area of the array. On the contrary, for the cardioid array the target source is picked up homogeneously in all directions. As suggested by FIGURE 6, the impact of an azimuth error on the total angular distance depends on the elevation angle. FIGURE 7 clearly shows that an azimuth error has less impact when the source is near the poles.

CONCLUSION AND FUTURE WORKS

In order to provide 3D audio tools for consumer devices, we propose a recording solution using a small number of microphones. The omnidirectional pressure signal is recomposed and compared with the directional microphone outputs. The sound source location is estimated in azimuth and elevation using the microphone directivity. However the localization is front-rear ambiguous. For non-coincident arrays, this ambiguity is solved by the time difference between microphones, whereas for coincident arrays, the acoustic intensity vector is used. In the case of non-coincident arrays, the omnidirectional pressure component is not properly reconstructed at wavelengths close to the microphone spacing, which alters the localization.
The sound scene analysis presented in this paper can be used as the first step of an object-based spatial audio representation. Using the information of source position, it is possible to render the sound scene over any type of spatial audio system such as stereo, 5.1, 7.1, 22.2, Higher Order Ambisonics [7] or Wave Field Synthesis [8].

FIGURE 7 Source localization evaluation of random noise: elevation error (1st row), azimuth error (2nd row) and 75% criterion (3rd row), picked up with the 2 bidirectional + 1 cardioid microphone array in the presence of a disturbing second noise source at θ = 0 and φ = 0 and a level difference of a) 0 dB, b) -12 dB, c) -20 dB.

REFERENCES

[1] J. Sunier, The story of stereo: 1881-. Gernsback Library, 1960.
[2] R. Nicol, «Représentation et perception des espaces auditifs virtuels», HDR, Université du Maine, Le Mans, France, 2010.
[3] J. Jouhaneau, Notions élémentaires d'acoustique: Électroacoustique. Tec & Doc Lavoisier, 1999.
[4] M. A. Gerzon and P. G. Craven, «Coincident microphone simulation covering three dimensional space and yielding various directional outputs», U.S. Patent 4,042,779, 16 August 1977.
[5] M. Bruneau, Manuel d'acoustique fondamentale. Hermès, 1998.
[6] V. Pulkki, «Directional audio coding in spatial sound reproduction and stereo upmixing», in Proc. of the AES 28th Int. Conf., Piteå, Sweden, 2006.
[7] J. Daniel, «Evolving views on HOA: From technological to pragmatic concerns», Ambisonics Symposium 2009, June 25-27, Graz, 2009.
[8] A. J. Berkhout, D. de Vries and P. Vogel, «Acoustic Control by Wave Field Synthesis», J. Acoust. Soc. Am., 1993, 93, pp. 2764-2778.