The effects of the excitation source directivity on some room acoustic descriptors obtained from impulse response measurements

PROCEEDINGS of the 22nd International Congress on Acoustics

Challenges and Solutions in Acoustical Measurements and Design: Paper ICA2016-484

Luana Torquete Lara (a), Alexander Mattioli Pasqual (b), Marco Antônio de Mendonça Vecci (c)

(a) Federal University of Minas Gerais, Graduate Program in Electrical Engineering, Brazil, luanatorquete@gmail.com
(b) Federal University of Minas Gerais, Department of Mechanical Engineering, Brazil, ampasqual@demec.ufmg.br
(c) Federal University of Minas Gerais, Department of Structural Engineering, Brazil, vecci@dees.ufmg.br

Abstract

Reverberation time, early decay time, center time, clarity, and definition are classical objective metrics to assess the acoustic performance of closed spaces. These parameters can be evaluated from the room impulse response, which is usually estimated through measurements of the sound pressure field produced by an omnidirectional loudspeaker array inside the room. However, typical sound sources in real situations are never truly omnidirectional. Therefore, the question arises whether or not the so-obtained room acoustic descriptors are still meaningful for different source directivities. In order to shed light on this question, we conducted a set of measurements using a directivity-controlled sound source, which is a compact array of independently driven loudspeakers, so that it can operate as a monopole, dipole, or quadrupole source by controlling the signals sent to the loudspeakers. The impulse responses of two classrooms of the same volume, with and without acoustic conditioning, and the corresponding room descriptors were experimentally obtained for different source directivities, namely, a monopole, dipoles with several distinct orientations, and a quadrupole. We observed that the directivity significantly affects the room acoustic parameters, regardless of the acoustic conditioning. Also, the descriptors most sensitive to directivity changes were those related to the balance between early and late arriving energy, such as clarity, definition, and center time. These results show that the directional characteristic of the sound source plays an important role in room acoustics, and thus it cannot be neglected when designing and/or assessing a closed space.

Keywords: room impulse response; source directivity; acoustic descriptors

1 Introduction

In designing and evaluating the acoustics of a room, the relation between the physical properties of the sound field and the subjective sensations of the listener plays a major role [1]. Indeed, a number of acoustic descriptors (such as early decay time, definition, and clarity index) have been proposed to approximately evaluate some subjective aspects of human perception. Most of these descriptors can be derived from the room impulse response (RIR) and, generally speaking, are associated with the time history of the energy inside the room.

The RIR can be estimated by exciting the room with a loudspeaker driven by a known voltage signal and measuring the resulting acoustic pressure with microphones. Among the excitation signals most often used in room acoustics, the exponential sine sweep has been shown to provide good results for empty rooms with low background noise [2, 3, 4]. As far as the actuator is concerned, it is common practice to use a dodecahedral loudspeaker array in order to approximately achieve an omnidirectional radiation pattern [5, 6]. However, natural sound sources in rooms are hardly ever omnidirectional, so the question arises whether or not the so-obtained room acoustic descriptors are still meaningful for different source directivities.

This work describes a set of room acoustic experiments we conducted to investigate the effects of the source directivity on the following descriptors: reverberation time, early decay time, and the parameters related to the balance between early and late arriving energy, especially the definition. A directivity-controlled source made up of four independent loudspeakers was used to excite the room with a given exponential sine sweep signal, but with different radiation patterns. Then, for each directivity, the acoustic descriptors were obtained from the estimated RIRs. The experiments were carried out in two classrooms of the Federal University of Minas Gerais (Belo Horizonte/MG, Brazil), which have the same dimensions (enclosed volume of approximately 180 m³), but only one of which is acoustically conditioned.

The paper is organized as follows. Section 2 gives an overview of the RIR and some room acoustic descriptors. Next, the measurement protocol is described in Section 3. Then, Sections 4 and 5 present the experimental results and the conclusions, respectively.

2 Room acoustics

2.1 Impulse response

As far as acoustic phenomena are concerned, linearity and time invariance are assumptions that commonly hold in a room, so that most of the acoustic descriptors can be obtained from its impulse response h(t). In this paper, the RIR is experimentally obtained by using the method presented in Ref. [2]. Accordingly, the input signal is an exponential sine sweep,

\[
x(t) = \sin\left[ \frac{\omega_1 T}{\ln(\omega_2/\omega_1)} \left( e^{\frac{t}{T}\ln(\omega_2/\omega_1)} - 1 \right) \right]
\]

where t is time, T is the signal duration, and ω1 and ω2 are the initial and final angular frequencies, respectively. A representative spectrogram of x(t) is given on the left side of Figure 1. This signal has a pink-noise-like spectrum (3 dB/octave decay), and thus it improves the signal-to-noise ratio at low frequencies, where it is usually more critical due to the poor loudspeaker response in the low-frequency range.

Because x(t) is not an impulse, the inverse signal, x_inv(t), must be used to derive h(t). This inverse signal is defined such that x(t) ∗ x_inv(t) = δ(t), where δ(t) is the unit impulse function and ∗ denotes convolution. Therefore, if y(t) = h(t) ∗ x(t) is the system response to the exponential sine sweep, the convolution of the inverse signal with y(t) yields the impulse response, i.e., h(t) = y(t) ∗ x_inv(t).

There are two main advantages of working with the signals directly in the time domain instead of transforming them to the frequency domain via the discrete Fourier transform (DFT). First, there is no time-domain aliasing due to the circular convolution related to the system identification via DFTs. Second, and more important, the loudspeaker non-linearities can be easily removed from the estimated h(t), as described in Ref. [2]. However, the computational cost of evaluating the linear convolution is significantly larger than that of computing the DFTs via FFT (fast Fourier transform) algorithms.

Figure 1: Spectrogram of the excitation signal (on the left) and its inverse (on the right)

The inverse signal is obtained by time-reversing x(t), time-shifting it by T (i.e., replacing t with T - t) to satisfy the causality requirement, and shaping its magnitude to compensate for the 3 dB/octave decay rate. By doing so, the inverse signal takes the form

\[
x_{\mathrm{inv}}(t) = g(t)\, x(T - t)
\]

where g(t) is an amplitude envelope that decays with time so as to compensate for the 3 dB/octave decay of the sweep spectrum.

A representative spectrogram of the inverse signal is given on the right side of Figure 1.

In this work, the exponential sine sweep signal, x(t), is used to excite the room via loudspeakers. Then, the response measured by the microphones, y(t), is convolved with the inverse signal, x_inv(t), which gives the estimated RIR. Next, the RIR is filtered in frequency bands. Finally, the room descriptors are obtained. Sixth-order Butterworth octave-band digital filters (class 0) are used, which were designed with the fdesign.octave function in Matlab. The standard center frequencies are adopted, namely, 125 Hz, 250 Hz, 500 Hz, 1 kHz, 2 kHz, 4 kHz, and 8 kHz.

2.2 Acoustic descriptors

2.2.1 Reverberation time and early decay time

ISO 3382 defines the reverberation time as the time, expressed in seconds, that would be required for the sound pressure level to decrease by 60 dB, at a rate of decay given by the linear least-squares regression of the measured decay curve from a level 5 dB below the initial level to 35 dB below. Notice that the signal dynamic range is not required to be 60 dB: a significantly smaller dynamic range can be used to derive the decay rate, and then the reverberation time can be obtained by extrapolation. If the decay rate is calculated by taking the range from 5 dB to 35 dB below the initial level, the reverberation time is labeled T30. Similarly, if the range from 5 dB to 25 dB is considered, the reverberation time is labeled T20 [7]. In order to provide good intelligibility, a small reverberation time is desirable in rooms for speech communication, such as classrooms and meeting rooms.

Schroeder's integration method leads to the decay curve, E(t), once the RIR has been obtained [8]:

\[
E(t) = \int_{t}^{\infty} h^2(\tau)\, d\tau
\]

The infinite integration limit corresponds to the ideal situation with no background noise. In practice, the integration limit is defined by the amount of background noise.
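The decay-curve evaluation described above can be sketched in Python as follows. This is only a minimal illustration, not the Matlab processing used in this work: the function names, the synthetic impulse response (exponentially decaying noise), and its nominal reverberation time of 0.8 s are assumptions introduced for the example. The backward integration is truncated at the end of the available signal, and the decay rate is obtained from a least-squares line fitted over the stated level range and then extrapolated to 60 dB.

```python
import numpy as np

def schroeder_decay_db(h):
    """Backward-integrated energy decay curve in dB, normalised to 0 dB at t = 0."""
    energy = np.cumsum(h[::-1] ** 2)[::-1]   # sum of h^2 from each sample to the end
    return 10.0 * np.log10(energy / energy[0])

def decay_time(decay_db, fs, upper_db, lower_db):
    """Fit a line between two levels of the decay curve and extrapolate to 60 dB."""
    t = np.arange(decay_db.size) / fs
    mask = (decay_db <= upper_db) & (decay_db >= lower_db)
    slope, _ = np.polyfit(t[mask], decay_db[mask], 1)   # slope in dB per second
    return -60.0 / slope

# Synthetic RIR (assumption for the demo): noise whose amplitude drops by 60 dB in rt_true
fs = 44100
rt_true = 0.8
rng = np.random.default_rng(0)
t = np.arange(int(2.0 * fs)) / fs
h = rng.standard_normal(t.size) * np.exp(-3.0 * np.log(10.0) * t / rt_true)

edc = schroeder_decay_db(h)
t20 = decay_time(edc, fs, -5.0, -25.0)   # range from 5 dB to 25 dB below the initial level
t30 = decay_time(edc, fs, -5.0, -35.0)   # range from 5 dB to 35 dB below the initial level
print(f"T20 = {t20:.2f} s, T30 = {t30:.2f} s")
```

For a measured RIR, the same two functions would be applied to each octave-band filtered response, with the upper integration limit chosen according to the background noise level.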
Like the reverberation time, the early decay time (EDT) is expressed as the time required for the sound pressure level to decrease by 60 dB. However, EDT is calculated by considering only a 10 dB dynamic range of the decay curve, from 0 dB to 10 dB below the initial level. EDT is more closely related to the human perception of reverberation, whereas the reverberation time is more closely related to the physical attributes of the room [7].

2.2.2 Center time, definition, and clarity index

These parameters are used to evaluate the balance between early and late arriving energy. The center time, Ts, is the time of the center of gravity of the squared RIR. It is given by

\[
T_s = \frac{\int_{0}^{\infty} t\, h^2(t)\, dt}{\int_{0}^{\infty} h^2(t)\, dt}
\]

The definition, D, is the ratio between the energy contained in the first 50 ms of the RIR and the total energy of the signal [7]:

\[
D = \frac{\int_{0}^{50\,\mathrm{ms}} h^2(t)\, dt}{\int_{0}^{\infty} h^2(t)\, dt}
\]

Center time and definition are useful descriptors to evaluate speech intelligibility, which benefits from small Ts values or large D values; both correspond to more early energy, i.e., less perceived reverberance. The clarity index is similar to the definition, but it relates the early energy to the late energy rather than to the total energy, it uses the first 80 ms (a limit more suitable for music) instead of 50 ms, and it is given in dB. If clarity is evaluated by considering only the first 50 ms, it is labeled C50 [1]. In this case, definition and clarity give the same information. Indeed, it can be shown that

\[
C_{50} = 10 \log_{10}\left( \frac{D}{1 - D} \right)\ \mathrm{dB}
\]

3 Methodology

3.1 Equipment and software

The following equipment and software were used in this work:

- Directivity-controlled sound source (see Figure 2): an array of four 2-in electrodynamic loudspeakers mounted on a cubic cabinet of side 100 mm, designed and built by the authors;
- Microphones: Behringer ECM8000, omnidirectional;
- Multichannel audio interface: PreSonus FireStudio Project;
- Multichannel class-D audio amplifier: Nashville;
- MacBook running Pure Data and Matlab.

3.2 Experimental procedures

The experimental protocol is depicted in Figure 2. An exponential sine sweep (see Section 2.1) with initial and final frequencies of 50 Hz and 20 kHz, a duration of 15 s, and a sampling frequency of 44100 samples/s was generated in Matlab and saved to a file in the wav format. During the experiments, the laptop running Pure Data reads this file, computes the loudspeaker signals for the desired directivity specified by the user, and sends them to the audio interface, which performs the D/A conversion and feeds the loudspeakers via the audio amplifiers. Microphones distributed inside the room sense the acoustic pressure and send the corresponding signals to the audio interface via its analog inputs. The audio interface performs the A/D conversion and communicates with Pure Data, which records the measured signals in the wav format.
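As an illustration of the excitation signal described above, the following Python sketch generates an exponential sine sweep with the stated parameters (50 Hz to 20 kHz, 15 s, 44100 samples/s) and checks that convolving it with an inverse signal gives an impulse-like result. It is not the authors' Matlab code. The function names are hypothetical, and both the closed-form sweep phase and the exponentially decaying envelope of the inverse signal follow the usual formulation of the swept-sine method of Ref. [2], so they should be read as assumptions. The linear convolution is evaluated here with zero-padded FFTs purely for speed; the zero padding avoids the circular wrap-around mentioned in Section 2.1.

```python
import numpy as np

def exponential_sweep(f1, f2, duration, fs):
    """Exponential sine sweep from f1 to f2 (Hz) lasting `duration` seconds."""
    t = np.arange(int(round(duration * fs))) / fs
    rate = np.log(f2 / f1)   # ln(w2 / w1)
    phase = 2.0 * np.pi * f1 * duration / rate * (np.exp(t * rate / duration) - 1.0)
    return t, np.sin(phase)

def inverse_sweep(x, t, f1, f2, duration):
    """Time-reversed sweep with a decaying amplitude envelope (assumed form)."""
    envelope = np.exp(-t * np.log(f2 / f1) / duration)
    return x[::-1] * envelope

def linear_convolve(a, b):
    """Linear (not circular) convolution computed with zero-padded FFTs."""
    n = a.size + b.size - 1
    nfft = 1 << (n - 1).bit_length()   # next power of two that holds the full result
    spectrum = np.fft.rfft(a, nfft) * np.fft.rfft(b, nfft)
    return np.fft.irfft(spectrum, nfft)[:n]

fs = 44100
f1, f2, duration = 50.0, 20000.0, 15.0
t, x = exponential_sweep(f1, f2, duration, fs)
x_inv = inverse_sweep(x, t, f1, f2, duration)

# Sanity check without a room: y(t) = x(t), so the result should be a
# band-limited impulse delayed by the length of the causal inverse signal.
h_est = linear_convolve(x, x_inv)
peak_index = int(np.argmax(np.abs(h_est)))
delay_samples = x.size - 1
print(peak_index, delay_samples)
```

In an actual measurement, the recorded microphone signal would take the place of `x` in the call to `linear_convolve`, and the estimated RIR would start at `delay_samples`.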
Afterwards, in an off-line post-processing step, the wav files are read in Matlab, and the RIRs and the corresponding acoustic descriptors are calculated as described in Section 2.

Measurements were carried out in November 2014 in two classrooms of the Engineering School of the Federal University of Minas Gerais: room 2418b and room 4408. These rooms have almost the same dimensions, as shown in Figure 3; the heights of rooms 2418b and 4408 are 3.50 m and 3.36 m, respectively. The volume of each room is approximately 180 m³. However,

unlike room 2418b, room 4408 is acoustically conditioned by absorbing panels placed on the rear wall (yellow area in Figure 3) and on the ceiling. In rooms 2418b and 4408, four and eight microphones were used, respectively; their positions (P1 to P8) are given in Table 1 and Figure 3. The measurements were conducted with two persons inside the room, with the usual furniture, and with the door and windows closed. The windows are located on the wall X = 0 in both rooms (see Figure 3).

Figure 2: Schematic of the hardware used in the experiments (laptop, multichannel audio interface, multichannel audio amplifier, and microphones), with a picture of the directivity-controlled sound source

Figure 3: Source and microphone positions in rooms 2418b (on the left) and 4408 (on the right), showing the blackboard, the windows, and the acoustic panel

Four basic source directivities were pre-programmed in Pure Data: monopole mode, dipole mode on the X-axis, dipole mode on the Y-axis, and lateral quadrupole mode. These modes can be obtained with ease by letting the loudspeakers operate in phase or out of phase, as shown in Table 2, where the symbols +1 and -1 indicate the phase relations, and 0 means no signal. It is worth mentioning that these are not truly monopole, dipole, and quadrupole radiation patterns, but they provide a good approximation in the low-frequency range due to the

small spacing between the loudspeakers. The positions of the loudspeakers inside the rooms are shown in Figure 3 (AF1 to AF4).

Table 1: Microphone positions

             Room 2418b         Room 4408
Microphone   X (m)    Y (m)     X (m)    Y (m)
P1           3.60     3.2       3.62     3.24
P2           1.62     4.10      1.55     4.38
P3           6.67     3.20      6.10     4.38
P4           5.52     1.47      6.20     2.72
P5           -        -         4.52     1.23
P6           -        -         2.42     1.23
P7           -        -         1.15     2.76
P8           -        -         0.61     1.08

Table 2: Directivity modes and corresponding phase relations of the loudspeakers

Axis   Directivity mode   AF1   AF2   AF3   AF4
-      Monopole           +1    +1    +1    +1
Y      Dipole AF3-AF1     +1    0     -1    0
X      Dipole AF4-AF2     0     +1    0     -1
-      Quadrupole         +1    -1    +1    -1

Different directivities can be obtained by linear combination of the pre-programmed modes. In particular, a dipole mode with any desired orientation in the horizontal plane can be generated by properly combining the two orthogonal pre-programmed dipoles. In the experiments described here, this was done to obtain dipoles with the following orientations relative to the negative Y-axis: 0°, 15°, 30°, 45°, 60°, 75°, 90°, -75°, -60°, -45°, -30°, -15°. In addition to these 12 dipole-like directivities, the monopole-like and quadrupole-like directivities were considered, so that the RIRs for 14 different source directivities were obtained. For each directivity, we conducted five measurements in room 2418b and three measurements in room 4408.

4 Results

For both classrooms under investigation, we observed that the parameters T20 and EDT do not change significantly with microphone position. Therefore, Figures 4 and 5 show spatially averaged values as a function of source directivity per octave band. As expected, inspection of Figures 4 and 5 reveals that T20 and EDT decrease as frequency increases. Also, room 4408 presents T20 and EDT values significantly smaller than those of room 2418b. Moreover, the source directivity does not seem to play an important role in T20 and EDT. Because

the quadrupole is a very inefficient radiator at low frequencies, we did not obtain a signal-to-noise ratio high enough at 250 Hz, and thus the large T20 values obtained for the quadrupole at this octave band are not meaningful.

It is worth noting that ANSI S12.60 [9] recommends a maximum allowable reverberation time of 0.6 s for core learning rooms with volumes smaller than 283 m³ in the octave bands of 500, 1000, and 2000 Hz. Figure 4 shows that room 2418b has reverberation times far above the ANSI recommendation, whereas room 4408 complies with ANSI S12.60 regardless of the source directivity.

Figure 4: Spatially averaged T20 (s) as a function of source directivity in rooms 2418b (on the left) and 4408 (on the right), for the octave bands 250 Hz, 500 Hz (o), 1 kHz (x), 2 kHz (+), 4 kHz, and 8 kHz

Figure 5: Spatially averaged EDT (s) as a function of source directivity in rooms 2418b (on the left) and 4408 (on the right), for the octave bands 250 Hz, 500 Hz (o), 1 kHz (x), 2 kHz (+), 4 kHz, and 8 kHz

Unlike T20 and EDT, we observed that the acoustic descriptors related to the balance between early and late arriving energy were greatly influenced by the microphone position and the source directivity. This is illustrated in Figures 6 and 7, which present the definition for positions

P1 and P3 (see Figure 3), respectively, as a function of source directivity. The continuous lines indicate the minimum values, and the curves drawn with discrete markers correspond to the maximum values. It is worth mentioning that, for the 8-kHz octave band, the pre-programmed source modes lead to very intricate radiation patterns, so that it is difficult to interpret the results. For this reason, the 8-kHz curves are not shown here.

Figure 6: Definition, D50 (%), at position P1 as a function of source directivity in rooms 2418b (on the left) and 4408 (on the right), for the octave bands 250 Hz, 500 Hz (o), 1 kHz (x), 2 kHz (+), and 4 kHz

Figure 7: Definition, D50 (%), at position P3 as a function of source directivity in rooms 2418b (on the left) and 4408 (on the right), for the octave bands 250 Hz, 500 Hz (o), 1 kHz (x), 2 kHz (+), and 4 kHz

It is interesting to compare the results for the 12 dipole orientations, dip -75 to dip 90. The 0° orientation (dip 0) corresponds to a main radiation axis along the Y-axis (see Figure 3), and thus it passes through P1 in both rooms. Therefore, this dipole orientation leads to a larger amount of direct energy at P1 than the other dipole orientations, which tends to improve the definition, as shown in Figure 6. Similarly, Figure 7 shows that the largest definition values occur around the same dipole orientation in room 2418b regardless of frequency, which corresponds approximately to a dipole

whose main radiation axis passes through P3. However, at P3 in room 4408, the obtained results cannot be explained in such a simple way because the overall behavior of the curves depends strongly on frequency, which may be due to the non-symmetrical distribution of the absorbing panels in the room.

5 Conclusion

For the two classrooms investigated, we observed that, unlike T20 and EDT, the definition is strongly affected by the source directivity and microphone position, regardless of frequency and acoustic conditioning. This shows that the directional characteristic of the sound source plays an important role in room acoustics, and thus it cannot be neglected when designing and/or assessing a closed space. Finally, it is worth emphasizing that the loudspeaker array used in this work is able to mimic the monopole, the horizontal dipoles, and a lateral quadrupole only in the low- and medium-frequency ranges, so that the results presented here must be considered with care at high frequencies.

Acknowledgments

This research was supported by Fundação de Amparo à Pesquisa do Estado de Minas Gerais (FAPEMIG, Brazil; project APQ-00648-12).

References

[1] Kuttruff, H. Room Acoustics. Spon Press, London (England), 5th edition, 2009.
[2] Farina, A. Simultaneous measurement of impulse response and distortion with a swept-sine technique. Proceedings of the 108th Convention of the Audio Engineering Society, Paris, France, February 19-22, 2000, pp 1-23, paper number 5093.
[3] Stan, G.-B.; Embrechts, J.-J.; Archambeau, D. Comparison of different impulse response measurement techniques. Journal of the Audio Engineering Society, Vol 50 (4), 2002, pp 249-262.
[4] Müller, S.; Massarani, P. Transfer-function measurement with sweeps. Journal of the Audio Engineering Society, Vol 49 (6), 2001, pp 443-471.
[5] Pasqual, A. M. Spherical harmonic analysis of the sound radiation from omnidirectional loudspeaker arrays. Journal of Sound and Vibration, Vol 333, 2014, pp 4930-4941.
[6] Knüttel, T.; Witew, I. B.; Vorländer, M. Influence of omnidirectional loudspeaker directivity on measured room impulse responses. The Journal of the Acoustical Society of America, Vol 134 (5), 2013, pp 3654-3662.
[7] International Organization for Standardization, ISO 3382: Acoustics - Measurement of the reverberation time of rooms with reference to other acoustical parameters, Switzerland, 2nd revision, 1997.
[8] Schroeder, M. R. New method of measuring reverberation time. The Journal of the Acoustical Society of America, Vol 37, 1965, pp 409-412.
[9] American National Standards Institute and Acoustical Society of America, ANSI S12.60: Acoustical performance criteria, design requirements, and guidelines for schools - Part 1: Permanent schools, New York, 2010.