PAPER Enhanced Vertical Perception through Head-Related Impulse Response Customization Based on Pinna Response Tuning in the Median Plane


IEICE TRANS. FUNDAMENTALS, VOL.E91-A, NO.1 JANUARY 2008

Ki Hoon SHIN a), Nonmember and Youngjin PARK, Member

SUMMARY The human ability to perceive the elevation of a sound and to distinguish whether a sound is coming from the front or the rear strongly depends on the monaural spectral features of the pinnae. In order to realize an effective virtual auditory display by HRTF (head-related transfer function) customization, the pinna responses were isolated from the median-plane HRIRs (head-related impulse responses) of the 45 individual HRIR sets in the CIPIC HRTF database and modeled as linear combinations of 4 or 5 basic temporal shapes (basis functions) per elevation on the median plane by PCA (principal components analysis) in the time domain. By tuning the weight of each basis function computed for a specific height, replacing the pinna response in the KEMAR HRIR at the same height with the resulting customized pinna response, and listening to the filtered stimuli over headphones, 4 individuals with normal hearing sensitivity were able to create a set of HRIRs that outperformed the KEMAR HRIRs in producing vertical effects with reduced front/back ambiguity in the median plane. Since the monaural spectral features of the pinnae are almost independent of azimuthal variation of the source direction, similar vertical effects could also be generated at different azimuthal directions simply by varying the ITD (interaural time difference) according to the direction as well as the size of each individual's own head.

key words: HRTF customization, HRIR, pinna response tuning, principal components analysis

1. Introduction

The ability of humans to use sonic cues to localize a sound in the surrounding 3-dimensional space is referred to as auditory localization.
At its very core lies the head-related transfer function (HRTF), which comprises the major cues for spatial hearing such as the ITD (interaural time difference), ILD (interaural level difference), and spectral modification induced by the pinna folds. Synthesis of spatial hearing based on HRTFs is of great practical and research importance, and non-individualized HRTFs measured with a dummy head microphone system (the KEMAR, for instance) are used for most virtual audio syntheses. However, subjective evaluations of these non-individualized HRTFs involving a group of individuals often report front/back reversal and poor vertical effects. Both front/back distinction and vertical perception for humans are mainly triggered by the spectral features (peaks and notches) produced by the direction-dependent filtering of the pinna as described by Shaw and Teranishi [1]. In particular, the importance of spectral notches (or nulls) as localization cues in the median plane (0° azimuth) is supported by Blauert [2] and also by Hebrank and Wright [3]. They concluded that elevation in the median plane, where both ITD and ILD are zero, is cued by a spectral notch whose frequency has a similar dependence on elevation as that previously observed by Shaw and Teranishi in the lateral plane. Further results confirmed this conclusion both in the median plane [4] and in the lateral plane [5]. In an attempt to explain such a prominent feature in HRTFs, Lopez-Poveda and Meddis [6] suggested a diffraction/reflection model based on the posterior wall of the human concha and were able to predict the notch frequencies with reasonable accuracy.

Manuscript received April 9; revised June 29.
The author is with Samsung Electronics, Suwon-City, Republic of Korea.
The author is with KAIST, Science Town, Daejeon, Republic of Korea.
a) kihoon221.shin@samsung.com
DOI: /ietfec/e91 a
Copyright © 2008 The Institute of Electronics, Information and Communication Engineers

More recently, Langendijk and Bronkhorst [7] were able to isolate the frequency bands responsible for front/back and up/down cues in human HRTFs via a series of subjective listening tests. They concluded that front/back cues and up/down cues were located mainly in the 8-16 kHz band and in the 6-12 kHz band, respectively. Both bands lie in the spectral region of the pinna response, which generally spans from 2 kHz to above 14 kHz [8]. Individual pinnae vary widely in size and shape, and the artificial pinnae mounted on the KEMAR are manufactured based on the average dimensions of human pinna cavities. Therefore, the pinna response of the non-individualized HRTF generally cannot match that of each individual HRTF, resulting in front/back confusion and compromised vertical effects for most listeners.

Based on the hypothesis that the structure of an HRTF is closely related to the dimensions and orientation of each individual body part, i.e., head, torso, shoulders, and pinnae, a variety of HRTF customization techniques that modify other people's HRTFs have been introduced to accomplish perceptual fidelity in virtual audio synthesis. Some studies, such as HRTF clustering and selection of a few most representative ones by Shimada et al. [9], a structural model for composition and decomposition of HRTFs by Algazi et al. [10], HRTF scaling in frequency by Middlebrooks [11], and database matching by Zotkin et al. [12], already suggested that the hypothesis is somewhat valid, although perfect localization (equivalent to localization based on the listener's own HRTFs) was never closely achieved. For example, the work of Middlebrooks is based on the idea that the HRTF will be shifted toward the lower frequencies while maintaining its shape when the pinna is scaled up in size. If the listener deduces the source elevation from the positions of peaks and notches in the oncoming sound spectrum, localization with a scaled-up pinna larger than the listener's own pinna will result in a systematic bias in elevation perception, and personalization may be achieved simply by scaling down the HRTF of the scaled-up pinna. However, the pinnae of different individuals differ in many more aspects than a simple scaling, and even a very small change in the shape of the pinna can cause dramatic changes in the HRTF.

The database matching technique suggested by Zotkin et al. [12] relies on the HRTF database released by the CIPIC Interface Laboratory at UC Davis containing 43 sets of individual HRTFs and 2 sets of KEMAR HRTFs along with some anthropometric information. By taking a picture of the listener's own ear and comparing the anthropometric parameters measured from the image to the ones provided in the database, they selected the best matching set of individual HRTFs for virtual auditory synthesis. Although the localization performance on source elevation was improved by 20-30% for 4 out of 6 subjects, this method requires a sophisticated imaging system that can capture the subject's ear at its real-life size and automatically compute the anthropometric dimensions from the image.

In 1984, Morimoto and Aokata [13] introduced the interaural-polar coordinate system and showed that spectral cues similar to those observed in the median plane occur in any sagittal plane. Moreover, Wightman and Kistler [14] conducted a series of experiments in which the produced stimuli contained the ITD signaling one direction and the ILD and pinna cues signaling another direction through manipulation of the ITD in the measured HRTFs of several individuals.
The apparent lateral directions of such stimuli with conflicting cues almost always followed the ITD cue as long as the stimuli included low frequencies. Morimoto et al. [15] proposed a new sound localization method based on [13] that successfully rendered 3-D sound images in a sagittal plane by simulating interaural differences (ITD and ILD) and individual HRTFs measured in the median plane. They further showed that the ITD was dominant in lateral perception by performing localization tests in which either the ITD or the ILD was manipulated separately while the other was kept at zero.

In this paper, a measurement-free yet effective HRTF customization method that can be based on any individual HRTF database of substantial size is proposed. The goal of our study is not the retrieval of exact individual HRTFs. Rather, our goal is the development of hybrid HRTFs that can deliver the necessary vertical perception better than the non-individual HRTFs while reducing front/back reversal for any particular listener. The basic idea is similar to that suggested in [15]. Vertical perception is controlled by modifying the pinna responses extracted from the median HRIRs in any individual HRTF database that does not contain the HRTF of the target subject, and lateral perception is controlled by introducing the head shadow effect to compensate for ILDs and proper ITDs that are represented as simple linear delays. Justification for approximating the HRTF phase as linear functions independent of frequency can be found in the work of Kulkarni et al. [16]. Our method is developed primarily in the time domain because structural decomposition of an HRTF is generally not easy in the frequency domain. An HRIR is a sequence of temporal events of sound waves reaching the ears over multiple paths.
Therefore, the pinna response can be easily extracted from an HRIR simply by clipping away the shoulder/torso response and keeping only the early response, since the pinna is located closest to the ear canal. Brown and Duda [17] argued that most pinna activity occurs in the first 0.7 ms after the arrival of the direct pulse by comparing the KEMAR HRIRs measured with pinna to the HRIRs measured without pinna. However, a more detailed comparison of the data presented in their work reveals that the difference is not so prominent after the first 0.2 ms. Examination of the HRIRs from our HRTF database [18] and those from the CIPIC HRTF database [19] also indicates that most pinna activity with the largest intersubject variation is concentrated in the first 0.2 ms, which corresponds to 10 samples at the sampling rate used.

The proposed HRTF customization procedure consists of the following steps (see Fig. 1). First, the temporal pinna responses, each containing exactly 10 samples from the beginning of the direct pulse, are extracted from a group of individual HRIRs measured in the median plane after all initial time delays are removed. Then, principal components analysis (PCA) is performed on the isolated pinna responses at each selected elevation angle to model them as linear combinations of 4 or 5 basis functions (or principal components) by using the covariance method [20]. A graphical user interface (GUI) designed using MATLAB™ allows the subject to tune the pinna response by changing the weight on each basis function and listening, over a set of headphones (Sennheiser HD 250 linear II), to a broadband stimulus (100 Hz to 20 kHz) filtered with the resulting pinna response aligned with a shoulder/torso response extracted from the KEMAR HRIR at the same elevation angle. KEMAR's shoulder/torso response at each elevation angle can be obtained simply by clipping away the pinna response and linear delay from the corresponding KEMAR HRIR; this step is indicated by the dashed crosses shown in Fig. 1. Adjustment of the weight on each basis function can continue until a satisfactory elevation perception is achieved.

Fig. 1 Outline of procedures for the proposed HRTF customization method.

The proposed HRIR customization procedure also includes the steps for introducing the head shadow effect and individualized ITDs to the customized pinna responses, as shown in Fig. 1, for an accurate virtual auditory synthesis in the entire 3-D space around a target listener's head. However, it should be noted that these interaural differences were ignored in this study because we wanted to verify first the effectiveness of the proposed HRIR customization method in rendering enhanced elevation perception and reduced front/back confusion in the median plane only, where all the interaural differences are zero. A total of 4 subjects with normal hearing sensitivity participated in this study. For performance comparison, the individual HRTFs of these 4 participants were measured in the median plane. Subjective listening tests were performed on the customized HRIRs, individual HRIRs, and the KEMAR HRIRs in order to verify the feasibility of the proposed method.

2.
Method

2.1 PCA of Pinna Responses in the Time Domain

A typical HRIR can be decomposed into a series of temporal sound events as shown in Fig. 2. First, there is an initial time delay due to the distance of the source from the ears. Then, a direct pulse whose amplitude depends on the source distance and shadow effect arrives, followed by a ridge-trough combination caused by reflection and diffraction due to the pinna cavities. The rest of the signal contains reflections from the shoulder, torso, and measurement devices such as the turntable and the vertical hoop stand for holding the point source at the desired angle. Technically, the direct pulse cannot be part of the pinna response, but the early response that lasts for about 0.2 ms after the arrival of the direct pulse is referred to as the pinna response throughout the rest of this paper for convenience. It should be noted that the individual HRIRs used in our analysis are the ones from the CIPIC HRTF database [19], which contains HRTFs obtained from 43 individual subjects plus the KEMAR with 2 sets of pinnae of different size.

Fig. 2 Structural decomposition of an HRIR measured with a B&K HATS (Head And Torso Simulator) with an acoustic point source located at 0° azimuth and 0° elevation [18].

The procedure of the covariance method [20] used for PCA is as follows. Let X be an M by N data matrix containing the extracted pinna responses at a selected elevation angle, where M is the total number of dimensions (10 in this case) and N is the number of available data sets (45 in this case). The empirical mean of X along each dimension m = 1, ..., M can be computed from

$$u[m] = \frac{1}{N} \sum_{n=1}^{N} X[m, n]. \tag{1}$$

The empirical mean of the 45 individual pinna responses measured at 45° elevation is shown for both ears in Fig. 3 as an example. This mean vector u is then subtracted from each column of X to get a mean-subtracted data matrix B:

$$B = X - u\,h \tag{2}$$

where h is a 1 by N row vector of all 1's. The M by M covariance matrix C is obtained from the outer product of B with itself:

$$C = E[B \otimes B] = \frac{1}{N-1}\, B\, B^{*} \tag{3}$$

where * is the conjugate transpose operator.

Fig. 3 Empirical mean of 45 pinna responses per ear collected from the CIPIC HRIRs measured at 45° elevation.

Next, the eigenvalue matrix D and the orthonormal eigenvector matrix V of the covariance matrix C are computed, satisfying the following relationship:

$$C\,V = V\,D \tag{4}$$

where D is an M by M diagonal matrix with the eigenvalues of C on the diagonal. Matrices V and D must be rearranged in order of decreasing eigenvalue. The eigenvalues then represent the energy distribution of the data X among the eigenvectors that form a basis for the data. The cumulative energy content g is the sum of the energy content across the eigenvectors from 1 through m:

$$g[m] = \sum_{q=1}^{m} \lambda_q \tag{5}$$

where $\lambda_q$ is the qth eigenvalue and m = 1, ..., M. By choosing a suitable accuracy bound, which was set to more than 90% of the total energy stored in the original data in our analysis, a subset of the eigenvectors is selected as basis vectors (principal components). The first L columns of V that satisfy the following accuracy bound on the cumulative energy ratio (CER) are chosen as the principal components (PCs):

$$\mathrm{CER}\,(\%) = \frac{g[L]}{g[M]} \times 100 > 90\%. \tag{6}$$

The CER computed for the pinna responses at 45° elevation using the above equation with L = 1, ..., 10 is shown in Fig. 4. It can be seen that at least 5 PCs are required for the modeled data to represent more than 90% of the energy in the original data for both ears, so L = 5 in this case. These 5 PCs obtained for each ear are shown in Fig. 5. Note that the PCs obtained for the left ear pinna responses are almost identical to those obtained for the right ear pinna responses. This was generally the case for other sets of data at different elevation angles. Depending on the elevation angle, the required number of PCs was sometimes 4.

Fig. 4 Cumulative energy ratio (CER in Eq. (6)) plotted with increasing number of PCs for 45° elevation. The number of PCs on the horizontal axis represents L in Eq. (6).

Fig. 5 Five basis functions (principal components: PC1-PC5) of the pinna responses at 45° elevation. The solid lines denote the left ear principal components and the dashed lines denote the right ear principal components.

Now let W be an M by L matrix with the L PCs as its column vectors:

$$W[p, q] = V[p, q] \tag{7}$$

for p = 1, ..., M and q = 1, ..., L. A new data matrix Y, which is a transformation of X onto the L principal components, can be obtained simply by

$$Y = W^{*} B. \tag{8}$$

This new data matrix Y (an L by N matrix) can then be used to retrieve a truncated version of the original data X by

$$\hat{X} = W\,Y + u\,h. \tag{9}$$

In essence, a linear superposition of the L PCs in W, with the nth column of Y as a set of L principal component weights (PCWs), approximately recovers the nth column of the original data X. The left and right pinna responses at 45° elevation for subject 50 from the CIPIC HRTF database, along with the approximations computed using Eq. (9), are plotted for comparison in Fig. 6. It can be seen that 5 PCs are enough to recover the original data with close resemblance.

Fig. 6 Pinna responses at 45° elevation of subject 50 (solid) in the CIPIC HRTF database and their approximations (dashed) computed as a linear combination of the 5 PCs per ear shown in Fig. 5. Left ear responses are plotted in the upper panel and right ear responses in the lower panel.

The 45 sets of 5 PCWs for the left ear PCs shown in Fig. 5 that are required to model all 45 left ear pinna responses in the CIPIC HRTF database are shown in Fig. 7. Note that the spread of PCWs is the largest for PC 1 and the smallest for PC 5. This is a direct consequence of rearranging V and D (Eq. (4)) in order of decreasing eigenvalue, since a larger eigenvalue implies a bigger energy distribution of the original data along the corresponding eigenvector. In other words, the first 2 PCs are more important basis functions than the latter 3 PCs in representing the variation of the original data.

Fig. 7 Five sets of PCWs required in order to recover the original pinna responses in the CIPIC HRTF database as linear combinations of the five PCs (left ear) depicted in Fig. 5. Note that the distribution of the PCWs becomes smaller as the eigenvalue decreases.

Fig. 8 Left ear pinna responses at 45° elevation of 4 randomly selected subjects from the CIPIC HRTF database.

Fig. 9 Left ear pinna responses of subject 8 (solid), subject 60 (dashed), and subject 153 (dotted) from the CIPIC HRTF database at various elevation angles. The numbers on the right indicate the corresponding angles.

The left ear pinna responses of 4 randomly selected individuals at 45° elevation depicted in Fig. 8 show large intersubject variations around 0.08, 0.11, and 0.16 ms. One can easily observe from the left ear PCs in Fig.
5 that the first 3 PCs have ridges at the above temporal positions, indicating that a linear combination of these first 3 PCs with appropriate PCWs can cover most intersubject variation in the shape and amplitude of the ridge-trough pair following the direct pulse. Amplitude variation of the direct pulse can be covered with PC 5 because it has a ridge in the region where the direct pulse is likely to reside. Therefore, by allowing a subject to tune the weight on each PC for customization, one is merely adding a timed ridge-trough pair with adjusted amplitude and an overall level shift to the mean pinna response in Fig. 3.

The left ear pinna responses of 3 randomly selected individuals at elevations from -30° through 210° are plotted in Fig. 9 in order to observe the intersubject variation pattern per elevation angle in the median plane. The most common and salient change in the individual pinna responses as the source climbs in elevation lies in the arrival time and level of the first reflection (second ridge) immediately after the direct pulse (first ridge) and also in the shape and duration of the trough that follows. The temporal interval between the arrivals of the direct pulse and the first reflection contracts as the source rises in the frontal hemisphere up to 60°, where the two pulses merge into a single ridge. The two pulses stay merged for all rear source positions. Meanwhile, the width of the following trough decreases as the source rises to 90°, which is directly over the head, and increases again as the source descends in the rear hemisphere. The above phenomenon is similar to that observed by Hiranaka and Yamasaki [21]. After examining many individual pinna responses in the CIPIC HRTF database, we could conclude that most intersubject variation in pinna responses lies in the amplitude and arrival times of either the direct pulse or the ridge-trough pair, depending on the elevation angle of the source. Note that these intersubject variations become quite small as the source moves into the rear hemisphere, especially when the source lies directly behind the listener at 180°. However, it can be shown that even a very small difference in the time domain yields a large difference in the frequency domain.

2.2 PCW Tuning for Customization

As mentioned above, letting a subject tune the weight on each PC brings an actual change in the shape of the pinna response. Four male subjects with normal hearing sensitivity participated in making customized HRTFs by using the GUI (graphical user interface) depicted in Fig. 10. Sectors in the GUI are bounded by boxes and labeled per function in the figure. A subject may choose any elevation angle from -45° to 230° in the median plane since the HRTFs from the CIPIC HRTF database are available in that angular range at fixed intervals. However, customization was only carried out at 9 specific elevation angles from -30° to 210° at 30° intervals in the median plane in order to compare the localization performance of the customized HRTFs to that of the individual HRTFs of the participants measured at those angles. Balance control in the GUI adjusts the gains applied to the left and right channels, since it is necessary to render sound images in the center before the tuning commences and an interaural difference in perceived levels between the left and right ears is quite common even for individuals with normal hearing sensitivity.
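As a rough illustration of the covariance-method PCA of Sect. 2.1 (Eqs. (1)-(9)) and of the weight tuning introduced above, the following NumPy sketch may be helpful. It is not the MATLAB implementation used in this study: the function names `pinna_pca`, `tuned_pinna_response`, and `custom_hrir`, the random stand-in data, the 200-sample HRIR length, and the assumption that the KEMAR HRIR already has its initial delay removed are all hypothetical choices made for illustration.

```python
import numpy as np


def pinna_pca(X, energy_bound=90.0):
    """Covariance-method PCA of an M x N matrix X (one pinna response per column)."""
    M, N = X.shape
    u = X.mean(axis=1, keepdims=True)            # Eq. (1): empirical mean, shape (M, 1)
    B = X - u                                    # Eq. (2): broadcasting stands in for u h
    C = (B @ B.conj().T) / (N - 1)               # Eq. (3): M x M covariance matrix
    eigvals, V = np.linalg.eigh(C)               # Eq. (4): C V = V D (C is Hermitian)
    order = np.argsort(eigvals)[::-1]            # rearrange by decreasing eigenvalue
    eigvals, V = eigvals[order], V[:, order]
    g = np.cumsum(eigvals)                       # Eq. (5): cumulative energy content
    cer = 100.0 * g / g[-1]                      # Eq. (6): cumulative energy ratio in %
    L = int(np.argmax(cer > energy_bound)) + 1   # smallest L whose CER exceeds the bound
    W = V[:, :L]                                 # Eq. (7): first L principal components
    Y = W.conj().T @ B                           # Eq. (8): L x N matrix of PCWs
    return u, W, Y, cer


def tuned_pinna_response(u, W, weights):
    """One column of Eq. (9): mean response plus a weighted sum of the PCs."""
    return u[:, 0] + W @ np.asarray(weights)


def custom_hrir(pinna, kemar_hrir):
    """Assumed assembly: tuned pinna samples followed by the KEMAR shoulder/torso tail."""
    pinna = np.asarray(pinna)
    return np.concatenate([pinna, np.asarray(kemar_hrir)[len(pinna):]])


# Stand-in data only: 45 random "pinna responses" of 10 samples each.
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 45))
u, W, Y, cer = pinna_pca(X)
X_hat = W @ Y + u                                # Eq. (9) for all columns at once

# Using a subject's own PCWs reproduces that subject's approximated response;
# a listener moving the GUI sliders would supply different weights instead.
pinna = tuned_pinna_response(u, W, Y[:, 0])
kemar = rng.standard_normal(200)                 # hypothetical delay-free KEMAR HRIR
hrir = custom_hrir(pinna, kemar)
```

Because the selected PCs carry more than 90% of the energy, the residual between `X` and `X_hat` holds less than 10% of the energy of the mean-subtracted data, which is the sense in which Eq. (9) returns a truncated version of the original responses.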
As mentioned in the previous section, the PCs obtained for the left and right ears turned out to be similar to each other at most elevation angles despite the interaural shape difference in pinna responses for some individuals in the CIPIC HRTF database. As a result, ear symmetry was assumed and customization was performed by tuning the PCWs on only one ear. The slider on each slide-bar on the GUI represents the PCW value for each PC. After an elevation angle at which customization is to be performed is entered, pushing the PCA button executes principal components analysis on the isolated pinna responses measured at the specified angle and computes the corresponding PCs. Then, each participant moves the slide-bars to adjust the PCW on each PC and, by pushing the PLAY button, listens to an input stimulus (100 Hz to 20 kHz) filtered by the newly created HRIR (marked as Custom HRIR in Fig. 10). This Custom HRIR is formed by aligning the pinna response obtained as a linear combination of the tuned PCs to the shoulder/torso response of the KEMAR HRIR measured at the same angle. The PLAY KEMAR button is for listening to the same input stimulus filtered by the KEMAR HRIR. Some listeners may find the vertical perceptions produced by the KEMAR HRIRs good enough, in which case they can tune the PCWs so that the resulting pinna response, shown as a solid line in the top-right panel of the GUI, takes a shape similar to that of the KEMAR's, shown as a dashed line in the same plot, or simply keep the KEMAR HRIR as their customized HRIR at each angle of concern. On the other hand, if the KEMAR HRIR performs poorly in producing the necessary vertical effects, then the tuning can continue until each participant is satisfied with the resulting vertical effect he or she perceives. In our study, all participants reported unsatisfactory vertical perceptions with the KEMAR HRIRs, so the tuning was performed at all target angles. Note that the headphone-pinna coupling effect for each subject was cancelled using the subject's own headphone-to-meatus-entrance transfer function for all the output stimuli produced in the above tuning experiment.

Fig. 10 A MATLAB™ GUI for pinna response customization based on tuning of PCWs (see text for details).

2.3 Individual HRTF Measurement

The individual HRTFs of the four subjects who participated in the above tuning experiment were measured at the elevation angles where the pinna customization took place. Subjects were seated in a chair coupled to a vertical hoop designed to hold an acoustic point source. Details on the measurement apparatus and method can be found in our previous work on modeling the HRTFs for nearby sources [18]. For correct headphone-presented simulation of free-field listening when evaluating these individual HRTFs on their localization capabilities, the headphone-pinna coupling effect was cancelled using the headphone-to-meatus-entrance transfer function measured on each subject according to the method suggested by Wightman and Kistler [22]. While a typical HRTF measurement for an individual is carried out by placing a probe tube in the ear canal at a position very close to the eardrum, this is obviously a very difficult task.
Møller, Sorensen, Hammershøi, and Jensen [23] demonstrated that HRTF measurements could also be made by measuring free-field and headphone responses at the entrance of a blocked ear canal. With their technique, however, a miniature microphone embedded in an earplug that can be fitted in each subject's ear canal is required. Instead of dealing with all the laborious procedures involved in the conventional measurement techniques, we adopted the blocked-meatus measurement technique using a B&K Binaural Microphone Type 4101 mounted inside each subject's pinna, as shown in Fig. 11, for the sake of convenience and efficiency. Although this stethoscope-like microphone set simplifies the overall measurement process considerably, it was difficult to bend the microphone arms so that the microphone tips could be fitted with precision at the ear canal entrance without touching the tragus. Anchoring them in exactly the same positions during measurement was another difficulty we faced. The microphone arms were taped to each subject's lower cheeks in an effort to anchor the microphone tips, and the subjects were instructed to refrain from making any noticeable movement during the experiment. However, as the evaluation results shown in the subsequent section suggest, we believe that our individual HRTFs contain some errors induced by imprecise positioning of the microphone tips.

Fig. 11 B&K Binaural Microphone Type 4101 (right) for measuring individual HRTFs mounted inside a subject's pinna at the entrance to the ear canal (left).

3. Subjective Evaluation Results

Subjective listening tests were carried out on all four subjects (ID: SK, HS, KB, and CH) to assess the performance of the three HRIR sets: customized HRIRs, individual HRIRs, and KEMAR HRIRs. In an attempt to prevent any possible learning acquired by the subject during the tuning process from affecting the overall evaluation result, the evaluation experiment was conducted several days after completion of tuning by all subjects.
The subjects listened to broadband stimuli filtered by HRIRs from each of the above three HRIR sets over the headphones and gave their perceived responses by typing into a GUI designed for the evaluation test. Each of the 9 elevation angles was simulated 10 times in random order, yielding a total of 90 stimuli to evaluate per HRIR set. The subjective evaluation results are shown in Figs. 12-15 for all 4 subjects. Evaluations of the KEMAR, individual, and customized HRIRs are displayed in the left, center, and right panels, respectively, in each figure. The horizontal axis denotes the actual source positions and the vertical axis denotes the perceived source positions in each panel. Note the response frequency scale drawn in a small box in the right panel of Fig. 15. The response frequency is represented by the size of the square, with the largest square indicating 10 redundant responses and the smallest square indicating 1 response per source location. The positive-sloped diagonal line in each panel indicates the perfect hearing condition in which the perceived source position corresponds exactly with the actual source position.

Fig. 12 Subjective evaluation result for subject SK on 3 HRIR sets: KEMAR, individual, and customized HRIRs (refer to text for detail).

Fig. 13 Subjective evaluation result for subject HS (refer to text for detail).

Fig. 14 Subjective evaluation result for subject KB (refer to text for detail).

Fig. 15 Subjective evaluation result for subject CH (refer to text for detail).

The following observations are based on the evaluation responses presented in Figs. 12-15. All subjects reported difficulties of varying degree in making correct judgments on the source elevation on most trials with the KEMAR HRIRs. Either front/back reversal was frequent (especially for subjects SK and CH), which is evident from the many off-diagonal responses in symmetric positions with respect to the diagonal, or localization performance was low (for all 4 subjects), judging by the large response spread about the diagonal. With individual HRIRs, front/back reversals were reduced for all subjects except for subject HS, who often perceived the frontal sources at -30° and 0° to be in the rear instead. Subject KB made quite a few errors in localizing the rear sources even with his own HRIRs, and the scattered responses produced by subject CH for sources at -30°, 0°, and 30° suggest that he too had difficulty in localizing the frontal sources near the horizontal plane. In general, however, it can be said that all subjects performed better with their own individual HRIRs than with the KEMAR HRIRs, judging by the tighter distribution of the responses around the diagonal.

Comparison of the response data made with customized HRIRs to those made with the KEMAR HRIRs reveals the following. Front/back reversals were reduced for all subjects with customized HRIRs except for subject HS, who made similar confusion errors for the sources on and below the horizontal plane as he did with his own HRIR set. The localization performance was enhanced for all subjects for most source positions, judging by the smaller spread about the diagonal. Although subject HS's localization performance with customized HRIRs was poor for sources near the horizontal plane, it was slightly improved for sources positioned at other elevation angles, i.e., from 0° to 150°. Subject KB made poor elevation judgments with customized HRIRs as the source shifted from 90° to 210° into the rear hemisphere, but it should be noted that his localization performance on rear sources was poor with all 3 HRIR sets.

Table 1 Localization errors s in Eq. (10) and front/back confusion counts computed by resolution of the responses shown in Figs. 12-15. The letters denote confusion clusters, i.e., C indicates the total confusions, B the backward confusions, and F the forward confusions.

Fig. 16 Illustration for resolving front/back confusions. The confusions are reflected about the vertical plane (horizontal dashed line) onto the correct hemisphere.
When computing error indices to account for the localization performance associated with a particular set of HRIRs, it has been common practice to treat front/back confusions and localization accuracy separately by resolving the confusions in order to avoid error inflation [24]. On the other hand, resolution of the confusions can be misleading if we assume the responses correctly reflect the subject's perception. However, since our primary goal was to compare the three HRIR sets in terms of localization performance, we too elected to resolve all apparent confusions and report the incidence of confusions associated with each set of HRIRs. If the angle between the actual source position and the perceived response is made smaller by reflecting the response about the vertical plane passing through the subject's ears as shown in Fig. 16, the response is entered in reflected form and the confusion count is increased by one. The localization error was then computed in the root mean square sense, including both the responses lying in the same hemisphere as the sources and the confusions in reflected form, by the following definition

$$ s = \left[ \frac{1}{90} \sum_{i=1}^{90} \left( x_i - \phi_{\mathrm{source}}(i) \right)^2 \right]^{1/2} \tag{10} $$

where $x_i$ is the perceived response for the $i$th stimulus corresponding to the actual source position $\phi_{\mathrm{source}}(i)$, and the number 90 is the total number of stimuli presented per HRIR set. Table 1 lists these RMS errors and the confusion counts organized per subject per HRIR set evaluated. The RMS errors are indicated by the numbers in the top row of each cell and the confusion counts follow in the bottom row in the form: s/no. of total confusions (no. of backward confusions + no. of forward confusions). From the error indices shown in Table 1 we can draw the following conclusions regarding the localization performance associated with each set of HRIRs.
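The confusion-resolution rule and the RMS error of Eq. (10) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' evaluation code. It assumes that angles are given in degrees in the median plane with 0° in front, 90° overhead, and 180° in the rear, so that reflection about the vertical plane through the ears maps an angle φ to 180° − φ. It also assumes that a "backward" confusion means a frontal source heard in the rear and a "forward" confusion the opposite. The function names `resolve_confusion` and `localization_error` are hypothetical.

```python
import math


def resolve_confusion(source_deg, response_deg):
    """Reflect a response about the vertical plane through the ears if that
    brings it closer to the source. Returns (resolved_response, kind), where
    kind is None, "B" (backward confusion) or "F" (forward confusion).
    The B/F labelling is an assumption made for this sketch."""
    reflected = 180.0 - response_deg
    if abs(reflected - source_deg) < abs(response_deg - source_deg):
        # A response in the rear (> 90 deg) that had to be reflected means a
        # frontal source was heard behind, i.e. a backward confusion.
        kind = "B" if response_deg > 90.0 else "F"
        return reflected, kind
    return response_deg, None


def localization_error(sources_deg, responses_deg):
    """RMS error over all stimuli after resolving confusions, plus
    (total, backward, forward) confusion counts."""
    counts = {"B": 0, "F": 0}
    squared_sum = 0.0
    for source, response in zip(sources_deg, responses_deg):
        resolved, kind = resolve_confusion(source, response)
        if kind is not None:
            counts[kind] += 1
        squared_sum += (resolved - source) ** 2
    rms = math.sqrt(squared_sum / len(sources_deg))
    return rms, counts["B"] + counts["F"], counts["B"], counts["F"]
```

With 90 stimuli per HRIR set, `localization_error` returns the quantity s of Eq. (10) together with the counts reported in the C (B + F) form. A response equidistant from the source in both forms (for example, any response to a source at 90°) is left unreflected and is not counted as a confusion.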
Comparison of the localization errors produced with the KEMAR HRIRs to those with the customized HRIRs reveals that the localization accuracy was improved considerably with the customized HRIRs for subjects KB and CH, whereas subjects SK and HS showed slightly better accuracy with the KEMAR HRIRs. This is evidently a direct result of resolving the confusions, because the unresolved responses in Figs. 12 and 13 suggest otherwise for subjects SK and HS. With the customized HRIRs, front/back confusions were reduced for all subjects, and subjects SK and CH in particular showed dramatic improvements, i.e., the confusion counts went from 29 to 9 for SK and from 43 to 6 for CH. By contrast, the localization performance with individual HRIRs was not satisfactory for every subject. Individual HRIRs are generally known to produce good localization results, but past studies such as the one by Wightman and Kistler [24] show that headphone simulation of free-field listening tends to produce more frequent front/back confusions and less well-defined source elevation than the free-field condition. With individual HRIRs, subjects HS and KB produced their best overall localization accuracy, and subject KB's front/back confusions were the fewest of all three HRIR cases. On the other hand, comparison of the performance indices for the customized and individual HRIRs indicates that subjects SK and CH showed better localization accuracy, and subjects HS and CH produced fewer confusions, with the customized HRIRs than with individual HRIRs. In short, with the customized HRIRs most subjects produced fewer confusions, and 2 out of 4 subjects (SK and CH) performed best in terms of both localization accuracy and front/back confusion.

Fig. 17 KEMAR (solid), individual (dashed), and customized (dotted) HRIRs (left) and the corresponding HRTFs (right) for subject SK.
Fig. 18 KEMAR (solid), individual (dashed), and customized (dotted) HRIRs (left) and the corresponding HRTFs (right) for subject CH.

The customized and individual HRIRs for subjects SK and CH, along with the KEMAR HRIRs and the corresponding HRTFs, which are direct Fourier transforms of the temporal responses, are depicted in Figs. 17 and 18 as examples. These plots immediately reveal that most spectral deviations among the HRTFs take place in the high-frequency region and that the differences between the KEMAR and customized HRTFs mostly occur in the region above 6 kHz, which is a direct consequence of the pinna response modification by tuning. It is also clear that even a small variation in the time response produces a substantial difference in the frequency response. In our study, we had hoped to find some similarity between the customized and individual responses in both the temporal and spectral shapes, because in theory both sets of responses should capture and reflect the individual pinna features better than the KEMAR HRIRs if the tuning worked well, as it did for these two subjects in particular. Unfortunately, as was expected during the measurement phase of our study and also from the analysis of the evaluation results, there was little or no similarity between the customized and individual HRTFs. The spectral notches and roll-offs that are known to be responsible for elevation perception barely coincide except in a few spectral regions, i.e., notches at 7 kHz at 210°, roll-offs at 10 kHz at 150°, notches at 16 kHz at 120°, and notches at 11.3 kHz at 90° for subject SK in Fig. 17.
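The sensitivity of the spectrum to small temporal changes can be illustrated with a toy calculation. The sketch below is not based on the measured or customized HRIRs. It assumes a hypothetical two-path response (a direct impulse plus one delayed reflection), an assumed sampling rate of 44.1 kHz, and a direct DFT written with the Python standard library only. All function names are illustrative.

```python
import cmath
import math

FS = 44100  # Hz; assumed sampling rate for this illustration


def magnitude_db(response, n_fft=240):
    """Magnitude spectrum in dB (bins 0..n_fft/2) of a real impulse
    response, computed with a direct DFT after zero-padding to n_fft."""
    padded = list(response) + [0.0] * (n_fft - len(response))
    spectrum = []
    for k in range(n_fft // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / n_fft)
                  for n, x in enumerate(padded))
        spectrum.append(20.0 * math.log10(max(abs(acc), 1e-12)))
    return spectrum


def first_notch_hz(response, n_fft=240):
    """Frequency of the first local minimum of the magnitude spectrum,
    or None if the spectrum has no interior local minimum."""
    mags = magnitude_db(response, n_fft)
    for k in range(1, len(mags) - 1):
        if mags[k] < mags[k - 1] and mags[k] < mags[k + 1]:
            return k * FS / n_fft
    return None


def toy_response(delay_samples, gain=0.8, length=32):
    """Hypothetical two-path response: unit impulse plus one reflection."""
    h = [0.0] * length
    h[0] = 1.0
    h[delay_samples] += gain
    return h
```

In this toy model, moving the reflection from a delay of 3 samples to 2 samples (a shift of about 23 µs at the assumed sampling rate) moves the first spectral notch from 7350 Hz to 11025 Hz, which is the kind of large spectral change from a small temporal change that the plots suggest.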
Although the localization performance of subjects SK and CH using their own HRIRs was passable considering the results of headphone simulation of the free-field condition achieved by others in the past, we believe that the individual HRIRs measured in this study contain errors, probably induced by imprecise positioning of the microphone tips at the ear canal entrance as mentioned earlier. As a result, we cannot confirm at this point whether the spectral features in the HRTFs obtained by the proposed customization method indeed represent each individual's pinna characteristics, even though they have been shown to improve the localization performance.

4. Discussion and Future Work

The proposed HRIR customization method, based on tuning of the basis functions obtained from decomposition of the pinna responses in the time domain by PCA, was shown to be effective in producing the necessary vertical effects while reducing front/back reversals. We confirmed this by a series of subjective listening tests. With the customized HRIRs in comparison to the KEMAR HRIRs, 2 out of 4 subjects showed explicit improvements with a noticeable decrease in front/back reversals, while the other 2 subjects demonstrated enhanced elevation perception to some degree. All subjects reported that the sources at 60°, 90°, and 120° in elevation angle were among the toughest to discriminate from one another for both individual and customized HRIRs and that they had to guess the source elevation on most trials with the KEMAR HRIRs. We also verified that similar vertical effects could be generated at other azimuthal directions simply by adding proper ITDs to the customized HRIRs developed using the proposed method. The localization performance in other sagittal planes along with detailed analysis will follow in a subsequent paper.
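As a rough illustration of adding an ITD to a median-plane HRIR, the sketch below delays the far-ear copy of a response by a whole number of samples. The ITD model is an assumption: it uses the well-known Woodworth spherical-head formula, ITD = (a/c)(θ + sin θ), which is not stated in this section, together with an assumed head radius of 8.75 cm, a sound speed of 343 m/s, and a 44.1 kHz sampling rate. The function names are hypothetical, and interaural level differences and fractional delays are ignored.

```python
import math


def woodworth_itd_seconds(azimuth_deg, head_radius_m=0.0875, c=343.0):
    """Woodworth spherical-head ITD for an azimuth between 0 and 90 degrees.
    The model and default values are assumptions made for this sketch."""
    theta = math.radians(azimuth_deg)
    return (head_radius_m / c) * (theta + math.sin(theta))


def apply_itd(hrir, azimuth_deg, fs=44100, head_radius_m=0.0875):
    """Return (left, right) responses built from one median-plane HRIR.
    Positive azimuth means the source is on the right, so the left ear is
    the far ear and receives the delayed copy."""
    itd = woodworth_itd_seconds(abs(azimuth_deg), head_radius_m)
    delay = round(itd * fs)  # whole-sample delay only
    near = list(hrir) + [0.0] * delay
    far = [0.0] * delay + list(hrir)
    return (far, near) if azimuth_deg > 0 else (near, far)
```

Because the head radius enters the formula directly, the same customized HRIR can be paired with a different delay for each listener's head size, in the spirit of the approach described above.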
Acknowledgments

This work was supported by the Korea Science and Engineering Foundation (KOSEF) through the National Research Laboratory Program (M J ) and the BK 21 Project (2006) of Republic of Korea.

References

[1] E.A.G. Shaw and R. Teranishi, "Sound pressure generated in an external-ear replica and real human ears by a nearby point source," J. Acoust. Soc. Am., vol.44, pp ,
[2] J. Blauert, "Sound localization in the median plane," Acoustica, vol.22, pp , 1969/1970.
[3] J. Hebrank and D. Wright, "Spectral cues used in the localization of sound sources on the median plane," J. Acoust. Soc. Am., vol.56, pp ,
[4] R.A. Butler and K. Belendiuk, "Spectral cues utilized in the localization of sound in the median sagittal plane," J. Acoust. Soc. Am., vol.61, pp ,
[5] P.J. Bloom, "Determination of monaural sensitivity changes due to the pinna by use of minimum-audible-field measurements in the lateral vertical plane," J. Acoust. Soc. Am., vol.61, pp ,
[6] E.A. Lopez-Poveda and R. Meddis, "A physical model of sound diffraction and reflections in the human concha," J. Acoust. Soc. Am., vol.100, pp ,
[7] E.H.A. Langendijk and A.W. Bronkhorst, "Contribution of spectral cues to human sound localization," J. Acoust. Soc. Am., vol.112, pp ,
[8] H.W. Gierlich, "The application of binaural technology," Applied Acoustics, vol.36, pp ,
[9] S. Shimada, M. Hayashi, and S. Hayashi, "A clustering method for sound localization transfer functions," J. Audio Eng. Soc., vol.42, pp ,
[10] V.R. Algazi, R.O. Duda, R.P. Morrison, and D.M. Thompson, "Structural composition and decomposition of HRTFs," Proc. WASPAA01, pp , New Paltz, NY,
[11] J.C. Middlebrooks, "Virtual localization improved by scaling nonindividualized external-ear transfer functions in frequency," J. Acoust. Soc. Am., vol.106, pp ,
[12] D.N. Zotkin, R. Duraiswami, and L.S. Davis, "Customizable auditory displays," Proc. Int. Conf. on Auditory Display (ICAD), pp , Kyoto, Japan,
[13] M. Morimoto and H. Aokata, "Localization cues of sound sources in the upper hemisphere," J. Acoust. Soc. Jpn. (E), vol.5, pp ,
[14] F.L. Wightman and D.J. Kistler, "The dominant role of low-frequency interaural time differences in sound localization," J. Acoust. Soc. Am., vol.91, pp ,
[15] M. Morimoto, M. Itoh, and K. Iida, "3-D sound image localization by interaural differences and the median plane HRTF," Proc. Int. Conf. on Auditory Display (ICAD), Kyoto, Japan, July
[16] A. Kulkarni, S.K. Isabelle, and H.S. Colburn, "Sensitivity of human subjects to head-related transfer-function phase spectra," J. Acoust. Soc. Am., vol.105, pp ,
[17] C.P. Brown and R.O. Duda, "A structural model for binaural sound synthesis," IEEE Trans. Speech Audio Process., vol.6, no.5, pp ,
[18] K. Shin and Y. Park, "Modeling of non-individualized head-related transfer functions for nearby sources," Proc. 9th Western Pacific Acoustics Conf. (WESPAC), pp , Seoul, Korea, June
[19] CIPIC HRTF Database Files, Release 1.1, August 2001, CIPIC Interface Laboratory, U.C. Davis, available from ucdavis.edu/
[20] J.E. Jackson, A User's Guide to Principal Components, pp.1-25, John Wiley & Sons,
[21] Y. Hiranaka and H. Yamasaki, "Envelope representation of pinna impulse responses relating to three-dimensional localization of sound sources," J. Acoust. Soc. Am., vol.73, pp ,
[22] F.L. Wightman and D.J. Kistler, "Headphone simulation of free-field listening. I: Stimulus synthesis," J. Acoust. Soc. Am., vol.85, pp ,
[23] H. Møller, M.F. Sorensen, D. Hammershøi, and C.B. Jensen, "Head-related transfer functions of human subjects," J. Audio Eng. Soc., vol.43, pp ,
[24] F.L. Wightman and D.J. Kistler, "Headphone simulation of free-field listening. II: Psychophysical validation," J. Acoust. Soc. Am., vol.85, pp ,

Ki Hoon Shin was born in Seoul, Korea in He received his B.S. and M.S. degrees in mechanical engineering from University of Rochester, NY, in 1996 and 1998, respectively. From 1998 to 2000, he was enrolled in a Ph.D. program in aerospace engineering at Georgia Tech, GA. Since 2001, he has been engaged in research on virtual audio synthesis for a Ph.D. in mechanical engineering at Korea Advanced Institute of Science and Technology (KAIST). He is now at the Digital Media R&D Center of Samsung Electronics developing audio algorithms for DTVs and home theatres.

Youngjin Park was born in Seoul, Korea in He received his B.S. and M.S. degrees in mechanical engineering from Seoul National University in 1980 and 1982, respectively, and the Ph.D. in mechanical engineering from University of Michigan, MI, in From 1987 to 1988, he worked as a research fellow at University of Michigan. He also worked as an assistant professor at NJIT, NJ, from 1988 to He joined the faculty of Korea Advanced Institute of Science and Technology (KAIST) in 1990, where he is a Professor of Mechanical Engineering. His research interests include general control theories, virtual audio synthesis, active control of noise and vibration, system identification, etc.


More information

BINAURAL RECORDING SYSTEM AND SOUND MAP OF MALAGA

BINAURAL RECORDING SYSTEM AND SOUND MAP OF MALAGA EUROPEAN SYMPOSIUM ON UNDERWATER BINAURAL RECORDING SYSTEM AND SOUND MAP OF MALAGA PACS: Rosas Pérez, Carmen; Luna Ramírez, Salvador Universidad de Málaga Campus de Teatinos, 29071 Málaga, España Tel:+34

More information

Tara J. Martin Boston University Hearing Research Center, 677 Beacon Street, Boston, Massachusetts 02215

Tara J. Martin Boston University Hearing Research Center, 677 Beacon Street, Boston, Massachusetts 02215 Localizing nearby sound sources in a classroom: Binaural room impulse responses a) Barbara G. Shinn-Cunningham b) Boston University Hearing Research Center and Departments of Cognitive and Neural Systems

More information

3D Sound Simulation over Headphones

3D Sound Simulation over Headphones Lorenzo Picinali (lorenzo@limsi.fr or lpicinali@dmu.ac.uk) Paris, 30 th September, 2008 Chapter for the Handbook of Research on Computational Art and Creative Informatics Chapter title: 3D Sound Simulation

More information

THE INTERACTION BETWEEN HEAD-TRACKER LATENCY, SOURCE DURATION, AND RESPONSE TIME IN THE LOCALIZATION OF VIRTUAL SOUND SOURCES

THE INTERACTION BETWEEN HEAD-TRACKER LATENCY, SOURCE DURATION, AND RESPONSE TIME IN THE LOCALIZATION OF VIRTUAL SOUND SOURCES THE INTERACTION BETWEEN HEAD-TRACKER LATENCY, SOURCE DURATION, AND RESPONSE TIME IN THE LOCALIZATION OF VIRTUAL SOUND SOURCES Douglas S. Brungart Brian D. Simpson Richard L. McKinley Air Force Research

More information

NEAR-FIELD VIRTUAL AUDIO DISPLAYS

NEAR-FIELD VIRTUAL AUDIO DISPLAYS NEAR-FIELD VIRTUAL AUDIO DISPLAYS Douglas S. Brungart Human Effectiveness Directorate Air Force Research Laboratory Wright-Patterson AFB, Ohio Abstract Although virtual audio displays are capable of realistically

More information

The relation between perceived apparent source width and interaural cross-correlation in sound reproduction spaces with low reverberation

The relation between perceived apparent source width and interaural cross-correlation in sound reproduction spaces with low reverberation Downloaded from orbit.dtu.dk on: Feb 05, 2018 The relation between perceived apparent source width and interaural cross-correlation in sound reproduction spaces with low reverberation Käsbach, Johannes;

More information

Tone-in-noise detection: Observed discrepancies in spectral integration. Nicolas Le Goff a) Technische Universiteit Eindhoven, P.O.

Tone-in-noise detection: Observed discrepancies in spectral integration. Nicolas Le Goff a) Technische Universiteit Eindhoven, P.O. Tone-in-noise detection: Observed discrepancies in spectral integration Nicolas Le Goff a) Technische Universiteit Eindhoven, P.O. Box 513, NL-5600 MB Eindhoven, The Netherlands Armin Kohlrausch b) and

More information

Perceptual effects of visual images on out-of-head localization of sounds produced by binaural recording and reproduction.

Perceptual effects of visual images on out-of-head localization of sounds produced by binaural recording and reproduction. Perceptual effects of visual images on out-of-head localization of sounds produced by binaural recording and reproduction Eiichi Miyasaka 1 1 Introduction Large-screen HDTV sets with the screen sizes over

More information

Paper Body Vibration Effects on Perceived Reality with Multi-modal Contents

Paper Body Vibration Effects on Perceived Reality with Multi-modal Contents ITE Trans. on MTA Vol. 2, No. 1, pp. 46-5 (214) Copyright 214 by ITE Transactions on Media Technology and Applications (MTA) Paper Body Vibration Effects on Perceived Reality with Multi-modal Contents

More information

University of Huddersfield Repository

University of Huddersfield Repository University of Huddersfield Repository Moore, David J. and Wakefield, Jonathan P. Surround Sound for Large Audiences: What are the Problems? Original Citation Moore, David J. and Wakefield, Jonathan P.

More information

THE DEVELOPMENT OF A DESIGN TOOL FOR 5-SPEAKER SURROUND SOUND DECODERS

THE DEVELOPMENT OF A DESIGN TOOL FOR 5-SPEAKER SURROUND SOUND DECODERS THE DEVELOPMENT OF A DESIGN TOOL FOR 5-SPEAKER SURROUND SOUND DECODERS by John David Moore A thesis submitted to the University of Huddersfield in partial fulfilment of the requirements for the degree

More information

ROOM AND CONCERT HALL ACOUSTICS MEASUREMENTS USING ARRAYS OF CAMERAS AND MICROPHONES

ROOM AND CONCERT HALL ACOUSTICS MEASUREMENTS USING ARRAYS OF CAMERAS AND MICROPHONES ROOM AND CONCERT HALL ACOUSTICS The perception of sound by human listeners in a listening space, such as a room or a concert hall is a complicated function of the type of source sound (speech, oration,

More information

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 VIRTUAL AUDIO REPRODUCED IN A HEADREST

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 VIRTUAL AUDIO REPRODUCED IN A HEADREST 19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 VIRTUAL AUDIO REPRODUCED IN A HEADREST PACS: 43.25.Lj M.Jones, S.J.Elliott, T.Takeuchi, J.Beer Institute of Sound and Vibration Research;

More information

HRTF measurement on KEMAR manikin

HRTF measurement on KEMAR manikin Proceedings of ACOUSTICS 29 23 25 November 29, Adelaide, Australia HRTF measurement on KEMAR manikin Mengqiu Zhang, Wen Zhang, Rodney A. Kennedy, and Thushara D. Abhayapala ABSTRACT Applied Signal Processing

More information

Audio Engineering Society. Convention Paper. Presented at the 129th Convention 2010 November 4 7 San Francisco, CA, USA. Why Ambisonics Does Work

Audio Engineering Society. Convention Paper. Presented at the 129th Convention 2010 November 4 7 San Francisco, CA, USA. Why Ambisonics Does Work Audio Engineering Society Convention Paper Presented at the 129th Convention 2010 November 4 7 San Francisco, CA, USA The papers at this Convention have been selected on the basis of a submitted abstract

More information

Intensity Discrimination and Binaural Interaction

Intensity Discrimination and Binaural Interaction Technical University of Denmark Intensity Discrimination and Binaural Interaction 2 nd semester project DTU Electrical Engineering Acoustic Technology Spring semester 2008 Group 5 Troels Schmidt Lindgreen

More information

ANALYSIS AND EVALUATION OF IRREGULARITY IN PITCH VIBRATO FOR STRING-INSTRUMENT TONES

ANALYSIS AND EVALUATION OF IRREGULARITY IN PITCH VIBRATO FOR STRING-INSTRUMENT TONES Abstract ANALYSIS AND EVALUATION OF IRREGULARITY IN PITCH VIBRATO FOR STRING-INSTRUMENT TONES William L. Martens Faculty of Architecture, Design and Planning University of Sydney, Sydney NSW 2006, Australia

More information

Interference in stimuli employed to assess masking by substitution. Bernt Christian Skottun. Ullevaalsalleen 4C Oslo. Norway

Interference in stimuli employed to assess masking by substitution. Bernt Christian Skottun. Ullevaalsalleen 4C Oslo. Norway Interference in stimuli employed to assess masking by substitution Bernt Christian Skottun Ullevaalsalleen 4C 0852 Oslo Norway Short heading: Interference ABSTRACT Enns and Di Lollo (1997, Psychological

More information

AN AUDITORILY MOTIVATED ANALYSIS METHOD FOR ROOM IMPULSE RESPONSES

AN AUDITORILY MOTIVATED ANALYSIS METHOD FOR ROOM IMPULSE RESPONSES Proceedings of the COST G-6 Conference on Digital Audio Effects (DAFX-), Verona, Italy, December 7-9,2 AN AUDITORILY MOTIVATED ANALYSIS METHOD FOR ROOM IMPULSE RESPONSES Tapio Lokki Telecommunications

More information

Aalborg Universitet. Binaural Technique Hammershøi, Dorte; Møller, Henrik. Published in: Communication Acoustics. Publication date: 2005

Aalborg Universitet. Binaural Technique Hammershøi, Dorte; Møller, Henrik. Published in: Communication Acoustics. Publication date: 2005 Aalborg Universitet Binaural Technique Hammershøi, Dorte; Møller, Henrik Published in: Communication Acoustics Publication date: 25 Link to publication from Aalborg University Citation for published version

More information

Simulation of wave field synthesis

Simulation of wave field synthesis Simulation of wave field synthesis F. Völk, J. Konradl and H. Fastl AG Technische Akustik, MMK, TU München, Arcisstr. 21, 80333 München, Germany florian.voelk@mytum.de 1165 Wave field synthesis utilizes

More information

Convention e-brief 433

Convention e-brief 433 Audio Engineering Society Convention e-brief 433 Presented at the 144 th Convention 2018 May 23 26, Milan, Italy This Engineering Brief was selected on the basis of a submitted synopsis. The author is

More information

Validation of lateral fraction results in room acoustic measurements

Validation of lateral fraction results in room acoustic measurements Validation of lateral fraction results in room acoustic measurements Daniel PROTHEROE 1 ; Christopher DAY 2 1, 2 Marshall Day Acoustics, New Zealand ABSTRACT The early lateral energy fraction (LF) is one

More information

Proceedings of Meetings on Acoustics

Proceedings of Meetings on Acoustics Proceedings of Meetings on Acoustics Volume 19, 2013 http://acousticalsociety.org/ ICA 2013 Montreal Montreal, Canada 2-7 June 2013 Architectural Acoustics Session 1pAAa: Advanced Analysis of Room Acoustics:

More information

A binaural auditory model and applications to spatial sound evaluation

A binaural auditory model and applications to spatial sound evaluation A binaural auditory model and applications to spatial sound evaluation Ma r k o Ta k a n e n 1, Ga ë ta n Lo r h o 2, a n d Mat t i Ka r ja l a i n e n 1 1 Helsinki University of Technology, Dept. of Signal

More information

Aalborg Universitet. Audibility of time switching in dynamic binaural synthesis Hoffmann, Pablo Francisco F.; Møller, Henrik

Aalborg Universitet. Audibility of time switching in dynamic binaural synthesis Hoffmann, Pablo Francisco F.; Møller, Henrik Aalborg Universitet Audibility of time switching in dynamic binaural synthesis Hoffmann, Pablo Francisco F.; Møller, Henrik Published in: Journal of the Audio Engineering Society Publication date: 2005

More information

Distortion products and the perceived pitch of harmonic complex tones

Distortion products and the perceived pitch of harmonic complex tones Distortion products and the perceived pitch of harmonic complex tones D. Pressnitzer and R.D. Patterson Centre for the Neural Basis of Hearing, Dept. of Physiology, Downing street, Cambridge CB2 3EG, U.K.

More information

WAVELET-BASED SPECTRAL SMOOTHING FOR HEAD-RELATED TRANSFER FUNCTION FILTER DESIGN

WAVELET-BASED SPECTRAL SMOOTHING FOR HEAD-RELATED TRANSFER FUNCTION FILTER DESIGN WAVELET-BASE SPECTRAL SMOOTHING FOR HEA-RELATE TRANSFER FUNCTION FILTER ESIGN HUSEYIN HACIHABIBOGLU, BANU GUNEL, AN FIONN MURTAGH Sonic Arts Research Centre (SARC), Queen s University Belfast, Belfast,

More information

TDE-ILD-HRTF-Based 2D Whole-Plane Sound Source Localization Using Only Two Microphones and Source Counting

TDE-ILD-HRTF-Based 2D Whole-Plane Sound Source Localization Using Only Two Microphones and Source Counting TDE-ILD-HRTF-Based 2D Whole-Plane Sound Source Localization Using Only Two Microphones Source Counting Ali Pourmohammad, Member, IACSIT Seyed Mohammad Ahadi Abstract In outdoor cases, TDOA-based methods

More information

Improving room acoustics at low frequencies with multiple loudspeakers and time based room correction

Improving room acoustics at low frequencies with multiple loudspeakers and time based room correction Improving room acoustics at low frequencies with multiple loudspeakers and time based room correction S.B. Nielsen a and A. Celestinos b a Aalborg University, Fredrik Bajers Vej 7 B, 9220 Aalborg Ø, Denmark

More information

On the Estimation of Interleaved Pulse Train Phases

On the Estimation of Interleaved Pulse Train Phases 3420 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 48, NO. 12, DECEMBER 2000 On the Estimation of Interleaved Pulse Train Phases Tanya L. Conroy and John B. Moore, Fellow, IEEE Abstract Some signals are

More information

Proceedings of Meetings on Acoustics

Proceedings of Meetings on Acoustics Proceedings of Meetings on Acoustics Volume 19, 2013 http://acousticalsociety.org/ ICA 2013 Montreal Montreal, Canada 2-7 June 2013 Engineering Acoustics Session 2pEAb: Controlling Sound Quality 2pEAb10.

More information

URBANA-CHAMPAIGN. CS 498PS Audio Computing Lab. 3D and Virtual Sound. Paris Smaragdis. paris.cs.illinois.

URBANA-CHAMPAIGN. CS 498PS Audio Computing Lab. 3D and Virtual Sound. Paris Smaragdis. paris.cs.illinois. UNIVERSITY ILLINOIS @ URBANA-CHAMPAIGN OF CS 498PS Audio Computing Lab 3D and Virtual Sound Paris Smaragdis paris@illinois.edu paris.cs.illinois.edu Overview Human perception of sound and space ITD, IID,

More information

Evaluating HRTF Similarity through Subjective Assessments: Factors that can Affect Judgment

Evaluating HRTF Similarity through Subjective Assessments: Factors that can Affect Judgment Evaluating HRTF Similarity through Subjective Assessments: Factors that can Affect Judgment Areti Andreopoulou Audio Acoustics Group, LIMSI - CNRS andreopoulou@limsi.fr Agnieszka Roginska Music and Audio

More information

Citation for published version (APA): Nutma, T. A. (2010). Kac-Moody Symmetries and Gauged Supergravity Groningen: s.n.

Citation for published version (APA): Nutma, T. A. (2010). Kac-Moody Symmetries and Gauged Supergravity Groningen: s.n. University of Groningen Kac-Moody Symmetries and Gauged Supergravity Nutma, Teake IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please

More information

Capturing 360 Audio Using an Equal Segment Microphone Array (ESMA)

Capturing 360 Audio Using an Equal Segment Microphone Array (ESMA) H. Lee, Capturing 360 Audio Using an Equal Segment Microphone Array (ESMA), J. Audio Eng. Soc., vol. 67, no. 1/2, pp. 13 26, (2019 January/February.). DOI: https://doi.org/10.17743/jaes.2018.0068 Capturing

More information

Speech Compression. Application Scenarios

Speech Compression. Application Scenarios Speech Compression Application Scenarios Multimedia application Live conversation? Real-time network? Video telephony/conference Yes Yes Business conference with data sharing Yes Yes Distance learning

More information

BEAT DETECTION BY DYNAMIC PROGRAMMING. Racquel Ivy Awuor

BEAT DETECTION BY DYNAMIC PROGRAMMING. Racquel Ivy Awuor BEAT DETECTION BY DYNAMIC PROGRAMMING Racquel Ivy Awuor University of Rochester Department of Electrical and Computer Engineering Rochester, NY 14627 rawuor@ur.rochester.edu ABSTRACT A beat is a salient

More information

ENHANCED PRECISION IN SOURCE LOCALIZATION BY USING 3D-INTENSITY ARRAY MODULE

ENHANCED PRECISION IN SOURCE LOCALIZATION BY USING 3D-INTENSITY ARRAY MODULE BeBeC-2016-D11 ENHANCED PRECISION IN SOURCE LOCALIZATION BY USING 3D-INTENSITY ARRAY MODULE 1 Jung-Han Woo, In-Jee Jung, and Jeong-Guon Ih 1 Center for Noise and Vibration Control (NoViC), Department of

More information

Binaural Speaker Recognition for Humanoid Robots

Binaural Speaker Recognition for Humanoid Robots Binaural Speaker Recognition for Humanoid Robots Karim Youssef, Sylvain Argentieri and Jean-Luc Zarader Université Pierre et Marie Curie Institut des Systèmes Intelligents et de Robotique, CNRS UMR 7222

More information

SOPA version 3. SOPA project. July 22, Principle Introduction Direction of propagation Speed of propagation...

SOPA version 3. SOPA project. July 22, Principle Introduction Direction of propagation Speed of propagation... SOPA version 3 SOPA project July 22, 2015 Contents 1 Principle 2 1.1 Introduction............................ 2 1.2 Direction of propagation..................... 3 1.3 Speed of propagation.......................

More information