Robotic Spatial Sound Localization and Its 3-D Sound Human Interface

Jie Huang, Katsunori Kume, Akira Saji, Masahiro Nishihashi, Teppei Watanabe and William L. Martens
The University of Aizu, Aizu-Wakamatsu, 965-8580, Japan

Abstract

In this paper, we describe a robotic spatial sound localization system that uses an auditory interface with four microphones arranged on the surface of a spherical robot head. The time differences and intensity differences from a sound source to the different microphones are analyzed by measuring the HRTFs around the spherical head in an anechoic chamber. It was found that while the time difference can be approximated by a simple equation, the intensity difference varies in a more complicated way with azimuth, elevation and frequency. A sound localization method based on time differences was proposed and tested by experiments. A sound interface for human listeners was also constructed from four loudspeakers arranged similarly to the microphone set. This interface can serve as a 3-D sound human interface by passing the four-channel audio signals from the microphone set directly to the four loudspeakers. It can also be used to create 3-D sound at an arbitrary spatial position, determined either by a virtual sound image or by the sound localization system.

1 Introduction

Mobile robot technology is an emerging field with wide applications. For example, a mobile robot can serve as a guard robot that detects suspicious objects by audition and vision. The robot can also be used as an Internet-connected agent robot by which the user can explore a new place without being there. The robot can even attend a meeting instead of its users, so that the users receive the remote auditory and visual scenes of the meeting room. For these purposes, the robot must be capable of handling multimedia resources, including sound media, to complement vision [6, 12].

Visual sensors are the most popular sensors used today for mobile robots. However, since a robot generally looks at the external world through a camera, difficulties occur when an object is outside the visual field of the camera or when the lighting is poor. A robot also cannot visually detect a non-visual event, which in many cases is accompanied by sound emissions. In these situations, the most useful information is provided by audition.

Audition is one of the most important senses used by humans and animals to recognize their environments. Although the spatial resolution of audition is relatively low compared with that of vision, the auditory system can complement and cooperate with vision systems. For example, sound localization can enable the robot to direct its camera toward a sound source. The auditory system of a mobile robot can also be used in a teleconference system to guide the camera to pick up the faces of speakers automatically [6, 8, 7]. In this paper, we focus on the techniques of spatial sound localization and its 3-D sound human interface [1, 3, 5, 9, 10].

2 A Multimodal Telerobot and Its Auditory Interface

A multimodal mobile telerobot named HERO (an abbreviation of HEaring RObot) has been developed (Figure 1) [6, 4, 11]. HERO is equipped with auditory sensors and a vision sensor, along with infrared and tactile sensors for obstacle avoidance. The auditory system of HERO has some properties similar to those of the human auditory system, with the aim of incorporating appropriate features of human audition according to the engineering needs of efficiency and accuracy; the purpose is not, however, to simulate the human auditory system itself.
The microphones are arranged on the surface of a spherical head (three on the side and one on the top; see Figure 1) with a radius of 15 cm, about 1.5 times that of a human head. Spatial cues including time difference and intensity difference cues are used for spatial sound localization [2]. A sphere-shaped head simplifies the formulation of the time difference calculation. By using the top-mounted microphone, we can localize the elevation of sound sources from the time difference and/or intensity difference cues alone, without relying on the relatively uncertain spectral difference cue.

As shown in Figure 1, a four-channel sound interface for 3-D sound reproduction and reconstruction was constructed.

Figure 1. The 4-to-4 channel 3-D sound interface: spatial sound processing on the telerobot, four-channel transmission (or one-channel transmission with 3-D sound reconstruction), and the four-speaker 3-D sound interface.

With this 3-D sound interface, the four-channel audio from the robot auditory interface can be reproduced directly, without any additional processing. Compared with a binaural 3-D sound interface, which usually uses headphones, the four-speaker 3-D sound interface can create a wide 3-D sound field that accommodates a larger audience. The use of the four-speaker 3-D sound interface is not limited to reproducing the 3-D sound from the HERO robot; it can also be used as an interface for creating virtual 3-D sound. In some cases, e.g. for bandwidth compression, we can send only one audio channel instead of four and then reconstruct the four-channel 3-D sound after the signals are received.

3 Arrival Time Difference and Intensity Difference

3.1 Theoretical calculation of arrival time differences

The arrival time of a sound depends on the shortest path from the source position to the receiving microphone. Consider the case where a microphone (2, 3 or 4) and the sound source lie in the same horizontal plane as the spherical center (the elevation of the sound source is 0). The sound path length can be calculated as the length of the line (curve) S-P-M shown in Figure 2:

    d = SM                 (θ ≤ θ_p)
    d = SP + arc(PM)       (θ > θ_p)        (1)

where SP is the tangent line to the sphere and P is the contact point. The angle θ_p can be obtained as

    θ_p = cos⁻¹(r / D)        (2)

where r is the radius of the sphere and D is the distance of the sound source.

Figure 2. The shortest path from sound source S to microphone M
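The short sketch below illustrates Eqs. (1) and (2). It is a minimal reconstruction under stated assumptions, not code from the paper: the function names are ours, the speed of sound c = 343 m/s is an assumed constant, and the second function anticipates the nonzero-elevation case treated next in the text (Eq. (3)) by the equivalent device of using the central angle between the source and microphone directions.

```python
import numpy as np

C = 343.0  # assumed speed of sound in air (m/s); not specified in the paper

def arrival_time(angle, r=0.15, d=1.0, c=C):
    """Travel time from a source at distance d to a microphone on a sphere
    of radius r, where angle is the central angle (radians) between the
    source and microphone directions; implements Eqs. (1)-(2)."""
    theta_p = np.arccos(r / d)                   # Eq. (2): tangent contact angle
    if abs(angle) <= theta_p:                    # unoccluded: straight line S-M
        path = np.sqrt(d**2 + r**2 - 2.0 * d * r * np.cos(angle))
    else:                                        # occluded: tangent S-P plus arc P-M
        path = np.sqrt(d**2 - r**2) + r * (abs(angle) - theta_p)
    return path / c

def arrival_time_3d(azimuth, elevation, mic_dir, r=0.15, d=1.0, c=C):
    """Nonzero elevation: rather than rotating the coordinates as in the
    Eq. (3) derivation, take the central angle between the source direction
    and the unit microphone direction, which yields the same in-plane
    geometry."""
    src = np.array([np.cos(azimuth) * np.cos(elevation),
                    np.sin(azimuth) * np.cos(elevation),
                    np.sin(elevation)])
    angle = np.arccos(np.clip(np.dot(src, mic_dir), -1.0, 1.0))
    return arrival_time(angle, r=r, d=d, c=c)
```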

The arrival time from the sound source to microphone 1 (on the top) can be calculated by substituting θ with 90° − φ. When the elevation of the sound source is not zero, we rotate the coordinates around the x axis so that the elevation becomes zero. Let the spherical center be the origin of the coordinates. The sound source S is then at (D cos θ cos φ, D sin θ cos φ, D sin φ), where θ is the azimuth and φ is the elevation of the sound source. The rotation angle ψ satisfies

    tan ψ = D sin φ / (D sin θ cos φ) = tan φ / sin θ.        (3)

After the rotation, the arrival time can be calculated in the same way as in the zero-elevation case. Figure 3 shows the calculated arrival time differences.

Figure 3. Calculated arrival time from each direction for microphone 2 (minus the arrival time from the sound source to point Q), as a function of azimuth and elevation.

3.2 Comparison with the measured arrival time differences

The theoretical arrival time differences were compared with HRTF (Head Related Transfer Function) data measured for azimuths from 0 to 180 degrees and elevations from 0 to 90 degrees in 5-degree steps. Two methods were used to calculate the arrival time differences from the HRTF data.

The first method uses the cross-correlation between the HRTFs of different microphones. Cross-correlation is a widely used method for time difference calculation. Figure 4 displays the obtained arrival time differences together with the theoretical arrival time differences (the smooth curved surface). The time differences calculated from the measured HRTFs by the cross-correlation method match the theoretical values in general. However, there are several places with a large gap between the measured data and the theoretical values: around azimuths of 0 and 240 degrees, where the sound source is opposite microphone 2 or microphone 4. This is considered to be an influence of the surface of the sphere.

Figure 4. Arrival time difference between microphones 4 and 2.

Although the cross-correlation method is simple and useful, in many cases we need to calculate time differences for individual frequency components. The second method therefore uses the phase differences between the Fourier-transformed HRTF data. Since the phase difference data lie within the range (−π ≤ Δφ ≤ π), phase wrapping occurs, as shown in the upper plot of Figure 5. The wrapping can, however, be removed by the following phase shift:

    Δφ' = Δφ + 2π    (Δφ < −π)
    Δφ' = Δφ         (−π ≤ Δφ ≤ π)
    Δφ' = Δφ − 2π    (Δφ > π)        (4)

Let T be the period and f the frequency of the component; the time difference Δt can then be expressed as

    Δt = (Δφ / 2π) T = Δφ / (2πf).        (5)

Figure 6 shows the arrival time differences calculated from the phase differences of the 500 Hz component. The results are very close to the theoretical values. Similar results are obtained for other frequency components when the frequency is below about 13 kHz.
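As an illustration of Eqs. (4) and (5), the sketch below estimates a per-frequency time difference from two measured impulse responses. It is our reconstruction, not the authors' code: the default sampling rate (44.1 kHz, the rate used later in the paper), the FFT length, and the function name are assumptions.

```python
import numpy as np

def time_difference_from_phase(h1, h2, freq, fs=44100.0, n_fft=None):
    """Arrival time difference at one frequency, following Eqs. (4)-(5).
    h1, h2: measured impulse responses (1-D arrays); freq: frequency in Hz."""
    n_fft = n_fft or max(len(h1), len(h2))
    k = int(round(freq * n_fft / fs))              # FFT bin closest to freq
    dphi = (np.angle(np.fft.rfft(h1, n_fft)[k]) -
            np.angle(np.fft.rfft(h2, n_fft)[k]))
    # Eq. (4): remove phase wrapping by shifting back into [-pi, pi)
    dphi = (dphi + np.pi) % (2.0 * np.pi) - np.pi
    # Eq. (5): convert the phase difference to a time difference
    return dphi / (2.0 * np.pi * freq)
```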

Figure 5. Phase differences from the measured HRTF data (microphone 2, elevation 0, 5 kHz frequency component). The horizontal axis is the setup azimuth of the HRTF data. The upper plot shows the phase differences with phase wrapping; the lower plot shows them with the wrapping removed.

Figure 6. Arrival time difference computed from the phase differences (500 Hz).

3.3 Theoretical calculation of the intensity differences

Compared with the time differences, the intensity differences for different azimuths, elevations and frequencies are very complicated. When a sound wave meets the spherical head, the head blocks its path and the wave is bent around the spherical surface. The sound then loses energy depending on the frequency, the path length and the radius of the sphere. Different waves travel around different sides of the sphere and finally rejoin at the position opposite the sound source. This phenomenon makes the sound intensity and phase complicated, as we saw in its influence on the cross-correlation in the previous section.

As a very rough approximation, we assume that the energy loss depends simply on the length of the path along the spherical surface, as shown in Figure 7. Here we ignore the focusing and scattering effects of the sphere and the phase differences between sound waves arriving by different paths.

Figure 7. Sound path length along the surface of the sphere.

3.4 Comparison with the measured intensity differences

The sound intensity differences relative to the intensity of sound from the front (the direction of 0-degree azimuth and 0-degree elevation) were calculated from the HRTF data for different azimuths and elevations:

    R_Intensity(θ, φ, f) = I(θ, φ, f) / I(0, 0, f).        (6)
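A sketch of the rough attenuation predictor of Figure 7 and of the intensity ratio of Eq. (6) follows. It is illustrative only; the storage layout (a dict mapping (azimuth, elevation) in degrees to a magnitude spectrum) and all names are our assumptions.

```python
import numpy as np

def surface_path_length(angle, r=0.15, d=1.0):
    """Length of the part of the sound path lying on the spherical surface
    (the arc P-M in Figure 2); zero when the microphone is unoccluded.
    This is the rough predictor of energy loss plotted in Figure 7."""
    theta_p = np.arccos(r / d)
    return r * max(0.0, abs(angle) - theta_p)

def intensity_ratio(hrtf, azimuth, elevation, bin_index):
    """Eq. (6): intensity relative to the frontal direction (0, 0) at one
    frequency bin. hrtf maps (azimuth, elevation) to a magnitude spectrum."""
    return (np.abs(hrtf[(azimuth, elevation)][bin_index]) ** 2 /
            np.abs(hrtf[(0, 0)][bin_index]) ** 2)
```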

Part of the results is shown in Figures 8 and 9 for frequencies of 500 Hz and 5 kHz, respectively. Compared with the theoretical values shown in Figure 7, they exhibit the same trend: the sound energy decreases as the waves travel along the spherical surface. However, since the model is only a very rough approximation, only these gross features can be expected to match.

Figure 8. Sound intensity difference (rate of attenuation, 500 Hz).

Figure 9. Sound intensity difference (rate of attenuation, 5 kHz).

4 Spatial Sound Localization: Azimuth and Elevation Identification

4.1 Sound localization method

As shown in the previous section, the sound arrival time differences can be approximated well by a theoretical equation. The intensity differences, however, are more complicated and difficult to approximate. Although intensity differences are useful and important for sound localization in the human auditory system, the auditory system of the HERO robot has four spatially arranged microphones that can be used for azimuth and elevation localization. Using only time difference cues simplifies the sound localization method compared with methods that also use intensity and spectral cues.

The arrival time differences are calculated from the sound waves at the different microphones by the cross-correlation method. Choosing different microphone pairs gives six arrival time differences in total:

    Δt_m = (Δt_12, Δt_13, Δt_14, Δt_23, Δt_24, Δt_34)        (7)

where the indexes denote the microphone numbers. These time differences are compared with the theoretical arrival time differences, which are calculated in advance and stored in a database. The distance between the theoretical arrival time differences and those of the input signals is

    e(θ, φ) = ‖Δt(θ, φ) − Δt_m‖        (8)

where Δt(θ, φ) is the vector of theoretical arrival time differences for the microphone pairs. The azimuth and elevation of the sound source are then estimated as the θ̂ and φ̂ that minimize e(θ, φ).

As shown in the previous section, the measurement errors of the arrival time differences become large when the sound source is behind the sphere from the viewpoint of the microphones. It is therefore better to choose microphone pairs on the side facing the sound source, i.e., the pairs with the smaller time differences. A minimal sketch of this matching procedure is given below.
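The sketch implements the matching of Eqs. (7) and (8) together with the pair-selection heuristic. It is our reconstruction, not the authors' code: the table layout (a dict keyed by (azimuth, elevation) in degrees, holding the six precomputed theoretical time differences, e.g. from the arrival-time sketch in Sec. 3.1) and all names are assumptions.

```python
import numpy as np
from itertools import combinations

PAIRS = list(combinations(range(4), 2))   # the six microphone pairs of Eq. (7)

def localize(dt_measured, table, n_pairs=3):
    """Grid-search localization following Eqs. (7)-(8).

    dt_measured: length-6 vector of cross-correlation time differences,
      ordered as PAIRS.
    table: dict mapping (azimuth, elevation) in degrees to the length-6
      vector of theoretical time differences.
    Only the n_pairs pairs with the smallest measured |dt| are used, since
    pairs shadowed by the sphere are unreliable (Sec. 4.1)."""
    dt_measured = np.asarray(dt_measured)
    use = np.argsort(np.abs(dt_measured))[:n_pairs]   # pairs facing the source
    best, best_err = None, np.inf
    for (az, el), dt_theory in table.items():
        err = np.linalg.norm(np.asarray(dt_theory)[use] - dt_measured[use])
        if err < best_err:                             # minimize Eq. (8)
            best, best_err = (az, el), err
    return best
```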

4.2 Experiments and results

Sound localization experiments were conducted in an ordinary room. Test sounds included a coin dropping, glass breaking and a piece of classical music. The sampling frequency was 44.1 kHz. The distance of the sound source was set to 1.0 m. Figures 10 and 11 show the results of sound localization using the 3 microphone pairs with the minimum time differences. The average localization errors using different numbers of microphone pairs are shown in Table 1. Choosing the three microphone pairs with the smallest arrival time differences achieved the best performance.

Table 1. Average localization error for different numbers of microphone pairs (degrees)

    number of microphone pairs    2     3     4     5     6
    elevation                    2.7   2.6   2.7   2.9   2.9
    azimuth                      2.6   1.5   2.4   2.1   2.1

Figure 10. Results of elevation identification using 3 microphone pairs.

Figure 11. Results of azimuth identification using 3 microphone pairs.

5 Evaluation Tests for the Four-channel 3-D Sound Interface

To evaluate the four-channel 3-D sound interface, psychological listening tests were conducted. Virtual sound images were created by convolving sound sources with the measured HRTF data (a minimal sketch of this convolution step appears below). Three subjects performed the listening tests. The sound stimuli were a radio news announcement, a piece of classical orchestral music and a glass-breaking sound. In the experiments, a multi-track HDD digital recorder (Fostex D824mk2), a 4-channel power amplifier (BOSE 12VI) and 4 speakers (YAMAHA NS-P21) were used. The sampling frequency was 44.1 kHz and the bit depth 16 bits. Listeners sat on a chair with three speakers at the same level as their ears and one speaker above the head. All speakers were placed 1 m from the listener. The experiments were conducted in an ordinary room in our lab. For each test, the sound stimuli were presented several times.

Figure 12. Elevation recognized by the listeners.

The results of the experiments are shown in Figure 12 and Figure 13. Elevation testing was performed for 0, 15, 30, 60 and 90 degrees, with the azimuth set to 0 degrees. Azimuth testing was performed for 0, 90 and 180 degrees, with the elevation set to a low (0 degrees), middle and high (60 degrees) position in all cases.
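The rendering step referred to above can be sketched as follows. This is an illustration under assumptions, not the authors' code: the function name is ours, the four head-related impulse responses are assumed to be stored as equal-length arrays for the desired direction, and SciPy's fftconvolve is used for the convolution.

```python
import numpy as np
from scipy.signal import fftconvolve

def render_virtual_source(mono, hrirs):
    """Four speaker feeds for one virtual source direction: convolve a mono
    signal with the four measured head-related impulse responses for that
    direction, one per loudspeaker channel.
    mono: 1-D array; hrirs: iterable of four equal-length 1-D arrays.
    Returns an array of shape (4, len(mono) + len(hrir) - 1)."""
    return np.stack([fftconvolve(mono, h) for h in hrirs])
```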

Figure 13. Azimuth recognized by the listeners.

The results show that the system can reproduce elevation as well as azimuth.

6 Conclusion

In this paper, we described a robotic spatial sound localization system using a set of four microphones arranged on the surface of a spherical robot head. The time differences and intensity differences were analyzed by measuring the HRTFs around the spherical head in an anechoic chamber. It was found that while the time difference can be approximated by a simple equation, the intensity difference varies in a more complicated way with azimuth, elevation and frequency. A sound localization method based on time differences was proposed and tested by experiments. A sound interface for human listeners was also constructed from four loudspeakers arranged similarly to the microphone set. This interface can serve as a 3-D sound human interface by passing the four-channel audio signals from the microphone set directly to the four loudspeakers. It can also be used to create 3-D sound at an arbitrary spatial position, determined either by a virtual sound image or by the sound localization system.

References

[1] D. R. Begault. 3-D Sound for Virtual Reality and Multimedia. AP Professional, Boston, 1994.
[2] J. Blauert. Spatial Hearing: The Psychophysics of Human Sound Localization. The MIT Press, London, revised edition, 1997.
[3] K. Brandenburg and M. Bosi. Overview of MPEG audio: Current and future standards for low-bit-rate audio coding. J. Audio Engineering Society, 45(1/2):4-21, 1997.
[4] M. Cohen. A design for integrating the internet chair and a telerobot. In Proc. Int. Conf. Information Society in the 21st Century, pages 276-280, Aizu-Wakamatsu, Nov. 2000. U. Aizu.
[5] M. Gerzon. Periphony: With-height sound reproduction. J. Audio Engineering Society, 21(1-2):2-10, 1973.
[6] J. Huang. Spatial sound processing for a hearing robot. In Q. Jin, editor, Enabling Society with Information Technology, pages 197-206. Springer-Verlag, 2001.
[7] J. Huang, N. Ohnishi, and N. Sugie. Building ears for robots: Sound localization and separation. Artificial Life and Robotics, 1(4):157-163, 1997.
[8] J. Huang, T. Supaongprapa, I. Terakura, F. Wang, N. Ohnishi, and N. Sugie. A model-based sound localization system and its application to robot navigation. Robotics and Autonomous Systems, 27(4):199-209, 1999.
[9] J. Huopaniemi. Virtual Acoustics and 3-D Sound in Multimedia Signal Processing. Ph.D. dissertation, Helsinki University of Technology, Dept. of Electrical and Communications Engineering, 1999.
[10] D. Malham and A. Myatt. 3-D sound spatialization using ambisonic techniques. Computer Music Journal, 19(4):58-70, 1995.
[11] W. L. Martens. Pseudophonic listening in reverberant environments: Implications for optimizing auditory display for the human user of a telerobotic listening system. In Proc. Int. Conf. Information Society in the 21st Century, pages 269-275, Aizu-Wakamatsu, Nov. 2000. U. Aizu.
[12] H. G. Okuno, K. Nakadai, T. Lourens, and H. Kitano. Sound and visual tracking for humanoid. In Proc. Int. Conf. Information Society in the 21st Century, pages 254-261, Aizu-Wakamatsu, Nov. 2000. U. Aizu.