3D sound in the telepresence project BEAMING
Olesen, Søren Krarup; Markovic, Milos; Madsen, Esben; Hoffmann, Pablo Francisco F.; Hammershøi, Dorte


Aalborg Universitet
Published in: Proceedings of BNAM2012
Publication date: 2012
Document Version: Early version, also known as pre-print

Citation for published version (APA):
Olesen, S. K., Markovic, M., Madsen, E., Hoffmann, P. F., & Hammershøi, D. (2012). 3D sound in the telepresence project BEAMING. In P. Juhl (Ed.), Proceedings of BNAM2012. Odense: Nordic Acoustic Association.

3D Sound in the Telepresence Project BEAMING

Søren Krarup Olesen, Aalborg University, Section of Acoustics, Fredrik Bajers Vej 7B, DK-9220 Aalborg Ø, Denmark, sko@es.aau.dk
Miloš Marković, Aalborg University, Section of Acoustics, Fredrik Bajers Vej 7B, DK-9220 Aalborg Ø, Denmark, mio@es.aau.dk
Esben Madsen, Aalborg University, Section of Acoustics, Fredrik Bajers Vej 7B, DK-9220 Aalborg Ø, Denmark, em@es.aau.dk
Pablo Faundez Hoffmann, Aalborg University, Section of Acoustics, Fredrik Bajers Vej 7B, DK-9220 Aalborg Ø, Denmark, pfh@es.aau.dk
Dorte Hammershøi, Aalborg University, Section of Acoustics, Fredrik Bajers Vej 7B, DK-9220 Aalborg Ø, Denmark, dh@es.aau.dk

The involvement of Aalborg University in the EU project BEAMING is presented. BEAMING deals with telepresence across multiple modalities, vision, haptics and audio, of which the latter is of main interest here. The setup consists of two types of locations: the Destination, where the Locals reside, and the Transporter, where the Visitor resides. The Visitor virtually visits the Locals through an Internet connection, and each party must feel that the other is physically present. All Locals wear close-up microphones, and their positions are tracked. To support presence for the Visitor, 3D audio is provided through headphones. It is rendered from the Locals' coordinates, obtained via a common Internet database, combined with local positional tracking so that information on the Visitor's head rotation reaches the renderer with minimum network delay. The BEAMING project currently addresses three applications: a general-purpose theatrical scene, a teaching situation and a medical patient-visiting-doctor scenario. The March 2012 project review dealt with the teaching situation. This involves a single-microphone recording followed by signal processing that reconstructs the spatial content. The Visitor is represented by a robot with a loudspeaker.

1 Introduction

Immersive experience is one of the primary goals of modern communication technologies: communication with the feeling of being together and sharing the same environment is required [1]. The four-year EU project BEAMING [2], which started in 2010 under Framework Programme 7, addresses the improvement of immersive communication interfaces through telepresence. BEAMING is a collaborative research project whose goal is to give people a real sense of physically being present in a remote location with other people. The Section of Acoustics at Aalborg University (AAU) is the only project partner dealing with the acoustical aspects. Close technical partners are University College London (UCL) and the Technical University of Munich (TUM), who deal with vision and robotics/haptics respectively. Scuola Superiore Sant'Anna (SSSA), Pisa, is jointly responsible for system integration, and Starlab, Barcelona, mainly holds the project-coordinating role.

BEAMING includes cross-modal research. This is accommodated across different work packages (WPs) in order to balance the needed basic research with the technical requirements associated with capturing and rendering the spaces where the users of the BEAMING system are physically located. For example, WP1 deals with capturing and rendering of the Destination, and WP2 deals with capturing and rendering of the Visitor. Since BEAMING investigates communication between different locations, all modality data plus various extra control data must be transferred through a network. The network can be a local area network (LAN) or the Internet; the

latter poses problems of synchronization, delays and even lost data. The project has not yet reached the state where cross-modal synchronization has been thoroughly investigated; hence AAU/audio (and others) has mainly focused on absolute delays and data loss.

1.1 Roles

The user of the BEAMING telepresence system can take one of three roles, being either 1) Visitor, 2) Local or 3) Spectator. There can be one or more of each, although normally there will be several Locals, typically forming a conference where they (the Locals) meet in person. The standard scenario is that the Locals (e.g. at a meeting) are visited by someone not physically present, i.e. the Visitor, and observed by someone else not physically present, i.e. the Spectator.

The Visitor must be able to interact with the Locals, so that they can see his gestures and facial expressions and hear his voice. For this to happen, the Visitor must have either a physical/mechanical or a holographic representation of himself at the location of the Locals (the Destination). Such a representation (the Avatar) can be, at its simplest, a computer screen or, more advanced, a robot, depending on the application (see 1.2). In the physical case, loudspeakers are mounted on the devices in order to support audio (see Section 2 for details). Presenting the Visitor to the Locals is handled by WP2.

The Locals must also be presented to the Visitor in a convincing way. Since the Visitors are not physically present at the Destination, they view the Locals through a head-mounted display (HMD) or in a CAVE. They wear gloves that provide haptic stimuli, plus headphones and a microphone for playback and capture of sound. The concept is that the Locals are primarily situated in a normal room wearing as little equipment as possible, while the Visitors are in less natural surroundings (the Transporter) wearing more complex and, so far, heavier equipment. Presenting the Locals to the Visitor is handled by WP1.

The Spectator can only view and hear the scene and is not able to interact. His equipment could be a smartphone or a PC.

1.2 Applications

BEAMING currently addresses three applications. The applications are meant for testing and are not demonstrators that must all work at the end of the project. They are: 1) a general-purpose theatrical scene, where a remote actor (Visitor) can practise a play together with another actor and a director, who are both Locals; 2) a medical scenario where the patient visits the doctor; for the sake of the patient, he does not wear an HMD, the focus instead being on haptic interaction, with the doctor simply displayed on a normal computer screen; 3) a teaching situation, exemplified during the March 2012 project review in Munich as a teacher (the Visitor) training a novice (a Local) in playing the xylophone. The teacher's Avatar is a robot. The following sections concentrate on application 3, because it entails specific acoustical challenges beyond those met in applications 1 and 2.

2 Audio in BEAMING

The perceived auditory realism of a telepresence system is important. It was decided to use binaural techniques [3,4] to provide spatial information to those users who need it. The binaural processing uses digital filters known as head-related transfer functions (HRTFs): sets of transfer functions that describe the change a sound from a given direction undergoes on its way to the two human ears under free-field conditions.

Ideally, the set of HRTFs should be measured for each individual who uses the system, but a database based on measurements of a dummy head is also applicable [5,6]. The HRTF database employed in this system is based on measurements with 2-degree angular resolution of the Valdemar dummy head developed at AAU [7]. The BEAMING users who need synthesized spatial audio (3D sound) are the Visitors and the Spectators. The Locals get spatial sound automatically, provided directly either by other Locals or by the Avatar(s) as these physically move about at the Destination. With only a few modifications, one audio setup can be created that covers all three BEAMING applications (see 1.2 and Figure 1) [8].
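
To make the binaural processing concrete, the sketch below filters a mono signal with one left/right pair of head-related impulse responses (HRIRs, the time-domain counterpart of HRTFs). It is a minimal illustration under assumed data structures, not the project's actual convolution software; the placeholder HRIRs merely stand in for entries of a measured database such as the Valdemar set [7].

```python
# Minimal sketch: binaural synthesis of one source by HRIR convolution.
# The HRIRs here are random placeholders, not real measured data.
import numpy as np
from scipy.signal import fftconvolve

def render_binaural(mono, hrir_left, hrir_right):
    """Filter a mono signal with the left/right HRIRs for one direction,
    producing a 2-channel signal for headphone playback."""
    left = fftconvolve(mono, hrir_left)
    right = fftconvolve(mono, hrir_right)
    return np.stack([left, right], axis=-1)

# Example: a 1 kHz tone through dummy 256-tap HRIRs at 48 kHz.
fs = 48000
t = np.arange(fs) / fs
mono = 0.1 * np.sin(2 * np.pi * 1000 * t)
hrir_l = 0.01 * np.random.randn(256)  # placeholder, left ear
hrir_r = 0.01 * np.random.randn(256)  # placeholder, right ear
stereo = render_binaural(mono, hrir_l, hrir_r)
```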

Figure 1: Top: capturing and rendering of the Destination (WP1). Bottom: capturing and rendering of the Visitor (WP2).

Figure 1 shows the setup demonstrated during the project review in Munich. It should be noted that the two figures must be viewed as a merger; for example, the Visitor wears both headphones and a microphone at the same time. For simplicity, we shall mainly discuss the elements related to audio.

The central entity is the Audio server. It is responsible for providing an Internet IP address to which the Visitor client can connect. A Locals client and an Avatar client can also connect, which happens locally if the Audio server is physically placed at the Destination. There is one Visitor client per Visitor, but several Locals must share a common Locals client, because a multi-channel sound card is installed in this PC and the number of computers is kept to a minimum.

The particular setup of Figure 1 is an example of the third application, where the second Local is the xylophone rather than a person. The Visitor is the teacher, who visits the student (Local) and gives instructions in playing. The Local wears a close-up microphone, and the xylophone is captured with a studio microphone on a stand; thus the teacher can hear both the student and the xylophone as 3D sound. The 3D sound is generated by the Visitor client, where the HRTF database and the convolution software reside.

During the review, the teacher (Visitor) was represented at the Destination by a Kali-type robot. It is capable of showing various facial expressions and moving its arms so that it can physically play the xylophone. A loudspeaker mounted below the robot's head enables the Visitor to speak via the robot; that is, the teacher can both show how to play and give instructions to the student. The chosen loudspeaker is a spherical closed box with a 2" unit connected to the sound card on the robot. The Visitor wears a close-up microphone.

Tracking is needed both for the Visitor and for the Locals. The tracking information, which contains the position and rotation of everyone involved, is continuously stored in a common Internet database known as the Beaming Scene. For audio it is mainly used by the Visitor client: by combining the positions of the Locals (including the xylophone), the position and rotation of the Avatar (robot), and the head rotation of the Visitor himself, the correct HRTF can be selected and the corresponding 3D sound produced for the Visitor.
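
The selection step itself reduces to geometry. The following sketch is a deliberate simplification: it assumes a horizontal-plane database at the 2-degree resolution mentioned in Section 2, a flat 2-D coordinate system, and invented argument names for what the Beaming Scene actually delivers.

```python
# Minimal sketch: choosing the HRTF index from tracked positions.
# Field names and the flat (horizontal-plane) geometry are assumptions.
import math

def hrtf_index(local_xy, avatar_xy, avatar_yaw_deg, visitor_yaw_deg,
               resolution_deg=2.0):
    """Index of the HRTF whose azimuth best matches the direction from
    the Avatar to a Local, compensated for the Avatar's body rotation
    and the Visitor's own head rotation."""
    dx = local_xy[0] - avatar_xy[0]
    dy = local_xy[1] - avatar_xy[1]
    world_az = math.degrees(math.atan2(dy, dx))   # direction in the room
    rel_az = (world_az - avatar_yaw_deg - visitor_yaw_deg) % 360.0
    return int(round(rel_az / resolution_deg)) % int(360 / resolution_deg)

# Example: the xylophone 2 m ahead and 1 m to the left of the robot,
# while the Visitor has turned the head 10 degrees.
idx = hrtf_index(local_xy=(2.0, 1.0), avatar_xy=(0.0, 0.0),
                 avatar_yaw_deg=0.0, visitor_yaw_deg=10.0)
```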

This means that currently no 3D/binaural sound needs to be passed through the network; all filtering by HRTFs is done locally at the Transporter side.

3 Discussion

A system for 3D sound in the telepresence project BEAMING has been developed. Its current state is a prototype that mainly needs finishing work, e.g. on the graphical user interface, the implementation of the Spectator role (see 1.1) and bug tracking. With two years left of the BEAMING project, the software and the whole audio concept require testing in terms of timing and delay, synchronization with the other modalities and potential data loss through the network connections, as well as experiments with human subjects.

One optimization was carried out in transferring sound from the xylophone (see Section 2). Since a xylophone is a distributed sound source, it cannot acoustically be represented by its signal filtered with a single HRTF (as is currently the case for the Locals). Instead, the xylophone was captured with one studio microphone. This signal was sent via the Audio server to the Visitor client, where the spatial, distributed nature of the xylophone was recreated and rendered using several HRTFs. The advantage is that only a single audio channel must be passed through the network. The algorithm is reported by Marković et al. [9].

References

[1] Y. Huang, J. Chen and J. Benesty, "Immersive audio schemes", IEEE Signal Processing Magazine, 28(1), 2011, 20-32.
[2] BEAMING Project, official website: http://beaming-eu.org/
[3] H. Møller, "Fundamentals of binaural technology", Applied Acoustics, 36(3/4), 1992, 171-218.
[4] J. Blauert, Spatial Hearing: The Psychophysics of Human Sound Localization, The MIT Press, 1983.
[5] H. Møller, M. F. Sørensen, C. B. Jensen and D. Hammershøi, "Binaural technique: Do we need individual recordings?", Journal of the Audio Engineering Society, 44(6), 1996, 451-469.
[6] D. R. Begault, E. M. Wenzel, A. S. Lee and M. R. Anderson, "Direct comparison of the impact of head tracking, reverberation, and individualized head-related transfer functions on the spatial perception of a virtual speech source", Proceedings of the 108th Audio Engineering Society Convention, Paris, 2000, 1-19.
[7] B. P. Bovbjerg, F. Christensen, P. Minnaar and X. Chen, "Measuring the head-related transfer functions of an artificial head with a high directional resolution", Proceedings of the 109th Audio Engineering Society Convention, Los Angeles, 2000, 1-17.
[8] E. Madsen, S. K. Olesen, M. Marković, P. F. Hoffmann and D. Hammershøi, "Setup for demonstrating interactive binaural synthesis for telepresence applications", Proceedings of Forum Acusticum 2011, Aalborg, 1281-1286.
[9] M. Marković, E. Madsen, S. K. Olesen, P. F. Hoffmann and D. Hammershøi, "BEAMING teaching application: recording techniques for spatial xylophone sound rendering", to be presented at Acoustics 2012, Hong Kong.
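
To illustrate the single-channel idea described in Section 3, the sketch below spreads one captured xylophone channel over several HRIR pairs spanning the instrument's width and sums the binaural results. It is only a schematic of the approach, not the actual rendering algorithm, which is the one reported by Marković et al. [9]; the HRIRs and weights are placeholders.

```python
# Minimal sketch: rendering a distributed source (the xylophone) from a
# single captured channel via several HRIR pairs. HRIRs are placeholders.
import numpy as np
from scipy.signal import fftconvolve

def render_distributed(mono, hrir_pairs, gains=None):
    """hrir_pairs: list of (hrir_left, hrir_right) arrays for directions
    spread across the source; gains: optional per-direction weights."""
    n = len(hrir_pairs)
    if gains is None:
        gains = [1.0 / n] * n  # equal weighting across directions
    length = len(mono) + max(max(len(hl), len(hr))
                             for hl, hr in hrir_pairs) - 1
    out = np.zeros((length, 2))
    for g, (hl, hr) in zip(gains, hrir_pairs):
        cl = fftconvolve(mono, hl)
        cr = fftconvolve(mono, hr)
        out[:len(cl), 0] += g * cl
        out[:len(cr), 1] += g * cr
    return out

# Example: five directions across the instrument, placeholder HRIRs.
pairs = [(0.01 * np.random.randn(256), 0.01 * np.random.randn(256))
         for _ in range(5)]
xylo = 0.1 * np.random.randn(48000)   # stands in for the studio channel
stereo = render_distributed(xylo, pairs)
```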