20 June 2011
Multi-core Platforms for Immersive-Audio Applications
Course: Advanced Computer Architectures
Teacher: Prof. Cristina Silvano
Student: Silvio La Blasca 771338
Introduction on Immersive-Audio
Immersive Audio Systems:
- Sound Rendering: Wave Field Synthesis
- Sound Acquisition: Beamforming
Standard implementation is GPP based:
- Computationally intensive algorithms
- PROs: easily programmable, short development time
- CONs: processing bottlenecks and excessive power consumption
Alternative implementations:
- Based on Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs)
- Exploit multi-core parallelism to achieve speed-up and power saving
CASE STUDY: Multi-core Platforms for Beamforming and Wave Field Synthesis, Theodoropoulos, Kuzmanov & Gaydadjiev (2011)
Introduction on Immersive-Audio
Beamforming (BF): spatial filtering technique that estimates the direction of arrival of an audio signal in order to perform source separation.
- Example of filter-and-sum approach: each microphone channel signal is fed to a FIR filter that acts as a delay line
- FIR coefficients are set according to the source position (beamsteering)
Wave Field Synthesis (WFS): spatial audio reproduction technique that uses loudspeaker arrays to generate a sound wave field over a wide area (no sweet spot).
- Example of WFS array: for each sample, the distance between the source and each loudspeaker must be calculated
- The distance affects the signal amplitude and delay
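The filter-and-sum idea above can be sketched in a few lines of plain Python. This is only an illustration of the technique, not the paper's code; the function names and the toy two-tap steering coefficients are assumptions.

```python
# Minimal filter-and-sum beamformer sketch (pure Python, hypothetical names).

def fir(signal, coeffs):
    """Direct-form FIR filter: y[n] = sum_k coeffs[k] * signal[n-k]."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * signal[n - k]
        out.append(acc)
    return out

def filter_and_sum(channels, coeffs_per_channel):
    """Filter each microphone channel with its own FIR (whose taps encode
    the beamsteering delay) and sum the realigned channels."""
    filtered = [fir(ch, co) for ch, co in zip(channels, coeffs_per_channel)]
    return [sum(samples) for samples in zip(*filtered)]

# Two microphones see the same impulse, the second one one sample later.
# Delaying the first channel by one tap realigns them, so the sum
# reinforces the source instead of smearing it.
ch0 = [0.0, 1.0, 0.0, 0.0]
ch1 = [0.0, 0.0, 1.0, 0.0]
aligned = filter_and_sum([ch0, ch1], [[0.0, 1.0], [1.0, 0.0]])
print(aligned)  # -> [0.0, 0.0, 2.0, 0.0]
```

A real beamformer would use longer fractional-delay FIRs, but the structure (one FIR per channel, then an accumulation) is exactly what the GPU and FPGA designs later in the talk parallelize.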
Introduction on Immersive-Audio
Implementations in literature:
BF:
- Ultrasound imaging with 288 channels using 14 FPGAs and a GPU connected to a PC
- Delay-and-sum beamformer implementation on an Nvidia GeForce 8800
- Teleconferencing system based on a Texas Instruments DSP (TMS320C6201)
WFS:
- SonicEmotion WFS system based on an Intel Core2 Duo for 24 speakers
- A higher number of speakers is supported by PC-based implementations developed by Iosono and Delft University
- GPU-based implementation using Nvidia GeForce GTX285 and Tesla C1060
Considerations:
- Most of the implementations rely on standard GPPs
- Little or no performance evaluation is done with respect to processor features
GOAL: implement BF and WFS on GPUs and FPGAs, with a rough high-level design space exploration, and evaluate performance
Proposed Implementation
GPU Implementation of Beamforming:
Program flow:
1. STORE input data into GPU main memory
2. STORE FIR coefficients into GPU main memory
3. Perform DECIMATION for each channel
4. For each source perform: SOURCE EXTRACTION for each channel + MEM, ACCUMULATION of all channels + MEM, INTERPOLATION of source signal + MEM
5. STORE extracted source
Main features:
- 2 kernels: a flexible one for all FIR computations and one for accumulation
- The number of blocks matches the number of samples, the number of threads matches the filter size
- Up to 2 MB of space is needed to store the FIR coefficients
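The DECIMATION and INTERPOLATION steps in the flow above can be sketched as a serial pure-Python reference (the GPU version distributes this work across blocks and threads). The filter coefficients here are placeholders; a real design would use a proper anti-aliasing / anti-imaging low-pass FIR.

```python
# Serial reference for sample-rate conversion around the beamformer
# (illustrative only; identity coefficients stand in for a real low-pass).

def fir(signal, coeffs):
    """Direct-form FIR: y[n] = sum_k coeffs[k] * signal[n-k]."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * signal[n - k]
        out.append(acc)
    return out

def decimate(signal, coeffs, factor):
    """Low-pass with the given FIR, then keep every factor-th sample."""
    return fir(signal, coeffs)[::factor]

def interpolate(signal, coeffs, factor):
    """Insert factor-1 zeros between samples, then low-pass to remove the
    spectral images (a full design would also scale the gain by factor)."""
    up = []
    for s in signal:
        up.append(s)
        up.extend([0.0] * (factor - 1))
    return fir(up, coeffs)
```

With a pass-through filter, `decimate([1.0, 2.0, 3.0, 4.0], [1.0], 2)` yields `[1.0, 3.0]` and `interpolate([1.0, 2.0], [1.0], 2)` yields `[1.0, 0.0, 2.0, 0.0]`, which makes the sample-rate bookkeeping of steps 3 and 4 concrete.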
Proposed Implementation
GPU Implementation of WFS:
Program flow:
1. STORE input data into GPU main memory
2. For each source perform: FIR filtering + MEM, CALCULATE all speaker signals and ACCUMULATE with previous
3. STORE all speaker signals into GPU main memory
Main features:
- Uses the previous FIR kernel for filtering
- Each block of the WFS kernel processes a chunk of samples
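The per-speaker calculation in step 2 hinges on the source-to-speaker distance, which fixes both the delay and the amplitude of each loudspeaker signal. A minimal sketch, assuming a 48 kHz sample rate and a simple 1/r attenuation model (the helper name and the exact gain law are assumptions, not the paper's formula):

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, at roughly room temperature

def wfs_delay_gain(source, speaker, sample_rate=48000):
    """Per-speaker delay (in samples) and amplitude gain, both derived
    from the 2D source-to-speaker distance."""
    dist = math.hypot(source[0] - speaker[0], source[1] - speaker[1])
    delay_samples = dist / SPEED_OF_SOUND * sample_rate
    gain = 1.0 / max(dist, 1e-6)  # simple 1/r attenuation (assumption)
    return delay_samples, gain

# A 3-4-5 triangle gives a 5 m source-to-speaker distance:
delay, gain = wfs_delay_gain((3.0, 4.0), (0.0, 0.0))
```

On the GPU each block evaluates this pair for its chunk of samples and speakers, which is why the workload scales with the number of loudspeakers.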
Proposed Implementation
FPGA Implementation:
Beamforming module:
- The APU accelerates communication with the host processor
- The FCM BF controller initiates the processing phases
- All samples are stored in the local buffer and all channels are processed concurrently
- FIR filters include coefficient banks
Wave Field Synthesis module:
- Samples are filtered and stored in the WFS Engine buffer to be processed in parallel
- The loudspeaker signal computation is distributed to the Rendering Units according to the available resources
- The Preprocessor calculates the source distance
A complete prototype of the Immersive-Audio reconfigurable processor uses an FPGA for BF and WFS acceleration and 2 PowerPCs as host
Performance analysis
Hardware characteristics:
FPGAs:
- Performance evaluation with Xilinx tools
- The number of BF channels and WFS RUs has been considered for each FPGA
GPUs:
- Performance evaluation with the FF XIV benchmark
- The GTX460 is the only one with two levels of on-chip cache
GPP:
- As a comparison reference GPP, an Intel Core2 Duo @ 3.0 GHz has been used
Performance analysis
Performance results VS Core2 Duo:
- All execution times include memory access delays
- Real-time threshold is at 11264 ms
- Beamforming: 8 channels, 16 channels
- WFS: 32 speakers, 96 speakers
Performance analysis
Performance results: GTX275 VS FPGA
- In order to compare processing speed, memory access delay is subtracted from the execution time: in the GPU it accounts for more than 50% of the overall execution time
- Optimized GTX275 performance is evaluated considering the different integration technology: the processing time reduction is estimated referring to the ITRS
About power:
- DSPs and FPGAs require much lower power than GPPs, and FPGAs perform better than GPUs in this respect
- The amount of power is highly affected by the number of microphones and loudspeakers
Conclusions
General considerations:
- BF applications benefit from multi-core platforms, since signals can be processed concurrently; GPUs and FPGAs can achieve about an order of magnitude of speed-up
- WFS benefits even more: the proposed implementation can run up to two orders of magnitude faster than the GPP solution
- With respect to a standard PC, GPUs can reduce power consumption by 2.5 times, and FPGAs even more
Further considerations:
- Multi-core platforms are at the moment the best approach to increase the performance of Immersive-Audio systems
- Though parallelization is particularly effective for Immersive-Audio processing, like any computationally intensive application it would benefit from processing speed-up and memory delay reduction
- Despite a variety of experiments implementing both acquisition and rendering techniques on different platforms, a complete system has not been proposed
References
- D. Theodoropoulos, G. Kuzmanov, and G. Gaydadjiev, "Multi-Core Platforms for Beamforming and Wave Field Synthesis", IEEE Transactions on Multimedia, vol. 13, no. 2, April 2011.
- D. Theodoropoulos, C. B. Ciobanu, and G. Kuzmanov, "Wave field synthesis for 3D audio: Architectural prospectives", in Proc. ACM Int. Conf. Computing Frontiers, May 2009, pp. 127-136.
Thanks for your attention