Vampir Getting Started. Holger Brunst March 4th 2008


1 Vampir Getting Started Holger Brunst March 4th 2008

2 What is Vampir?
Program Monitoring, Visualization, and Analysis
Step 1: VampirTrace monitors your program's runtime behavior
  - Event-based
  - Produces detailed event traces rather than tabular profiles
  - Information is recorded in the Open Trace Format (OTF)
Step 2: Vampir visualizes the program's event trace graphically
  - Zooming
  - Timeline charts
  - Statistics and profiles

3 Vampir History
- PARvis at Research Center Jülich
- Vampir at Research Center Jülich
- Vampir at ZIH, Technische Universität Dresden
- Commercially available via Pallas GmbH, later via Intel
- Vampir was developed by ZIH, Dresden
- Successor: VampirServer (VNG)
  - Client/server architecture
  - Distributed event storage
  - Improved scalability
  - Works on clusters and SMPs

4 Vampir Versions Vampir + VampirTrace 3.0 Vampir + VampirTrace 4.0 / Intel Trace Analyzer Intel Trace Collector 4.0 Vampir 5.2 Intel Trace Analyzer 7.1 VampirTrace VampirServer 1.8 Intel Trace Collector 7.1

5 Monitoring: VampirTrace
- Open source program monitor
  - Available from Technische Universität Dresden, ZIH
  - Google for "tu dresden" and "vampirtrace"
- Recorded events
  - Function entry/exit
  - MPI and OpenMP events
  - Hardware/software performance counters (e.g. PAPI)
  - OS events: process creation, resource management
- Collected event properties
  - Time stamp
  - Location (process / thread / MPI)
  - MPI specifics like message size, etc.
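To make the recorded event properties concrete, here is a minimal sketch of what one trace record could look like. It is an illustration only: the names `Event`, `EventKind`, and `merge_by_time` are hypothetical and do not reflect the actual VampirTrace or OTF data layout.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EventKind(Enum):
    ENTER = "enter"      # function entry
    LEAVE = "leave"      # function exit
    SEND = "send"        # MPI point-to-point send
    RECV = "recv"        # MPI point-to-point receive
    COUNTER = "counter"  # performance counter sample


@dataclass(frozen=True)
class Event:
    timestamp: float               # time stamp (seconds since trace start, assumed unit)
    location: int                  # process / thread / MPI rank that produced the event
    kind: EventKind
    name: str = ""                 # function or counter name, if any
    peer: Optional[int] = None     # communication partner for SEND/RECV
    size: Optional[int] = None     # message size in bytes for SEND/RECV
    value: Optional[float] = None  # counter value for COUNTER


def merge_by_time(*streams):
    """Merge per-location event lists into one time-ordered trace."""
    return sorted((e for s in streams for e in s), key=lambda e: e.timestamp)


rank0 = [Event(0.0, 0, EventKind.ENTER, "main"),
         Event(1.0, 0, EventKind.SEND, peer=1, size=1024),
         Event(3.0, 0, EventKind.LEAVE, "main")]
rank1 = [Event(0.5, 1, EventKind.ENTER, "main"),
         Event(2.0, 1, EventKind.RECV, peer=0, size=1024),
         Event(2.5, 1, EventKind.LEAVE, "main")]
trace = merge_by_time(rank0, rank1)
```

Each record carries a time stamp and a location, which is what lets a timeline display place it on the right row at the right position.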

6 Performance Data: OTF Trace Format
- Open source trace file format (OTF)
  - Available from Technische Universität Dresden, ZIH
  - Google for "tu dresden" and "otf"
- Includes powerful libotf to be used in custom applications
- Human readable
- Fast searching and indexing
- Snapshots
- On-the-fly compression
- Actively developed
  - In cooperation with the University of Oregon (TAU) and Lawrence Livermore National Laboratory

7 Visualization: Vampir & VampirServer
- Vampir: All-in-one performance visualization (small problem sizes)
- VampirServer: Distributed high-end performance visualization
  - Client/server architecture
  - Parallel event processing
  - Runs on a (part of a) production environment
  - No need to transfer huge traces
  - Parallel I/O
- VampirServer (Browser)
  - Lightweight client on local workstation
  - Outer appearance identical to Vampir
  - Receives visual content only
  - Already adapted to display resolution (but no images)
  - Moderate network bandwidth and latency requirements
- Scales to trace data volumes >100 GB

8 Event-based Performance Analysis
[Figure: each CPU (1, 2, 3, 4, ... up to 10,000) runs the application with a monitor; the monitors write trace data over time, which feeds the performance visualization. Question posed: Enable Scalability?]

9 Scalable Tool Architecture
[Figure: performance visualization client; parallel data analysis server with a boss and workers 1 to n; trace data split into parts 1 to m]
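The boss/worker arrangement can be sketched in a few lines. This is a toy model, not VampirServer code: Python threads stand in for the server's MPI processes, and `analyze_part` and `boss` are hypothetical names. Each worker reduces one trace part to a partial profile, and the boss merges the partial results before anything is sent to the visualization client.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def analyze_part(part):
    """Worker: sum the time per function over one trace part."""
    totals = Counter()
    for func, duration in part:
        totals[func] += duration
    return totals


def boss(parts, n_workers=3):
    """Boss: hand the parts to the workers and merge their partial results."""
    merged = Counter()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for partial in pool.map(analyze_part, parts):
            merged.update(partial)  # adds values per function name
    return dict(merged)


# Three trace parts, each a list of (function, duration in seconds).
parts = [
    [("compute", 2.0), ("MPI_Send", 0.5)],
    [("compute", 1.5), ("MPI_Recv", 1.0)],
    [("MPI_Send", 0.25), ("compute", 3.0)],
]
profile = boss(parts)
```

The point of the design is that only the small merged result crosses the network, while the large trace parts stay with the workers.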

10 Vampir Displays: Global Process Timeline
- Shows timed application events:
  - processes
  - functions or function groups
  - messages
  - collective operations
  - I/O activities
- Context information on click
- Supports zooming (horizontal & vertical)
- Can control reference time interval of other displays

11 Vampir Displays: Global Process Timeline Global Process Timeline with Thumbnail View

12 Vampir Displays: Global Process Timeline Zoom

13 Vampir Displays: Global Process Timeline Zoom Further

14 Vampir Displays: Process Timeline
- Timeline for single processes
- Unfolded call stack in vertical dimension
- Shows:
  - functions/function groups
  - messages, collective communication
  - I/O activities
  - performance counter values
- Supports zooming
- Context display

15 Vampir Displays: Process Timeline + MFLOPS Single Process Timeline with MFLOPS Graph

16 Vampir Displays: Process Timeline + Counters Single Process Timeline with MFLOPS and Instructions per Second Graph

17 Vampir Displays: Summary Timeline
- Histogram of program activities (imagine a 90° rotated profile/summary chart over time)
- Shows the number of processes doing a certain activity
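A minimal sketch of how such a histogram could be computed, assuming the trace has already been reduced to per-process activity intervals. The function name `summary_timeline` and the midpoint sampling are choices made for this illustration, not a description of Vampir's implementation.

```python
from collections import Counter


def summary_timeline(intervals, t_end, n_bins):
    """Count, per time bin, how many processes are in each activity.

    intervals: (process, activity, start, end) tuples, half-open [start, end).
    Each bin is sampled at its midpoint, which is a simplification.
    """
    width = t_end / n_bins
    bins = []
    for i in range(n_bins):
        t = (i + 0.5) * width
        counts = Counter(act for _proc, act, start, end in intervals
                         if start <= t < end)
        bins.append(dict(counts))
    return bins


intervals = [
    (0, "compute", 0.0, 3.0), (0, "MPI", 3.0, 4.0),
    (1, "compute", 0.0, 1.0), (1, "MPI", 1.0, 4.0),
]
timeline = summary_timeline(intervals, t_end=4.0, n_bins=4)
```

Reading the bins left to right shows process 1 entering MPI early and waiting until process 0 catches up, which is the kind of imbalance this display makes visible.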

18 Vampir Displays: Summary Timeline Zoom

19 Vampir Displays: Summary Chart
- Statistical overview over functions or groups of functions
- Modes: bar chart, pie chart, table
- Global (for all processes) or local (for a single process)
- Exclusive time, inclusive time, occurrences
- Absolute or logarithmic scale
- Absolute values or percentages
- Supports zooming
- Automatically adjusts to the current timeline interval
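The difference between exclusive and inclusive time is easiest to see in code. The sketch below assumes a time-ordered list of enter/leave events for a single process; `function_times` is a hypothetical helper, not part of Vampir. Inclusive time covers a function and everything it calls, while exclusive time subtracts the time spent in callees.

```python
from collections import defaultdict


def function_times(events):
    """events: time-ordered (timestamp, kind, function), kind is 'enter' or 'leave'."""
    inclusive = defaultdict(float)
    exclusive = defaultdict(float)
    count = defaultdict(int)
    stack = []  # entries: [function, enter_time, time_spent_in_callees]
    for t, kind, func in events:
        if kind == "enter":
            stack.append([func, t, 0.0])
            count[func] += 1
        else:
            name, t_enter, child_time = stack.pop()
            assert name == func, "unbalanced enter/leave"
            total = t - t_enter
            inclusive[name] += total
            exclusive[name] += total - child_time
            if stack:
                stack[-1][2] += total  # charge this call to the caller's callee time
    return dict(inclusive), dict(exclusive), dict(count)


events = [
    (0.0, "enter", "main"),
    (1.0, "enter", "solve"),
    (2.0, "enter", "MPI_Allreduce"),
    (3.0, "leave", "MPI_Allreduce"),
    (6.0, "leave", "solve"),
    (8.0, "leave", "main"),
]
inclusive, exclusive, count = function_times(events)
```

Note that with recursive functions this simple accumulation counts nested invocations twice in the inclusive total; the sketch does not handle that case.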

20 Vampir Displays: Summary Chart Grouped / Comprehensive Function Statistics

21 Vampir Displays: Summary Chart Pie and Table Representations

22 Vampir Displays: Message Statistics
- Sender/receiver matrix
- Zooming
- Displayed message properties: length, rate, duration, count
- Optionally as: minimum, maximum, average, sum
- Message length histogram
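As an illustration of the sender/receiver matrix, the following sketch aggregates message lengths per (sender, receiver) pair. The helper `message_matrix` and its input format are assumptions made for this example; rate and duration would additionally need send and receive time stamps.

```python
def message_matrix(messages, n_procs):
    """messages: iterable of (sender, receiver, size_bytes)."""
    count = [[0] * n_procs for _ in range(n_procs)]
    total = [[0] * n_procs for _ in range(n_procs)]
    largest = [[0] * n_procs for _ in range(n_procs)]
    for src, dst, size in messages:
        count[src][dst] += 1
        total[src][dst] += size
        largest[src][dst] = max(largest[src][dst], size)
    # Average length per pair; 0.0 where no message was exchanged.
    average = [[total[s][d] / count[s][d] if count[s][d] else 0.0
                for d in range(n_procs)] for s in range(n_procs)]
    return {"count": count, "sum": total, "max": largest, "average": average}


messages = [(0, 1, 1024), (0, 1, 2048), (1, 0, 512), (2, 0, 4096)]
stats = message_matrix(messages, n_procs=3)
```

Rows index the sender and columns the receiver, so `stats["sum"][0][1]` is the total number of bytes process 0 sent to process 1.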

23 Vampir Displays: Message Statistics Message Statistics with Thumbnail

24 Vampir Displays: Message Statistics Zoomed Message Statistics

25 Vampir Displays: Call Tree
- Shows function call hierarchy
- Provides caller/callee relationship
- Invocation count
- Min/max runtime
- Level folding
- Searching
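A call tree with invocation counts and min/max runtimes can be derived from the same enter/leave events. The sketch below is a simplified model for one process; `CallNode`, `build_call_tree`, and `render` are hypothetical names, and the runtimes are inclusive.

```python
class CallNode:
    def __init__(self, name):
        self.name = name
        self.children = {}          # callee name -> CallNode
        self.count = 0              # invocation count
        self.min_time = float("inf")
        self.max_time = 0.0


def build_call_tree(events):
    """events: time-ordered (timestamp, kind, function), kind is 'enter' or 'leave'."""
    root = CallNode("<root>")
    stack = [(root, 0.0)]
    for t, kind, func in events:
        if kind == "enter":
            parent = stack[-1][0]
            node = parent.children.get(func)
            if node is None:
                node = parent.children[func] = CallNode(func)
            stack.append((node, t))
        else:
            node, t_enter = stack.pop()
            runtime = t - t_enter
            node.count += 1
            node.min_time = min(node.min_time, runtime)
            node.max_time = max(node.max_time, runtime)
    return root


def render(node, depth=0):
    """Return the tree as indented text lines, one per caller/callee edge."""
    lines = []
    for child in node.children.values():
        lines.append(f"{'  ' * depth}{child.name} x{child.count} "
                     f"min={child.min_time:.1f} max={child.max_time:.1f}")
        lines.extend(render(child, depth + 1))
    return lines


events = [
    (0.0, "enter", "main"),
    (1.0, "enter", "step"), (2.0, "enter", "MPI_Send"), (2.5, "leave", "MPI_Send"),
    (3.0, "leave", "step"),
    (3.0, "enter", "step"), (6.0, "leave", "step"),
    (7.0, "leave", "main"),
]
tree = build_call_tree(events)
```

Calling `"\n".join(render(tree))` gives an indented listing; level folding would amount to stopping the recursion at a chosen depth.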

26 Vampir Displays: Call Tree

27 Vampir Displays: Global Filters
- Ignore certain items globally:
  - Processes/threads or groups of them
  - Messages by communicator or by tag
  - Collective operations by communicator or by type
  - I/O events
    - by communicator
    - by file
    - by read/write access

28 Vampir Displays: Event Filters Event Filter Dialog

29 Vampir Displays: Process Filter

30 Performance Flaws in Communication
- Communication as such (domination over computation)
- Late sender, late receiver
- Point-to-point messages instead of collective communication
- Unmatched messages
- Overcharge of MPI's buffers
- Bursts of large messages (bandwidth)
- Frequent short messages (latency)
- Unnecessary synchronization (barrier)
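The late sender and late receiver patterns can be stated as a simple comparison of time stamps. The sketch below assumes each message has already been matched into a pair of times at which the sender entered its send call and the receiver entered its receive call; `classify_message` is a hypothetical helper, and real tools apply more refined rules.

```python
def classify_message(send_enter, recv_enter):
    """Return (pattern, waiting_time) for one matched send/receive pair."""
    if send_enter > recv_enter:
        # The receiver was already waiting when the sender started.
        return "late sender", send_enter - recv_enter
    if recv_enter > send_enter:
        # The sender started first; it only loses time if its send blocks.
        return "late receiver", recv_enter - send_enter
    return "on time", 0.0


# (send_enter, recv_enter) pairs in seconds.
pairs = [(5.0, 2.0), (1.0, 4.0), (3.0, 3.0)]
results = [classify_message(s, r) for s, r in pairs]
```

In a timeline display these cases show up as long receive or send bars that end right after the partner finally enters its call.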

31 Further Performance Flaws
- Memory bound computation
  - inefficient L1/L2/L3 cache usage
  - TLB misses
  - detectable via HW performance counters
- I/O bound computation
  - slow input/output
  - sequential I/O on single process
  - I/O load imbalance
- Exception handling

32 Performance Effects of Tracing
- Measurement overhead
  - can be grave for tiny function calls
  - solve with selective instrumentation
- Long, asynchronous trace buffer flushes
- Too many concurrent counters
  - more data
- Heisenbugs

33 Examples (1) Idle OpenMP Threads

34 Examples (2) Unbalanced Computation/Communication

35 Examples (3) Low FP rate due to heavy FP exceptions

36 Examples (4) Performance Visualization of Cell/B.E.

37 Examples (5) Cell/B.E. with DMA

38 Examples (6) Thread Creation and Load Balancing Issues

39 Examples (7) Improved Thread Creation

40 Vampir at VI-HPS: VampirTrace Monitor (1)
- Type: module load vampir
- Supports OpenMP and MPI
- VampirTrace provides compiler wrappers for automatic program instrumentation: vtcc, vtcxx, vtf77, vtf90
- Supported compilers:
  - GNU (gcc, g++, gfortran, g95)
  - Intel version 9.x (icc, icpc, ifort)
  - Intel version 10.0 (icc, icpc, ifort)
  - Portland Group (PGI) (pgcc, pgCC, pgf90, pgf77)
  - SUN Fortran 90 (cc, CC, f90)
  - IBM (xlcc, xlCC, xlf90)
  - NEC SX (sxcc, sxc++, sxf90)

41 Vampir at VI-HPS: VampirTrace Monitor (2)
- Replace cc by vtcc in your makefile and you are set
- Alternative instrumentation types:
  - Manual instrumentation API
  - DynInst binary instrumentation
- Performance counters
  - PAPI processor counters
  - I/O bandwidth tracking
  - Memory allocation counters
  - User defined counters
- Example: smg2000

42 Vampir at VI-HPS: VampirServer (1)
- Log on to the Xeon cluster: ssh -Y
- Load VampirServer module: module load vampir-server
- Launch VampirServer daemon: vngd-start.sh -p 300??
- Results in:
    Launching VampirServer Version on 5 MPI processes
    Found license file: /opt/vampir-server/1.8.0/lic.dat
    Running...
    Server listens on: linuxhtc02.rz.rwth-aachen.de:300??

43 Vampir at VI-HPS: VampirServer (2)
- Results in:
    Launching VampirServer Version on 5 MPI processes
    Found license file: /opt/vampirserver/1.8.0/lic.dat
    Running...
    Server listens on: node02:30034
- Launch visualization client in separate shell with:
    module load vampir-server
    vng -a localhost -p 300??

44 Thank You!



More information

Huawei ilab Superior Experience. Research Report on Pokémon Go's Requirements for Mobile Bearer Networks. Released by Huawei ilab

Huawei ilab Superior Experience. Research Report on Pokémon Go's Requirements for Mobile Bearer Networks. Released by Huawei ilab Huawei ilab Superior Experience Research Report on Pokémon Go's Requirements for Mobile Bearer Networks Released by Huawei ilab Document Description The document analyzes Pokémon Go, a global-popular game,

More information

Traffic Monitoring and Management for UCS

Traffic Monitoring and Management for UCS Traffic Monitoring and Management for UCS Session ID- Steve McQuerry, CCIE # 6108, UCS Technical Marketing @smcquerry www.ciscolivevirtual.com Agenda UCS Networking Overview Network Statistics in UCSM

More information

Project 5: Optimizer Jason Ansel

Project 5: Optimizer Jason Ansel Project 5: Optimizer Jason Ansel Overview Project guidelines Benchmarking Library OoO CPUs Project Guidelines Use optimizations from lectures as your arsenal If you decide to implement one, look at Whale

More information

Development of a parallel, tree-based neighbour-search algorithm

Development of a parallel, tree-based neighbour-search algorithm Mitglied der Helmholtz-Gemeinschaft Development of a parallel, tree-based neighbour-search algorithm for the tree-code PEPC 28.09.2010 Andreas Breslau Outline 1 Motivation 2 Short introduction to tree-codes

More information

A High Definition Motion JPEG Encoder Based on Epuma Platform

A High Definition Motion JPEG Encoder Based on Epuma Platform Available online at www.sciencedirect.com Procedia Engineering 29 (2012) 2371 2375 2012 International Workshop on Information and Electronics Engineering (IWIEE) A High Definition Motion JPEG Encoder Based

More information

Large-scale Stability and Performance of the Ceph File System

Large-scale Stability and Performance of the Ceph File System Large-scale Stability and Performance of the Ceph File System Vault 2017 Patrick Donnelly Software Engineer 2017 March 22 Introduction to Ceph Distributed storage All components scale horizontally No single

More information

Why Should We Care? Everyone uses plotting But most people ignore or are unaware of simple principles Default plotting tools are not always the best

Why Should We Care? Everyone uses plotting But most people ignore or are unaware of simple principles Default plotting tools are not always the best Elementary Plots Why Should We Care? Everyone uses plotting But most people ignore or are unaware of simple principles Default plotting tools are not always the best More importantly, it is easy to lie

More information

Towards Real-Time Volunteer Distributed Computing

Towards Real-Time Volunteer Distributed Computing Towards Real-Time Volunteer Distributed Computing Sangho Yi 1, Emmanuel Jeannot 2, Derrick Kondo 1, David P. Anderson 3 1 INRIA MESCAL, 2 RUNTIME, France 3 UC Berkeley, USA Motivation Push towards large-scale,

More information

Data Quality Monitoring of the CMS Pixel Detector

Data Quality Monitoring of the CMS Pixel Detector Data Quality Monitoring of the CMS Pixel Detector 1 * Purdue University Department of Physics, 525 Northwestern Ave, West Lafayette, IN 47906 USA E-mail: petra.merkel@cern.ch We present the CMS Pixel Data

More information

Building a Cell Ecosystem. David A. Bader

Building a Cell Ecosystem. David A. Bader Building a Cell Ecosystem David A. Bader Acknowledgment of Support National Science Foundation CSR: A Framework for Optimizing Scientific Applications (06-14915) CAREER: High-Performance Algorithms for

More information

Computational Scalability of Large Size Image Dissemination

Computational Scalability of Large Size Image Dissemination Computational Scalability of Large Size Image Dissemination Rob Kooper* a, Peter Bajcsy a a National Center for Super Computing Applications University of Illinois, 1205 W. Clark St., Urbana, IL 61801

More information

EFFICIENT IMPLEMENTATIONS OF OPERATIONS ON RUNLENGTH-REPRESENTED IMAGES

EFFICIENT IMPLEMENTATIONS OF OPERATIONS ON RUNLENGTH-REPRESENTED IMAGES EFFICIENT IMPLEMENTATIONS OF OPERATIONS ON RUNLENGTH-REPRESENTED IMAGES Øyvind Ryan Department of Informatics, Group for Digital Signal Processing and Image Analysis, University of Oslo, P.O Box 18 Blindern,

More information

Parallel Storage and Retrieval of Pixmap Images

Parallel Storage and Retrieval of Pixmap Images Parallel Storage and Retrieval of Pixmap Images Roger D. Hersch Ecole Polytechnique Federale de Lausanne Lausanne, Switzerland Abstract Professionals in various fields such as medical imaging, biology

More information

Characterizing, Optimizing, and Auto-Tuning Applications for Energy Efficiency

Characterizing, Optimizing, and Auto-Tuning Applications for Energy Efficiency PhD Dissertation Proposal Characterizing, Optimizing, and Auto-Tuning Applications for Efficiency Wei Wang The Committee: Chair: Dr. John Cavazos Member: Dr. Guang R. Gao Member: Dr. James Clause Member:

More information

Contribution to the Smecy Project

Contribution to the Smecy Project Alessio Pascucci Contribution to the Smecy Project Study some performance critical parts of Signal Processing Applications Study the parallelization methodology in order to achieve best performances on

More information

Statistical Pulse Measurements using USB Power Sensors

Statistical Pulse Measurements using USB Power Sensors Statistical Pulse Measurements using USB Power Sensors Today s modern USB Power Sensors are capable of many advanced power measurements. These Power Sensors are capable of demodulating the signal and processing

More information

TeleTrader FlashChart

TeleTrader FlashChart TeleTrader FlashChart Symbols and Chart Settings With TeleTrader FlashChart you can display several symbols (for example indices, securities or currency pairs) in an interactive chart. You can also add

More information

Top 10 Things at Esri UC 2015 & ArcGIS Pro

Top 10 Things at Esri UC 2015 & ArcGIS Pro Top 10 Things at Esri UC 2015 & ArcGIS Pro Jim Tochterman, VP - Research & Development Who is BCS? Formed in 1998 in Aiken, SC Privately held and woman-owned Primary focus is GIS for Public Safety Esri

More information

STK110. Chapter 2: Tabular and Graphical Methods Lecture 1 of 2. ritakeller.com. mathspig.wordpress.com

STK110. Chapter 2: Tabular and Graphical Methods Lecture 1 of 2. ritakeller.com. mathspig.wordpress.com STK110 Chapter 2: Tabular and Graphical Methods Lecture 1 of 2 ritakeller.com mathspig.wordpress.com Frequency distribution Example Data from a sample of 50 soft drink purchases Frequency Distribution

More information

Actors Play backend role for Internet of Things

Actors Play backend role for Internet of Things Actors Play backend role for Internet of Things Grzegorz Kossakowski @gkossakowski Early Draft Scala Camp, Kraków May 2014 Internet of Things The Internet of Things (IoT) refers to uniquely identifiable

More information

Introduction to Real-Time Systems

Introduction to Real-Time Systems Introduction to Real-Time Systems Real-Time Systems, Lecture 1 Martina Maggio and Karl-Erik Årzén 16 January 2018 Lund University, Department of Automatic Control Content [Real-Time Control System: Chapter

More information

COMET DISTRIBUTED ELEVATOR CONTROLLER CASE STUDY

COMET DISTRIBUTED ELEVATOR CONTROLLER CASE STUDY COMET DISTRIBUTED ELEVATOR CONTROLLER CASE STUDY System Description: The distributed system has multiple nodes interconnected via LAN and all communications between nodes are via loosely coupled message

More information

Stress Testing the OpenSimulator Virtual World Server

Stress Testing the OpenSimulator Virtual World Server Stress Testing the OpenSimulator Virtual World Server Introduction OpenSimulator (http://opensimulator.org) is an open source project building a general purpose virtual world simulator. As part of a larger

More information

Getting Started Guide

Getting Started Guide MaxEye Digital Audio and Video Signal Generation ISDB-T Signal Generation Toolkit Version 2.0.0 Getting Started Guide Contents 1 Introduction... 3 2 Installed File Location... 3 2.1 Soft Front Panel...

More information

Fast and Scalable Eigensolvers for Multicore and Hybrid Architectures

Fast and Scalable Eigensolvers for Multicore and Hybrid Architectures Fast and Scalable Eigensolvers for Multicore and Hybrid Architectures Paolo Bientinesi AICES, RWTH Aachen pauldj@aices.rwth-aachen.de 40th SPEEDUP Workshop on High-Performance Computing February 6 7, 2012

More information

Enhancing System Architecture by Modelling the Flash Translation Layer

Enhancing System Architecture by Modelling the Flash Translation Layer Enhancing System Architecture by Modelling the Flash Translation Layer Robert Sykes Sr. Dir. Firmware August 2014 OCZ Storage Solutions A Toshiba Group Company Introduction This presentation will discuss

More information

Fixed-function (FF) implementation for PSoC 3 and PSoC 5LP devices

Fixed-function (FF) implementation for PSoC 3 and PSoC 5LP devices 3.30 Features 8- or 16-bit resolution Multiple pulse width output modes Configurable trigger Configurable capture Configurable hardware/software enable Configurable dead band Multiple configurable kill

More information

Appendix A ACE exam objectives map

Appendix A ACE exam objectives map A 1 Appendix A ACE exam objectives map This appendix covers these additional topics: A ACE exam objectives for Photoshop CS6, with references to corresponding coverage in ILT Series courseware. A 2 Photoshop

More information

IMPROVING SCALABILITY IN MMOGS - A NEW ARCHITECTURE -

IMPROVING SCALABILITY IN MMOGS - A NEW ARCHITECTURE - IMPROVING SCALABILITY IN MMOGS - A NEW ARCHITECTURE - by Philippe David & Ariel Vardi Georgia Institute of Technology Outline 1.MMOGs: tremendous growth 2.Traditional MMOGs architecture and its flaws 3.Related

More information

Understanding OpenGL

Understanding OpenGL This document provides an overview of the OpenGL implementation in Boris Red. About OpenGL OpenGL is a cross-platform standard for 3D acceleration. GL stands for graphics library. Open refers to the ongoing,

More information

Electron Microscopy RADIUS. Control & Imaging Software. RADIUS - The way forward in electron microscopy

Electron Microscopy RADIUS. Control & Imaging Software. RADIUS - The way forward in electron microscopy RADIUS - The way forward in electron microscopy Electron Microscopy RADIUS Control & Imaging Software THE ESSENCE OF ELECTRON MICROSCOPY: RADIUS RADIUS is the visionary software for electron microscopy

More information