Use Nvidia Performance Primitives (NPP) in Deep Learning Training. Yang Song

Size: px
Start display at page:

Download "Use Nvidia Performance Primitives (NPP) in Deep Learning Training. Yang Song"

Transcription

1 Use Nvidia Performance Primitives (NPP) in Deep Learning Training Yang Song

2 Outline Introduction Function Categories Performance Results Deep Learning Specific Further Information

3 What is NPP? Image+Signal Library, runs on CUDA-enable GPUs Huge amount High performance Over 6,000 in CUDA 9.0, still growing 5x ~ 100x than CPU-only implementation No GPU kernels required. Ease of use Integrate well with the existing project. Primitives recombine to solve wide range of problems.

4 Function Categories (Image) Image Functions Amount Arithmetic and Logic 901 Color Conversion 416 Compression 25 Data Exchange and Initialization 695 Filtering 902 Geometry Transform 315 Morphological 98 Statistical 1523 Threshold and Compare 208 Total Amount 5083

5 Function Categories (Signal) Signal Functions Amount Arithmetic and Logic 370 Data Exchange and Initialization 108 Statistical 385 Threshold 60 Total Amount 923

6 Performance Results (Image) Image Speedup Chart Image/LogicalOperations/ImageRShif Image/ThresholdAndCompareOperati Image/StatisticsFunctionsTests/Norm Image/LogicalOperations/ImageXor_ Image/StatisticsFunctionsTests/Norm Image/StatisticsFunctionsTests/MinM Image/StatisticsFunctionsTests/Norm Image/LogicalOperations/ImageAnd_ Image/Arithmetic/ImageAdd_C3R_Te Image/FilterTests/FilterPrewittVert/ Image/LogicalOperations/ImageOr_A Image/ThresholdAndCompareOperati Image/StatisticsFunctionsTests/Norm Image/StatisticsFunctionsTests/DotP Image/FilterBorderTests/FilterGauss Image/LogicalOperations/ImageAnd_ Image/LogicalOperations/ImageOr_C Image/GeometryTransforms/Mirror/ Image/Arithmetic/ImageMul_C1R_Te Image/Arithmetic/ImageSubC_C4R_T Image/FilterBorderTests/GradientVe Image/ColorConversion/GammaCorre Image/FilterTests/Filter32f/16s_C4R Image/StatisticsFunctionsTests/MinIn Image/FilterTests/SumWindow/Colu Image/Arithmetic/ImageMulC_C1R_T Image/FilterTests/FilterLaplace/32f Image/StatisticsFunctionsTests/Mean Image/LogicalOperations/ImageOrC_ Image/FilterBorderTests/GradientVe Image/GeometryTransforms/Mirror/ Image/LogicalOperations/ImageRShif Image/Arithmetic/ImageMulCScale_C Image/FilterBorderTests/GradientVe Image/Arithmetic/ImageAddC_C3R_T Image/ThresholdAndCompareOperati Image/FilterBorderTests/FilterBorde Image/Arithmetic/ImageSubC_C4R_T Image/Arithmetic/ImageMulC_C4R_T Image/ColorProcessingTests/LUT_Cu Image/Arithmetic/ImageDiv_C4R_Te Image/Arithmetic/ImageDiv_Round_ Image/Arithmetic/ImageAddC_C1R_T Image/Arithmetic/ImageDiv_Round_ Image/Arithmetic/ImageDiv_Round_ Image/Arithmetic/ImageDiv_Round_ Image/FilterBorderTests/GradientVe Image/FilterBorderTests/GradientVe Image/Arithmetic/ImageDiv_Round_ Image/Arithmetic/ImageDiv_C1R_Te Image/Arithmetic/ImageDiv_Round_ Image/FilterBorderTests/GradientVe GPU: GTX 1080; CPU: NPP: 8.0; IPP: ; RAM: 64GB.

7 Performance Results (Signal) Signal Speedup Chart Signal/VectorInitializationTests/Se Signal/StatisticalFunctionTests/Nor Signal/Conversion/Convert/32f8u_ Signal/VectorInitializationTests/Sig Signal/Conversion/Convert/64f32s_ Signal/ArithmeticTests/SignalSubC Signal/StatisticalFunctionTests/Ma Signal/StatisticalFunctionTests/Ma Signal/Conversion/SignalThreshold Signal/ArithmeticTests/SignalAbsT Signal/ArithmeticTests/SignalSubC Signal/ArithmeticTests/SignalSubC Signal/ArithmeticTests/SignalAbsT Signal/ArithmeticTests/SignalMulT Signal/StatisticalFunctionTests/Dot Signal/ArithmeticTests/SignalMulT Signal/StatisticalFunctionTests/Nor Signal/StatisticalFunctionTests/Ma Signal/VectorInitializationTests/Se Signal/ArithmeticTests/SignalMulC Signal/ArithmeticTests/SignalAbsT Signal/VectorInitializationTests/Se Signal/StatisticalFunctionTests/Dot Signal/ArithmeticTests/SignalAddC Signal/ArithmeticTests/SignalMulT Signal/ArithmeticTests/SignalSubC Signal/ArithmeticTests/SignalAddC Signal/ArithmeticTests/SignalSqrtT Signal/StatisticalFunctionTests/Ma Signal/ArithmeticTests/SignalMulT Signal/ArithmeticTests/SignalMulC Signal/ArithmeticTests/SignalMulT Signal/ArithmeticTests/SignalMulC Signal/ArithmeticTests/SignalSubC Signal/ArithmeticTests/SignalSubC Signal/ArithmeticTests/SignalMulT Signal/ArithmeticTests/SignalDivCR Signal/StatisticalFunctionTests/Ma Signal/Conversion/Convert/16s64f_ Signal/ArithmeticTests/SignalSqrtT Signal/ArithmeticTests/SignalSqrtT Signal/Conversion/SignalThreshold Signal/ArithmeticTests/SignalMulC Signal/Conversion/Convert/16s8s_S Signal/ArithmeticTests/SignalDivCR GPU: GTX 1080; CPU: NPP: 8.0; IPP: ; RAM: 64GB.

8 GPU-based JPEG codec (coming in 9.0) Fixed bugs in jpeg routines Optimized performance JPEG decoder baseline: 1000 Mpixel/s* JPEG encoder baseline: 4000 Mpixel/s* Reference: jpegnpp sample * Numbers measured on GTX 1080

9 Batch Processing (coming in 9.0) bandwidth Resize Batch GPU: GTX Peak bandwidth: 320GB/s.

10 Batch Processing (coming in 9.0) bandwidth 250 ColorTwist Batch GPU: GTX Peak bandwidth: 320GB/s.

11 Batch Processing (coming in 9.0) bandwidth Mirror Batch GPU: GTX Peak bandwidth: 320GB/s.

12 Further Reading/Resources NPP is freely available as part of the CUDA Toolkit at Source code samples demonstrating use of the NPP library: jpegnpp cannyedgedetectornpp histequalizationnpp freeimageinteropnpp FilterBorderControlNPP Report/file bugs through Nvidia forum or bug report system.

13

Ben Baker. Sponsored by:

Ben Baker. Sponsored by: Ben Baker Sponsored by: Background Agenda GPU Computing Digital Image Processing at FamilySearch Potential GPU based solutions Performance Testing Results Conclusions and Future Work 2 CPU vs. GPU Architecture

More information

Liu Yang, Bong-Joo Jang, Sanghun Lim, Ki-Chang Kwon, Suk-Hwan Lee, Ki-Ryong Kwon 1. INTRODUCTION

Liu Yang, Bong-Joo Jang, Sanghun Lim, Ki-Chang Kwon, Suk-Hwan Lee, Ki-Ryong Kwon 1. INTRODUCTION Liu Yang, Bong-Joo Jang, Sanghun Lim, Ki-Chang Kwon, Suk-Hwan Lee, Ki-Ryong Kwon 1. INTRODUCTION 2. RELATED WORKS 3. PROPOSED WEATHER RADAR IMAGING BASED ON CUDA 3.1 Weather radar image format and generation

More information

Computational Efficiency of the GF and the RMF Transforms for Quaternary Logic Functions on CPUs and GPUs

Computational Efficiency of the GF and the RMF Transforms for Quaternary Logic Functions on CPUs and GPUs 5 th International Conference on Logic and Application LAP 2016 Dubrovnik, Croatia, September 19-23, 2016 Computational Efficiency of the GF and the RMF Transforms for Quaternary Logic Functions on CPUs

More information

CUDA-Accelerated Satellite Communication Demodulation

CUDA-Accelerated Satellite Communication Demodulation CUDA-Accelerated Satellite Communication Demodulation Renliang Zhao, Ying Liu, Liheng Jian, Zhongya Wang School of Computer and Control University of Chinese Academy of Sciences Outline Motivation Related

More information

Matthew Grossman Mentor: Rick Brownrigg

Matthew Grossman Mentor: Rick Brownrigg Matthew Grossman Mentor: Rick Brownrigg Outline What is a WMS? JOCL/OpenCL Wavelets Parallelization Implementation Results Conclusions What is a WMS? A mature and open standard to serve georeferenced imagery

More information

GPU-accelerated SDR Implementation of Multi-User Detector for Satellite Return Links

GPU-accelerated SDR Implementation of Multi-User Detector for Satellite Return Links DLR.de Chart 1 GPU-accelerated SDR Implementation of Multi-User Detector for Satellite Return Links Chen Tang chen.tang@dlr.de Institute of Communication and Navigation German Aerospace Center DLR.de Chart

More information

DESIGNING GAMES FOR NVIDIA GRID

DESIGNING GAMES FOR NVIDIA GRID DESIGNING GAMES FOR NVIDIA GRID BEST PRACTICES GUIDE Eric Young, DevTech Engineering Manager for GRID AGENDA Onboard Games on to NVIDIA GRID GamePad Support! Configurable Game Settings Optimizing your

More information

Exploiting the Unused Part of the Brain

Exploiting the Unused Part of the Brain Exploiting the Unused Part of the Brain Deep Learning and Emerging Technology For High Energy Physics Jean-Roch Vlimant A 10 Megapixel Camera CMS 100 Megapixel Camera CMS Detector CMS Readout Highly heterogeneous

More information

Real-time Pulsar Timing signal processing on GPUs

Real-time Pulsar Timing signal processing on GPUs Real-Time Pulsar Timing Signal Processing on GPUs Plan : Pulsar Timing Instrumentations LPC2E, CNRS Orléans - FRANCE Ismaël Cognard, Gilles Theureau, Grégory Desvignes, Cédric Viou, Dalal Ait-Allal Pulsars

More information

Mosaic: A GPU Memory Manager with Application-Transparent Support for Multiple Page Sizes

Mosaic: A GPU Memory Manager with Application-Transparent Support for Multiple Page Sizes Mosaic: A GPU Memory Manager with Application-Transparent Support for Multiple Page Sizes Rachata Ausavarungnirun Joshua Landgraf Vance Miller Saugata Ghose Jayneel Gandhi Christopher J. Rossbach Onur

More information

Monte Carlo integration and event generation on GPU and their application to particle physics

Monte Carlo integration and event generation on GPU and their application to particle physics Monte Carlo integration and event generation on GPU and their application to particle physics Junichi Kanzaki (KEK) GPU2016 @ Rome, Italy Sep. 26, 2016 Motivation Increase of amount of LHC data (raw &

More information

GPU-based data analysis for Synthetic Aperture Microwave Imaging

GPU-based data analysis for Synthetic Aperture Microwave Imaging GPU-based data analysis for Synthetic Aperture Microwave Imaging 1 st IAEA Technical Meeting on Fusion Data Processing, Validation and Analysis 1 st -3 rd June 2015 J.C. Chorley 1, K.J. Brunner 1, N.A.

More information

Improving GPU Performance via Large Warps and Two-Level Warp Scheduling

Improving GPU Performance via Large Warps and Two-Level Warp Scheduling Improving GPU Performance via Large Warps and Two-Level Warp Scheduling Veynu Narasiman The University of Texas at Austin Michael Shebanow NVIDIA Chang Joo Lee Intel Rustam Miftakhutdinov The University

More information

Kandao Studio. User Guide

Kandao Studio. User Guide Kandao Studio User Guide Contents 1. Product Introduction 1.1 Function 2. Hardware Requirement 3. Directions for Use 3.1 Materials Stitching 3.1.1 Source File Export 3.1.2 Source Files Import 3.1.3 Material

More information

6 TH INTERNATIONAL CONFERENCE ON APPLIED INTERNET AND INFORMATION TECHNOLOGIES 3-4 JUNE 2016, BITOLA, R. MACEDONIA PROCEEDINGS

6 TH INTERNATIONAL CONFERENCE ON APPLIED INTERNET AND INFORMATION TECHNOLOGIES 3-4 JUNE 2016, BITOLA, R. MACEDONIA PROCEEDINGS 6 TH INTERNATIONAL CONFERENCE ON APPLIED INTERNET AND INFORMATION TECHNOLOGIES 3-4 JUNE 2016, BITOLA, R. MACEDONIA PROCEEDINGS Editor: Publisher: Prof. Pece Mitrevski, PhD Faculty of Information and Communication

More information

Track and Vertex Reconstruction on GPUs for the Mu3e Experiment

Track and Vertex Reconstruction on GPUs for the Mu3e Experiment Track and Vertex Reconstruction on GPUs for the Mu3e Experiment Dorothea vom Bruch for the Mu3e Collaboration GPU Computing in High Energy Physics, Pisa September 11th, 2014 Physikalisches Institut Heidelberg

More information

GPU-accelerated track reconstruction in the ALICE High Level Trigger

GPU-accelerated track reconstruction in the ALICE High Level Trigger GPU-accelerated track reconstruction in the ALICE High Level Trigger David Rohr for the ALICE Collaboration Frankfurt Institute for Advanced Studies CHEP 2016, San Francisco ALICE at the LHC The Large

More information

Eyedentify MMR SDK. Technical sheet. Version Eyedea Recognition, s.r.o.

Eyedentify MMR SDK. Technical sheet. Version Eyedea Recognition, s.r.o. Eyedentify MMR SDK Technical sheet Version 2.3.1 010001010111100101100101011001000110010101100001001000000 101001001100101011000110110111101100111011011100110100101 110100011010010110111101101110010001010111100101100101011

More information

Semantic Segmentation on Resource Constrained Devices

Semantic Segmentation on Resource Constrained Devices Semantic Segmentation on Resource Constrained Devices Sachin Mehta University of Washington, Seattle In collaboration with Mohammad Rastegari, Anat Caspi, Linda Shapiro, and Hannaneh Hajishirzi Project

More information

GPU ACCELERATED DEEP LEARNING WITH CUDNN

GPU ACCELERATED DEEP LEARNING WITH CUDNN GPU ACCELERATED DEEP LEARNING WITH CUDNN Larry Brown Ph.D. March 2015 AGENDA 1 Introducing cudnn and GPUs 2 Deep Learning Context 3 cudnn V2 4 Using cudnn 2 Introducing cudnn and GPUs 3 HOW GPU ACCELERATION

More information

Recent Advances in Simulation Techniques and Tools

Recent Advances in Simulation Techniques and Tools Recent Advances in Simulation Techniques and Tools Yuyang Li, li.yuyang(at)wustl.edu (A paper written under the guidance of Prof. Raj Jain) Download Abstract: Simulation refers to using specified kind

More information

Document downloaded from:

Document downloaded from: Document downloaded from: http://hdl.handle.net/1251/64738 This paper must be cited as: Reaño González, C.; Pérez López, F.; Silla Jiménez, F. (215). On the design of a demo for exhibiting rcuda. 15th

More information

arxiv: v3 [cs.cv] 18 Dec 2018

arxiv: v3 [cs.cv] 18 Dec 2018 Video Colorization using CNNs and Keyframes extraction: An application in saving bandwidth Ankur Singh 1 Anurag Chanani 2 Harish Karnick 3 arxiv:1812.03858v3 [cs.cv] 18 Dec 2018 Abstract In this paper,

More information

Transformation to Artificial Intelligence with MATLAB Roy Lurie, PhD Vice President of Engineering MATLAB Products

Transformation to Artificial Intelligence with MATLAB Roy Lurie, PhD Vice President of Engineering MATLAB Products Transformation to Artificial Intelligence with MATLAB Roy Lurie, PhD Vice President of Engineering MATLAB Products 2018 The MathWorks, Inc. 1 A brief history of the automobile First Commercial Gas Car

More information

Creating Intelligence at the Edge

Creating Intelligence at the Edge Creating Intelligence at the Edge Vladimir Stojanović E3S Retreat September 8, 2017 The growing importance of machine learning Page 2 Applications exploding in the cloud Huge interest to move to the edge

More information

Supporting x86-64 Address Translation for 100s of GPU Lanes. Jason Power, Mark D. Hill, David A. Wood

Supporting x86-64 Address Translation for 100s of GPU Lanes. Jason Power, Mark D. Hill, David A. Wood Supporting x86-64 Address Translation for 100s of GPU s Jason Power, Mark D. Hill, David A. Wood Summary Challenges: CPU&GPUs physically integrated, but logically separate; This reduces theoretical bandwidth,

More information

Application of Maxwell Equations to Human Body Modelling

Application of Maxwell Equations to Human Body Modelling Application of Maxwell Equations to Human Body Modelling Fumie Costen Room E, E0c at Sackville Street Building, fc@cs.man.ac.uk The University of Manchester, U.K. February 5, 0 Fumie Costen Room E, E0c

More information

Research on Hand Gesture Recognition Using Convolutional Neural Network

Research on Hand Gesture Recognition Using Convolutional Neural Network Research on Hand Gesture Recognition Using Convolutional Neural Network Tian Zhaoyang a, Cheng Lee Lung b a Department of Electronic Engineering, City University of Hong Kong, Hong Kong, China E-mail address:

More information

IMPLEMENTATION OF SOFTWARE-BASED 2X2 MIMO LTE BASE STATION SYSTEM USING GPU

IMPLEMENTATION OF SOFTWARE-BASED 2X2 MIMO LTE BASE STATION SYSTEM USING GPU IMPLEMENTATION OF SOFTWARE-BASED 2X2 MIMO LTE BASE STATION SYSTEM USING GPU Seunghak Lee (HY-SDR Research Center, Hanyang Univ., Seoul, South Korea; invincible@dsplab.hanyang.ac.kr); Chiyoung Ahn (HY-SDR

More information

DASH: Deadline-Aware High-Performance Memory Scheduler for Heterogeneous Systems with Hardware Accelerators

DASH: Deadline-Aware High-Performance Memory Scheduler for Heterogeneous Systems with Hardware Accelerators DASH: Deadline-Aware High-Performance Memory Scheduler for Heterogeneous Systems with Hardware Accelerators Hiroyuki Usui, Lavanya Subramanian Kevin Chang, Onur Mutlu DASH source code is available at GitHub

More information

EM Simulation of Automotive Radar Mounted in Vehicle Bumper

EM Simulation of Automotive Radar Mounted in Vehicle Bumper EM Simulation of Automotive Radar Mounted in Vehicle Bumper Abstract Trends in automotive safety are pushing radar systems to higher levels of accuracy and reliable target identification for blind spot

More information

What You ll Learn Today

What You ll Learn Today CS101 Lecture 18: Image Compression Aaron Stevens 21 October 2010 Some material form Wikimedia Commons Special thanks to John Magee and his dog 1 What You ll Learn Today Review: how big are image files?

More information

INSTITUTE OF AERONAUTICAL ENGINEERING Dundigal, Hyderabad

INSTITUTE OF AERONAUTICAL ENGINEERING Dundigal, Hyderabad INSTITUTE OF AERONAUTICAL ENGINEERING Dundigal, Hyderabad - 500 043 ELECTRONICS AND COMMUNICATION ENGINEERING QUESTION BANK Course Title Course Code Class Branch DIGITAL IMAGE PROCESSING A70436 IV B. Tech.

More information

escience: Pulsar searching on GPUs

escience: Pulsar searching on GPUs escience: Pulsar searching on GPUs Alessio Sclocco Ana Lucia Varbanescu Karel van der Veldt John Romein Joeri van Leeuwen Jason Hessels Rob van Nieuwpoort And many others! Netherlands escience center Science

More information

PARALLEL ALGORITHMS FOR HISTOGRAM-BASED IMAGE REGISTRATION. Benjamin Guthier, Stephan Kopf, Matthias Wichtlhuber, Wolfgang Effelsberg

PARALLEL ALGORITHMS FOR HISTOGRAM-BASED IMAGE REGISTRATION. Benjamin Guthier, Stephan Kopf, Matthias Wichtlhuber, Wolfgang Effelsberg This is a preliminary version of an article published by Benjamin Guthier, Stephan Kopf, Matthias Wichtlhuber, and Wolfgang Effelsberg. Parallel algorithms for histogram-based image registration. Proc.

More information

MACHINE LEARNING Games and Beyond. Calvin Lin, NVIDIA

MACHINE LEARNING Games and Beyond. Calvin Lin, NVIDIA MACHINE LEARNING Games and Beyond Calvin Lin, NVIDIA THE MACHINE LEARNING ERA IS HERE And it is transforming every industry... including Game Development OVERVIEW NVIDIA Volta: An Architecture for Machine

More information

Massively Parallel Signal Processing for Wireless Communication Systems

Massively Parallel Signal Processing for Wireless Communication Systems Massively Parallel Signal Processing for Wireless Communication Systems Michael Wu, Guohui Wang, Joseph R. Cavallaro Department of ECE, Rice University Wireless Communication Systems Internet Information

More information

Sno Projects List IEEE. High - Throughput Finite Field Multipliers Using Redundant Basis For FPGA And ASIC Implementations

Sno Projects List IEEE. High - Throughput Finite Field Multipliers Using Redundant Basis For FPGA And ASIC Implementations Sno Projects List IEEE 1 High - Throughput Finite Field Multipliers Using Redundant Basis For FPGA And ASIC Implementations 2 A Generalized Algorithm And Reconfigurable Architecture For Efficient And Scalable

More information

ROAD RECOGNITION USING FULLY CONVOLUTIONAL NEURAL NETWORKS

ROAD RECOGNITION USING FULLY CONVOLUTIONAL NEURAL NETWORKS Bulletin of the Transilvania University of Braşov Vol. 10 (59) No. 2-2017 Series I: Engineering Sciences ROAD RECOGNITION USING FULLY CONVOLUTIONAL NEURAL NETWORKS E. HORVÁTH 1 C. POZNA 2 Á. BALLAGI 3

More information

SCAI SuperComputing Application & Innovation. Sanzio Bassini October 2017

SCAI SuperComputing Application & Innovation. Sanzio Bassini October 2017 SCAI SuperComputing Application & Innovation Sanzio Bassini October 2017 The Consortium Private non for Profit Organization Founded in 1969 by Ministry of Public Education now under the control of Ministry

More information

Neat Image. User guide. standalone application (Mac) To make images look better. Document version 8.3, 27-September-2017

Neat Image. User guide. standalone application (Mac) To make images look better. Document version 8.3, 27-September-2017 Neat Image standalone application (Mac) To make images look better. User guide Document version 8.3, 27-September-2017 Neat Image 1999-2017 Neat Image team, ABSoft. All rights reserved. Table of contents

More information

Synthetic Aperture Beamformation using the GPU

Synthetic Aperture Beamformation using the GPU Paper presented at the IEEE International Ultrasonics Symposium, Orlando, Florida, 211: Synthetic Aperture Beamformation using the GPU Jens Munk Hansen, Dana Schaa and Jørgen Arendt Jensen Center for Fast

More information

Anna University, Chennai B.E./B.TECH DEGREE EXAMINATION, MAY/JUNE 2013 Seventh Semester

Anna University, Chennai B.E./B.TECH DEGREE EXAMINATION, MAY/JUNE 2013 Seventh Semester www.vidyarthiplus.com Anna University, Chennai B.E./B.TECH DEGREE EXAMINATION, MAY/JUNE 2013 Seventh Semester Electronics and Communication Engineering EC 2029 / EC 708 DIGITAL IMAGE PROCESSING (Regulation

More information

CUDA 를활용한실시간 IMAGE PROCESSING SYSTEM 구현. Chang Hee Lee

CUDA 를활용한실시간 IMAGE PROCESSING SYSTEM 구현. Chang Hee Lee 1 CUDA 를활용한실시간 IMAGE PROCESSING SYSTEM 구현 Chang Hee Lee Overview Thin film transistor(tft) LCD : Inspection Object Type of Defect Type of Inspection Instrument Brief Lighting / Focusing Optic Magnification

More information

HPC + AI. Mike Houston

HPC + AI. Mike Houston HPC + AI Mike Houston PRACTICAL DEEP LEARNING EXAMPLES Image Classification, Object Detection, Localization, Action Recognition, Scene Understanding Speech Recognition, Speech Translation, Natural Language

More information

New Paradigm in Testing Heads & Media for HDD. Dr. Lutz Henckels September 2010

New Paradigm in Testing Heads & Media for HDD. Dr. Lutz Henckels September 2010 New Paradigm in Testing Heads & Media for HDD Dr. Lutz Henckels September 2010 1 WOW an amazing industry 40%+ per year aerial density growth Source: Coughlin Associates 2010 2 WOW an amazing industry Aerial

More information

Investigating the Post Processing of LS-DYNA in a Fully Immersive Workflow Environment

Investigating the Post Processing of LS-DYNA in a Fully Immersive Workflow Environment Investigating the Post Processing of LS-DYNA in a Fully Immersive Workflow Environment Ed Helwig 1, Facundo Del Pin 2 1 Livermore Software Technology Corporation, Livermore CA 2 Livermore Software Technology

More information

Modeling the multi-conjugate adaptive optics system of the E-ELT. Laura Schreiber Carmelo Arcidiacono Giovanni Bregoli

Modeling the multi-conjugate adaptive optics system of the E-ELT. Laura Schreiber Carmelo Arcidiacono Giovanni Bregoli Modeling the multi-conjugate adaptive optics system of the E-ELT Laura Schreiber Carmelo Arcidiacono Giovanni Bregoli MAORY E-ELT Multi Conjugate Adaptive Optics Relay Wavefront sensing based on 6 (4)

More information

S4695 A Real-Time Defocus Deblurring Method for Semiconductor Manufacturing

S4695 A Real-Time Defocus Deblurring Method for Semiconductor Manufacturing S4695 A Real-Time Defocus Deblurring Method for Semiconductor Manufacturing T. Sakuyama*, Y. Hikida*, H. Sano*, K. Taniguchi* T. Funatomi**, M. Iiyama**, M. Minoh** Dainippon Screen Mfg. Co., Ltd.* Kyoto

More information

Dr Myat Su Hlaing Asia Research Center, Yangon University, Myanmar. Data programming model for an operation based parallel image processing system

Dr Myat Su Hlaing Asia Research Center, Yangon University, Myanmar. Data programming model for an operation based parallel image processing system Name: Affiliation: Field of research: Specific Field of Study: Proposed Research Topic: Dr Myat Su Hlaing Asia Research Center, Yangon University, Myanmar Information Science and Technology Computer Science

More information

Decoding Brainwave Data using Regression

Decoding Brainwave Data using Regression Decoding Brainwave Data using Regression Justin Kilmarx: The University of Tennessee, Knoxville David Saffo: Loyola University Chicago Lucien Ng: The Chinese University of Hong Kong Mentor: Dr. Xiaopeng

More information

USING MULTIPROCESSOR SYSTEMS FOR MULTISPECTRAL DATA PROCESSING

USING MULTIPROCESSOR SYSTEMS FOR MULTISPECTRAL DATA PROCESSING U.P.B. Sci. Bull., Series C, Vol. 74, Iss. 4, 2012 ISSN 1454-234x USING MULTIPROCESSOR SYSTEMS FOR MULTISPECTRAL DATA PROCESSING Iulian NIŢĂ 1, Olga ALDEA 2 Procesarea datelor satelitare mulispectrale

More information

Warp-Aware Trace Scheduling for GPUS. James Jablin (Brown) Thomas Jablin (UIUC) Onur Mutlu (CMU) Maurice Herlihy (Brown)

Warp-Aware Trace Scheduling for GPUS. James Jablin (Brown) Thomas Jablin (UIUC) Onur Mutlu (CMU) Maurice Herlihy (Brown) Warp-Aware Trace Scheduling for GPUS James Jablin (Brown) Thomas Jablin (UIUC) Onur Mutlu (CMU) Maurice Herlihy (Brown) Historical Trends in GFLOPS: CPUs vs. GPUs Theoretical GFLOP/s 3250 3000 2750 2500

More information

Plane-dependent Error Diffusion on a GPU

Plane-dependent Error Diffusion on a GPU Plane-dependent Error Diffusion on a GPU Yao Zhang a, John Ludd Recker b, Robert Ulichney c, Ingeborg Tastl b, John D. Owens a a University of California, Davis, One Shields Avenue, Davis, CA, USA; b Hewlett-Packard

More information

Developing a GPU Processing Framework for Accelerating Remote Sensing Algorithms

Developing a GPU Processing Framework for Accelerating Remote Sensing Algorithms 19 October 2010 Research and Industrial Collaboration Conference Research to Reality Northeastern University, Boston, MA Developing a GPU Processing Framework for Accelerating Remote Sensing Algorithms

More information

Neat Image. User guide. plug-in for Photoshop (Mac) To make images look better. Document version 8.3, 27-September-2017

Neat Image. User guide. plug-in for Photoshop (Mac) To make images look better. Document version 8.3, 27-September-2017 Neat Image plug-in for Photoshop (Mac) To make images look better. User guide Document version 8.3, 27-September-2017 Neat Image 1999-2018 Neat Image team, ABSoft. All rights reserved. Table of contents

More information

Analysis of Image Compression Algorithm: GUETZLI

Analysis of Image Compression Algorithm: GUETZLI Analysis of Image Compression Algorithm: GUETZLI Lingyi Li August 18, 2017 Abstract How to balance picture size and quality is the core of image compression. This paper evaluates Google's jpeg image compression

More information

Deep Learning for Human Activity Recognition: A Resource Efficient Implementation on Low-Power Devices

Deep Learning for Human Activity Recognition: A Resource Efficient Implementation on Low-Power Devices Deep Learning for Human Activity Recognition: A Resource Efficient Implementation on Low-Power Devices Daniele Ravì, Charence Wong, Benny Lo and Guang-Zhong Yang To appear in the proceedings of the IEEE

More information

PRACTICAL IMAGE AND VIDEO PROCESSING USING MATLAB

PRACTICAL IMAGE AND VIDEO PROCESSING USING MATLAB PRACTICAL IMAGE AND VIDEO PROCESSING USING MATLAB OGE MARQUES Florida Atlantic University *IEEE IEEE PRESS WWILEY A JOHN WILEY & SONS, INC., PUBLICATION CONTENTS LIST OF FIGURES LIST OF TABLES FOREWORD

More information

Neat Image standalone application (Win)

Neat Image standalone application (Win) Neat Image standalone application (Win) To make images look better. User guide Document version 7.6, September 26, 2014 1999-2014 by Neat Image team, ABSoft. All rights reserved. Table of contents 1. Introduction...3

More information

Counterfeit Bill Detection Algorithm using Deep Learning

Counterfeit Bill Detection Algorithm using Deep Learning Counterfeit Bill Detection Algorithm using Deep Learning Soo-Hyeon Lee 1 and Hae-Yeoun Lee 2,* 1 Undergraduate Student, 2 Professor 1,2 Department of Computer Software Engineering, Kumoh National Institute

More information

CS221 Project Final Report Deep Q-Learning on Arcade Game Assault

CS221 Project Final Report Deep Q-Learning on Arcade Game Assault CS221 Project Final Report Deep Q-Learning on Arcade Game Assault Fabian Chan (fabianc), Xueyuan Mei (xmei9), You Guan (you17) Joint-project with CS229 1 Introduction Atari 2600 Assault is a game environment

More information

Development of Image Processing Tools for Analysis of Laser Deposition Experiments

Development of Image Processing Tools for Analysis of Laser Deposition Experiments Development of Image Processing Tools for Analysis of Laser Deposition Experiments Todd Sparks Department of Mechanical and Aerospace Engineering University of Missouri, Rolla Abstract Microscopical metallography

More information

Camera Model Identification With The Use of Deep Convolutional Neural Networks

Camera Model Identification With The Use of Deep Convolutional Neural Networks Camera Model Identification With The Use of Deep Convolutional Neural Networks Amel TUAMA 2,3, Frédéric COMBY 2,3, and Marc CHAUMONT 1,2,3 (1) University of Nîmes, France (2) University Montpellier, France

More information

GPU Acceleration of the HEVC Decoder Inter Prediction Module

GPU Acceleration of the HEVC Decoder Inter Prediction Module GPU Acceleration of the HEVC Decoder Inter Prediction Module Diego F. de Souza, Aleksandar Ilic, Nuno Roma and Leonel Sousa INESC-ID, IST, Universidade de Lisboa Rua Alves Redol 9, 000-09, Lisbon, Portugal

More information

Learning to Predict Indoor Illumination from a Single Image. Chih-Hui Ho

Learning to Predict Indoor Illumination from a Single Image. Chih-Hui Ho Learning to Predict Indoor Illumination from a Single Image Chih-Hui Ho 1 Outline Introduction Method Overview LDR Panorama Light Source Detection Panorama Recentering Warp Learning From LDR Panoramas

More information

Overview of Digital Mobile Communications

Overview of Digital Mobile Communications Overview of Digital Mobile Communications Dong In Kim (dikim@ece.skku.ac.kr) Wireless Communications Lab 1 Outline Digital Communications Multiple Access Techniques Power Control for CDMA IMT-2000 System

More information

Image Manipulation Detection using Convolutional Neural Network

Image Manipulation Detection using Convolutional Neural Network Image Manipulation Detection using Convolutional Neural Network Dong-Hyun Kim 1 and Hae-Yeoun Lee 2,* 1 Graduate Student, 2 PhD, Professor 1,2 Department of Computer Software Engineering, Kumoh National

More information

Revolutionize the Service Industries with AI 2016 Service Robot

Revolutionize the Service Industries with AI 2016 Service Robot Revolutionize the Service Industries with AI 2016 Service Robot Clever-m 632 Robot Intelligence Laboratory Jonathan.Xu Standing Vice Director Outline 1 Industry Trends 2 States of Service Robot 3 Powered

More information

TOOLS AND PROCESSORS FOR COMPUTER VISION. Selected Results from the Embedded Vision Alliance s Spring 2017 Computer Vision Developer Survey

TOOLS AND PROCESSORS FOR COMPUTER VISION. Selected Results from the Embedded Vision Alliance s Spring 2017 Computer Vision Developer Survey TOOLS AND PROCESSORS FOR COMPUTER VISION Selected Results from the Embedded Vision Alliance s Spring 2017 Computer Vision Developer Survey 1 EXECUTIVE SUMMARY Since 2015, the Embedded Vision Alliance has

More information

Haptic Data Transmission based on the Prediction and Compression

Haptic Data Transmission based on the Prediction and Compression Haptic Data Transmission based on the Prediction and Compression 375 19 X Haptic Data Transmission based on the Prediction and Compression Yonghee You and Mee Young Sung Department of Computer Science

More information

An evaluation of debayering algorithms on GPU for real-time panoramic video recording

An evaluation of debayering algorithms on GPU for real-time panoramic video recording An evaluation of debayering algorithms on GPU for real-time panoramic video recording Ragnar Langseth, Vamsidhar Reddy Gaddam, Håkon Kvale Stensland, Carsten Griwodz, Pål Halvorsen University of Oslo /

More information

A BLAS-based Algorithm for Finding Position Weight Matrix Occurrences in DNA sequences on CPUs and GPUs

A BLAS-based Algorithm for Finding Position Weight Matrix Occurrences in DNA sequences on CPUs and GPUs A BLAS-based Algorithm for Finding Position Weight Matrix Occurrences in DNA sequences on CPUs and GPUs Jan Fostier IDLab, Department of Information Technology, Ghent University - imec, Ghent, Belgium

More information

On Building a Programmable Wireless High-Quality Virtual Reality System Using Commodity Hardware

On Building a Programmable Wireless High-Quality Virtual Reality System Using Commodity Hardware On Building a Programmable Wireless High-Quality Virtual Reality System Using Commodity Hardware Ruiguang Zhong, Manni Wang, Zijian Chen, Luyang Liu, Yunxin Liu, Jiansong Zhang, Lintao Zhang, Thomas Moscibroda

More information

A Polyphase Filter for GPUs and Multi-Core Processors

A Polyphase Filter for GPUs and Multi-Core Processors A Polyphase Filter for GPUs and Multi-Core Processors Karel van der Veldt Universiteit van Amsterdam The Netherlands karel.vd.veldt@uva.nl Ana Lucia Varbanescu Technische Universiteit Delft The Netherlands

More information

KIP 2300 HIGH PRODUCTION CCD SCAN SYSTEM

KIP 2300 HIGH PRODUCTION CCD SCAN SYSTEM KIP 2300 HIGH PRODUCTION CCD SCAN SYSTEM KIP 2300 CCD SCANNING SYSTEM High Production Scan System The new KIP 2300 high productivity scanner sets a uniquely high standard for speed, quality and fl exibility

More information

Global Contrast Enhancement Detection via Deep Multi-Path Network

Global Contrast Enhancement Detection via Deep Multi-Path Network Global Contrast Enhancement Detection via Deep Multi-Path Network Cong Zhang, Dawei Du, Lipeng Ke, Honggang Qi School of Computer and Control Engineering University of Chinese Academy of Sciences, Beijing,

More information

A picture is worth a thousand words

A picture is worth a thousand words Images Images Images include graphics, such as backgrounds, color schemes and navigation bars, and photos and other illustrations An essential part of a multimedia product, is present in every multimedia

More information

Module 6 STILL IMAGE COMPRESSION STANDARDS

Module 6 STILL IMAGE COMPRESSION STANDARDS Module 6 STILL IMAGE COMPRESSION STANDARDS Lesson 16 Still Image Compression Standards: JBIG and JPEG Instructional Objectives At the end of this lesson, the students should be able to: 1. Explain the

More information

2010 HSC Software Design and Development Marking Guidelines

2010 HSC Software Design and Development Marking Guidelines 00 HSC Software Design and Development Marking Guidelines Section I Question Answer A A A 4 D 5 C 6 B 7 B 8 D 9 D 0 C D B B 4 D 5 A 6 B 7 C 8 D 9 C 0 C 00 HSC Software Design and Development Marking Guidelines

More information

Lecture 6: Electronics Beyond the Logic Switches Xufeng Kou School of Information Science and Technology ShanghaiTech University

Lecture 6: Electronics Beyond the Logic Switches Xufeng Kou School of Information Science and Technology ShanghaiTech University Lecture 6: Electronics Beyond the Logic Switches Xufeng Kou School of Information Science and Technology ShanghaiTech University EE 224 Solid State Electronics II Lecture 3: Lattice and symmetry 1 Outline

More information

Challenges in Transition

Challenges in Transition Challenges in Transition Keynote talk at International Workshop on Software Engineering Methods for Parallel and High Performance Applications (SEM4HPC 2016) 1 Kazuaki Ishizaki IBM Research Tokyo kiszk@acm.org

More information

Implementation of Barcode Localization Technique using Morphological Operations

Implementation of Barcode Localization Technique using Morphological Operations Implementation of Barcode Localization Technique using Morphological Operations Savreet Kaur Student, Master of Technology, Department of Computer Engineering, ABSTRACT Barcode Localization is an extremely

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning Deep Learning Barnabás Póczos Credits Many of the pictures, results, and other materials are taken from: Ruslan Salakhutdinov Joshua Bengio Geoffrey Hinton Yann LeCun 2

More information

HIGH PERFORMANCE COMPUTING USING GPGPU FOR RADAR APPLICATIONS

HIGH PERFORMANCE COMPUTING USING GPGPU FOR RADAR APPLICATIONS HIGH PERFORMANCE COMPUTING USING GPGPU FOR RADAR APPLICATIONS Viswam Gampala 1 (visgam@yahoo.co.in), Akshay BM 1, A Vengadarajan 1, PS Avadhani 2 1. Electronics & Radar Development Establishment, DRDO,

More information

PV-PPV: Parameter Variability Aware, Automatically Extracted, Nonlinear Time-Shifted Oscillator Macromodels

PV-PPV: Parameter Variability Aware, Automatically Extracted, Nonlinear Time-Shifted Oscillator Macromodels PV-PPV: Parameter Variability Aware, Automatically Extracted, Nonlinear Time-Shifted Oscillator Macromodels Zhichun Wang, Xiaolue Lai and Jaijeet Roychowdhury Dept of ECE, University of Minnesota, Twin

More information

arxiv: v1 [astro-ph.im] 1 Sep 2015

arxiv: v1 [astro-ph.im] 1 Sep 2015 Experimental Astronomy manuscript No. (will be inserted by the editor) A Real-time Coherent Dedispersion Pipeline for the Giant Metrewave Radio Telescope Kishalay De Yashwant Gupta arxiv:1509.00186v1 [astro-ph.im]

More information

Automatic Kernel Code Generation for Focal-plane Sensor-Processor Devices

Automatic Kernel Code Generation for Focal-plane Sensor-Processor Devices Automatic Kernel Code Generation for Focal-plane Sensor-Processor Devices Thomas Debrunner - MSc Student Imperial College London Paul Kelly - Software Performance Optimisation Group Lead, Imperial College

More information

Perspective platforms for BOINC distributed computing network

Perspective platforms for BOINC distributed computing network Perspective platforms for BOINC distributed computing network Vitalii Koshura Lohika Odessa, Ukraine lestat.de.lionkur@gmail.com Profile page: https://www.linkedin.com/in/aenbleidd/ Abstract This paper

More information

Manga Studio 5 The Standard in Manga & Comic Illustration!

Manga Studio 5 The Standard in Manga & Comic Illustration! Manga Studio 5 The Standard in Manga & Comic Illustration! Manga Studio 5, is the world s leading all-in-one comic and manga creation software. Manga allows users to quickly and easily create manga and

More information

Hybrid Coding (JPEG) Image Color Transform Preparation

Hybrid Coding (JPEG) Image Color Transform Preparation Hybrid Coding (JPEG) 5/31/2007 Kompressionsverfahren: JPEG 1 Image Color Transform Preparation Example 4: 2: 2 YUV, 4: 1: 1 YUV, and YUV9 Coding Luminance (Y): brightness sampling frequency 13.5 MHz Chrominance

More information

Neural Networks The New Moore s Law

Neural Networks The New Moore s Law Neural Networks The New Moore s Law Chris Rowen, PhD, FIEEE CEO Cognite Ventures December 216 Outline Moore s Law Revisited: Efficiency Drives Productivity Embedded Neural Network Product Segments Efficiency

More information

TUBERCULIN SKIN TEST CHECKER USING DIGITAL IMAGE PROCESSING. John Marnel M. San Pedro and Davood Pour Yousefian Barfeh ABSTRACT

TUBERCULIN SKIN TEST CHECKER USING DIGITAL IMAGE PROCESSING. John Marnel M. San Pedro and Davood Pour Yousefian Barfeh ABSTRACT TUBERCULIN SKIN TEST CHECKER USING DIGITAL IMAGE PROCESSING John Marnel M. San Pedro and Davood Pour Yousefian Barfeh ABSTRACT The wheal, produced by tuberculin skin tests, was identified by nurses through

More information

Airborne radar clutter simulation using GPU (CUDA)

Airborne radar clutter simulation using GPU (CUDA) Airborne radar clutter simulation using GPU (CUDA) 1 Priyanka A P, 2 Mr.Channabasappa Baligar 1 Department of VLSI and Embedded Systems, UTL technologies Ltd, Bangalore, India 2 Department of VLSI and

More information

GPU Computing for Cognitive Robotics

GPU Computing for Cognitive Robotics GPU Computing for Cognitive Robotics Martin Peniak, Davide Marocco, Angelo Cangelosi GPU Technology Conference, San Jose, California, 25 March, 2014 Acknowledgements This study was financed by: EU Integrating

More information

Research Article ETSI-Standard Reconfigurable Mobile Device for Supporting the Licensed Shared Access

Research Article ETSI-Standard Reconfigurable Mobile Device for Supporting the Licensed Shared Access Mobile Information Systems Volume 2016, Article ID 8035876, 11 pages http://dx.doi.org/10.1155/2016/8035876 Research Article ETSI-Standard Reconfigurable Mobile Device for Supporting the Licensed Shared

More information

VISUAL CRYPTOGRAPHY for COLOR IMAGES USING ERROR DIFFUSION AND PIXEL SYNCHRONIZATION

VISUAL CRYPTOGRAPHY for COLOR IMAGES USING ERROR DIFFUSION AND PIXEL SYNCHRONIZATION VISUAL CRYPTOGRAPHY for COLOR IMAGES USING ERROR DIFFUSION AND PIXEL SYNCHRONIZATION Pankaja Patil Department of Computer Science and Engineering Gogte Institute of Technology, Belgaum, Karnataka Bharati

More information

Suitable firmware can be found on Anritsu's web site under the instrument library listings.

Suitable firmware can be found on Anritsu's web site under the instrument library listings. General Caution Please use a USB Memory Stick for firmware updates. Suitable firmware can be found on Anritsu's web site under the instrument library listings. If your existing firmware is older than v1.19,

More information

Voxengo Correlometer User Guide

Voxengo Correlometer User Guide Version 1.0 https://www.voxengo.com/product/correlometer/ Contents Introduction 3 Features 3 Compatibility 3 User Interface Elements 4 Parameters 4 What is Correlation? 4 Credits 6 Copyright 2019 Aleksey

More information

Accelerating Stochastic Random Projection Neural Networks

Accelerating Stochastic Random Projection Neural Networks Rochester Institute of Technology RIT Scholar Works Theses Thesis/Dissertation Collections 12-2017 Accelerating Stochastic Random Projection Neural Networks Swathika Ramakrishnan sxr1661@rit.edu Follow

More information