
Transcription:

1

2

IHV means Independent Hardware Vendor; an example is Qualcomm Technologies Inc., which makes Snapdragon processors. OEM means Original Equipment Manufacturer; examples are smartphone manufacturers. Tuning applications to specific hardware is not unique to mobile GPUs; even for discrete GPUs, architecture-specific tuning is often required. 3

In the smartphone market, camera applications were the first to adopt GPU Compute (a.k.a. GPGPU), and for many years they remained the dominant use case for mobile GPU Compute. Primary camera use cases have been noise reduction, such as spatial or temporal denoising and radial noise reduction, as well as chromatic aberration correction and lens shading correction. Image stabilization for camera and drone applications is one of the most widely used non-rendering GPU applications; OpenGL ES is the main API in that case, and stabilization may also include rolling shutter effect removal. Video post-processing is another key use case, where the algorithm tries to remove artifacts or enhance details after scaling, as well as improve color for a better user experience. Many smartphone OEMs provide their own applications in order to offer differentiating image quality enhancement features. Recently, with the popularity of VR, many 360 cameras are gaining traction. These cameras either apply a stitching algorithm at runtime or use post-processing on a smartphone or PC after capture; in some cases, stitching is combined with dewarping during playback. HDR means High Dynamic Range. This is an overloaded term; in this context, it refers to enhancing dynamic range during video capture for security camera products. Security cameras cannot control the lighting conditions, but need to record objects that are in shadow clearly, for obvious reasons. 4
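As a concrete illustration of one of these stages, lens shading correction can be sketched on the CPU as a radial gain that compensates vignetting. The quadratic gain profile and the `strength` parameter below are hypothetical; a real ISP would use a calibrated per-channel gain mesh:

```python
import numpy as np

def lens_shading_correct(img, strength=0.5):
    """Boost pixels by a radial gain to compensate lens vignetting.

    strength is a hypothetical tuning parameter; real pipelines use a
    calibrated gain mesh rather than this analytic falloff model.
    """
    h, w = img.shape[:2]
    y, x = np.mgrid[0:h, 0:w].astype(np.float64)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r = np.hypot(y - cy, x - cx) / np.hypot(cy, cx)  # 0 at center, 1 at corners
    gain = 1.0 + strength * r ** 2                   # quadratic falloff model
    return np.clip(img * gain, 0.0, 1.0)

# A flat gray frame: corners receive a larger boost than the center.
frame = np.full((240, 320), 0.5)
out = lens_shading_correct(frame)
```

Each output pixel depends only on its own input pixel and coordinates, which is what makes this kind of stage map so naturally onto a GPU compute kernel.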

5

6

7

Typically, raw data from the camera sensor is streamed directly to an ISP (Image Signal Processor), a hardware unit that performs a series of image processing and color conversion steps to produce visually appealing images. For new sensors that support HDR features, the raw data needs to be preprocessed before being streamed to the ISP, in order to combine the long- and short-exposure frames. There are many variations of sensor HDR features, depending on the sensor manufacturer, and manufacturers keep advancing this technology to yield better solutions that reduce motion artifacts, among other improvements. In future ISP hardware, HDR processing could become built-in, which would remove the need for the additional software stage. 8
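A minimal sketch of that preprocessing step, assuming a simple two-exposure scheme; the exposure `ratio` and the saturation `knee` are illustrative values, not taken from any particular sensor, and real sensor HDR modes vary considerably by manufacturer:

```python
import numpy as np

def merge_exposures(long_exp, short_exp, ratio=8.0, knee=0.9):
    """Combine long- and short-exposure frames into one HDR frame.

    ratio is the exposure ratio (long / short); knee is the level above
    which the long exposure is treated as clipped. Both are assumptions.
    """
    # Bring the short exposure to the long exposure's scale.
    short_scaled = short_exp * ratio
    # Blend weight: trust the long exposure except near saturation.
    w = np.clip((long_exp - knee) / (1.0 - knee), 0.0, 1.0)
    return (1.0 - w) * long_exp + w * short_scaled

long_f = np.array([0.2, 0.5, 1.0])    # last pixel is clipped in the long frame
short_f = np.array([0.025, 0.0625, 0.3])
hdr = merge_exposures(long_f, short_f)
```

Well-exposed pixels pass through from the long frame, while the clipped pixel is replaced by the rescaled short-exposure value, extending the dynamic range.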

Some of these techniques are standard optimization techniques that benefit many GPU Compute use cases. 9

10

UHD is 3840 × 2160, a.k.a. 4K. Running at nominal clock simply means running in the default mode, which allows the SoC to dynamically adjust the clock frequency and voltage of the hardware modules in order to yield the best performance-per-watt outcome. Typically, this means running the clock at much lower than peak level. 11

The purpose of this page is to show that, for each stage, the developer needs to identify the data packing requirements for input and output. 12
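For example, a common packing choice is to store four 8-bit channels in one 32-bit word so that each load fetches a whole pixel. A round-trip sketch on the host side; the little-endian RGBA byte order here is just an assumption, and a real kernel would pick the layout that matches the sensor or ISP output:

```python
import struct

def pack_rgba(r, g, b, a):
    """Pack four 8-bit channels into one little-endian 32-bit word."""
    return struct.unpack("<I", bytes([r, g, b, a]))[0]

def unpack_rgba(word):
    """Recover the four 8-bit channels from the packed word."""
    return tuple(struct.pack("<I", word))

word = pack_rgba(16, 32, 64, 255)
channels = unpack_rgba(word)
```

If adjacent stages agree on this layout, the intermediate data can be passed along without any repacking pass in between.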

Shown here is the most natural and easy way of combining kernels: by paying attention to data grouping requirements. This allows kernels to be combined without requiring additional register usage, meaning the register usage of the combined kernel will not exceed that of either original kernel. 13
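The idea can be illustrated on the CPU with two hypothetical stages, a gain stage and a gamma stage: they can either write an intermediate buffer between passes or be fused into one pass over the data, producing identical results without the intermediate storage:

```python
import numpy as np

def gain_stage(x):
    return x * 2.0          # first kernel: apply a fixed gain

def gamma_stage(x):
    return np.sqrt(x)       # second kernel: gamma of 0.5

def two_pass(x):
    tmp = gain_stage(x)     # intermediate buffer written out to memory
    return gamma_stage(tmp)

def fused(x):
    # Combined kernel: both stages applied per element, no intermediate buffer.
    return np.sqrt(x * 2.0)

pixels = np.linspace(0.0, 1.0, 8)
a = two_pass(pixels)
b = fused(pixels)
```

On a GPU, the fused form saves the DDR round trip for the intermediate buffer, which is usually the dominant cost for bandwidth-bound image kernels.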

Barrier synchronization is required to ensure that all processing from the first stage is completed before the next stage starts. This synchronization often comes with a hidden cost: the latency required for all work items working on the first stage to reach the synchronization point. 14
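The role of the barrier can be mimicked with host threads: each worker finishes its slice of stage 1, waits at a barrier, and only then starts stage 2, so no worker reads a stage-1 result that is not yet written. This is a CPU analogy of a workgroup barrier, not OpenCL code:

```python
import threading

N_WORKERS = 4
stage1 = [0] * N_WORKERS
stage2 = [0] * N_WORKERS
barrier = threading.Barrier(N_WORKERS)

def worker(i):
    stage1[i] = i * i          # stage 1: each work item writes its own slot
    barrier.wait()             # every stage-1 write is visible past this point
    # stage 2: each work item reads a neighbor's stage-1 result
    stage2[i] = stage1[(i + 1) % N_WORKERS] + 1

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The hidden cost is visible in this model too: the fastest worker sits idle at `barrier.wait()` until the slowest worker arrives.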

For a 2D workgroup, the shape (e.g., width vs. height) is as important as the total size. The upper limit on the workgroup size of a particular kernel is determined by a number of factors, including register usage (which is related to the complexity of the kernel) and the presence of barrier instructions. local_work_size=NULL is an OpenCL feature; for OpenGL ES, the workgroup size needs to be specified in the compute shader. If the application needs to run on multiple devices, it is important to try different devices as well. 15
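A simple way to explore shapes is to enumerate (width, height) candidates whose product stays under the kernel's reported limit, then time each one on the target device. The sketch below only builds the candidate list; the limit of 128 work items stands in for the value a real query such as CL_KERNEL_WORK_GROUP_SIZE would return, and the power-of-two restriction is a common heuristic, not a requirement:

```python
def candidate_workgroups(max_size=128, max_dim=64):
    """Enumerate power-of-two 2D workgroup shapes worth benchmarking.

    max_size is a hypothetical per-kernel work group limit; on a real
    device it would come from the driver's kernel work-group query.
    """
    shapes = []
    w = 1
    while w <= max_dim:
        h = 1
        while h <= max_dim:
            if w * h <= max_size:
                shapes.append((w, h))
            h *= 2
        w *= 2
    return shapes

shapes = candidate_workgroups()
```

Timing each candidate on every target device, rather than assuming one shape is best everywhere, is what makes this kind of sweep worthwhile.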

Here we compare two solutions: the 1-kernel solution, an uber-kernel that combines all stages into a single kernel using local memory, and the 2-kernel solution, which does not use local memory and requires writing intermediate data to DDR. The latency comparison chart shows the performance of the two solutions, normalized to the 1-kernel case; the measurement includes memory reads and writes and any software overhead for launching GPU kernels. The power comparison chart shows the power consumption measured at the battery, likewise normalized to the 1-kernel case. Together, the charts show that the 1-kernel case is more power efficient but has lower performance than the 2-kernel case, likely because its higher register usage reduces parallelism. 16

17