Convolution Engine: Balancing Efficiency and Flexibility in Specialized Computing

Size: px
Start display at page:

Download "Convolution Engine: Balancing Efficiency and Flexibility in Specialized Computing"

Transcription

1 Convolution Engine: Balancing Efficiency and Flexibility in Specialized Computing Paper by: Wajahat Qadeer Rehan Hameed Ofer Shacham Preethi Venkatesan Christos Kozyrakis Mark Horowitz Presentation by: Justin Selig Patrick Wang Shaurya Luthra

2 Convolution What is it used for Computational photography Image processing Video processing Convolutional Neural Networks Why do we care Cheap imagers on the rise Proliferating technology

3 Convolution Example Kernel: Sum

4 Convolution Example Kernel: Sum

5 Convolution Example Kernel: Sum

6 Convolution Example Kernel: Sum

7 Convolution Example Kernel: Sum

8 Convolution Example Kernel: Sum

9 Convolution Example Kernel: Sum

10 Convolution Example Kernel: Sum

11 Convolution Example This is a form of Map Reduce: Map: Multiply kernel coefficients by elements of matrix. Reduce: Compute single output from multiple operands.

12 Computation Model Basic 1D convolution of image Img with filter f Generalized to a map function Map and reduce function R with convolution size c

13 Computation Model For basic convolution Map Multiplication Reduce Summation Operation kernels define Map and Reduce for different operation types

14 Example Operations Motion Estimation (H.264) Map Absolute Difference Reduce Summation SIFT (Gaussian blur) Map Multiply Reduce Summation Up Sampling (½ pixel) Map Multiply Reduce Summation Demosaic Interpolation Map Multiply Reduce Complex...

15 Convolution What is the problem? Very computation heavy Too much energy consumption on both CPUs and GPUs

16 The Energy Cost of General Purpose Processing Source: Convolution Engine: Balancing Efficiency & Flexibility in Specialized Computing by W Qadeer, R Hameed, O Shacham, P Venkatesan, C Kozyrakis, M Horowitz. Presentation given at Stanford University

17 Flexibility vs Efficiency

18 General Requirements Hundreds of Ops per instruction (ideal 100x performance gains) Minimize data fetch Conflict? Convolution uses intermediate values

19 What about SIMD? Single Instruction Multiple Data extension Can achieve up to 10x better performance And programmable! Limited by register file structure Source: Convolution Engine: Balancing Efficiency & Flexibility in Specialized Computing by W Qadeer, R Hameed, O Shacham, P Venkatesan, C Kozyrakis, M Horowitz. Presentation given at Stanford University

20 What Makes Specialization Better Data structures are optimized for data flow and data locality requirements.

21 Convolution Engine Architecture Shift Registers: 2D Shift Registers Vertical and 2D convolution Vertical shift (shift in row of image) Simultaneous register access 1D Shift registers Data for horizontal convolution flow Coefficient Registers: 2D Registers Stores static data (filter coefficients, static pixel data) ALU/Multipliers: 128 of these allow parallel execution

22 Convolution Engine Architecture Interface Unit (IF): Parallel access to register file to arrange data into blocks accessible by functional units. Functional Units (FU): Fixed-point, two-input ALUs. Supports multiply, absolute difference, and arithmetic. Reduce Units (RU): Programmable degree of reduction for arithmetic and logical stages implemented with a combining tree.

23 Convolution Engine Architecture Map/Reduce Logic: Abstract Convolution to map and reduce step Transforms input to output pixel Map Stage ALU s work with interface units Reduce stage Programmable reduce unit implemented as a combining tree Data Shuffle Stage Flexible swizzle network that allows permutation of data between stages

24 Convolution Engine Architecture Other Hardware: 32 Element SIMD unit Interfaces with 2D output register Only intermediate ops so no data access Vector add and vector subtract ops only Sizing: Amortization of instruction costs hundreds of ops per instruction ops/instr is good enough More ALUs = diminishing returns 128 chosen to keep all ALUs busy Can power off half of ALUs and compute structures Programming: Adds instructions to processor ISA

25 Evaluation Map each target application on a chip multiprocessor 2 CEs Test against: SIMD CMPs (custom heterogeneous chip multiprocessors ) Used Tensilica s Xtensa Modeling Platform Created floorplan to account for interconnection energy

26 Conclusion Convolution Engines (CEs) are a flexible specialized processors that increases energy efficiency while maintaining programmability. CEs take advantage of data-reuse patterns, eliminate data-transfer overheads, and enable a large number of operations per cycle. CEs may support numerous applications based on convolution-like patterns. Compared to single-kernel accelerators, CE remains within 2-3x the energy and area. CEs use 8-15x less energy than SIMD engines.

27 Thanks for Listening! Questions?

28 Quiz: Convolution Use the following kernel to compute an average of the neighboring pixels OF THE TOP 2 SQUARES by convolving the filter over the given matrix. (*Don t use intermediate values for computation) /4 1/4 1/4 1/4 Kernel: Average

REVOLUTIONIZING THE COMPUTING LANDSCAPE AND BEYOND.

REVOLUTIONIZING THE COMPUTING LANDSCAPE AND BEYOND. December 3-6, 2018 Santa Clara Convention Center CA, USA REVOLUTIONIZING THE COMPUTING LANDSCAPE AND BEYOND. https://tmt.knect365.com/risc-v-summit @risc_v ACCELERATING INFERENCING ON THE EDGE WITH RISC-V

More information

Image Filtering. Median Filtering

Image Filtering. Median Filtering Image Filtering Image filtering is used to: Remove noise Sharpen contrast Highlight contours Detect edges Other uses? Image filters can be classified as linear or nonlinear. Linear filters are also know

More information

Creating Intelligence at the Edge

Creating Intelligence at the Edge Creating Intelligence at the Edge Vladimir Stojanović E3S Retreat September 8, 2017 The growing importance of machine learning Page 2 Applications exploding in the cloud Huge interest to move to the edge

More information

PLazeR. a planar laser rangefinder. Robert Ying (ry2242) Derek Xingzhou He (xh2187) Peiqian Li (pl2521) Minh Trang Nguyen (mnn2108)

PLazeR. a planar laser rangefinder. Robert Ying (ry2242) Derek Xingzhou He (xh2187) Peiqian Li (pl2521) Minh Trang Nguyen (mnn2108) PLazeR a planar laser rangefinder Robert Ying (ry2242) Derek Xingzhou He (xh2187) Peiqian Li (pl2521) Minh Trang Nguyen (mnn2108) Overview & Motivation Detecting the distance between a sensor and objects

More information

An Optimized Wallace Tree Multiplier using Parallel Prefix Han-Carlson Adder for DSP Processors

An Optimized Wallace Tree Multiplier using Parallel Prefix Han-Carlson Adder for DSP Processors An Optimized Wallace Tree Multiplier using Parallel Prefix Han-Carlson Adder for DSP Processors T.N.Priyatharshne Prof. L. Raja, M.E, (Ph.D) A. Vinodhini ME VLSI DESIGN Professor, ECE DEPT ME VLSI DESIGN

More information

Option 1: A programmable Digital (FIR) Filter

Option 1: A programmable Digital (FIR) Filter Design Project Your design project is basically a module filter. A filter is basically a weighted sum of signals. The signals (input) may be related, e.g. a delayed versions of each other in time, e.g.

More information

A High Definition Motion JPEG Encoder Based on Epuma Platform

A High Definition Motion JPEG Encoder Based on Epuma Platform Available online at www.sciencedirect.com Procedia Engineering 29 (2012) 2371 2375 2012 International Workshop on Information and Electronics Engineering (IWIEE) A High Definition Motion JPEG Encoder Based

More information

A New network multiplier using modified high order encoder and optimized hybrid adder in CMOS technology

A New network multiplier using modified high order encoder and optimized hybrid adder in CMOS technology Inf. Sci. Lett. 2, No. 3, 159-164 (2013) 159 Information Sciences Letters An International Journal http://dx.doi.org/10.12785/isl/020305 A New network multiplier using modified high order encoder and optimized

More information

>>> from numpy import random as r >>> I = r.rand(256,256);

>>> from numpy import random as r >>> I = r.rand(256,256); WHAT IS AN IMAGE? >>> from numpy import random as r >>> I = r.rand(256,256); Think-Pair-Share: - What is this? What does it look like? - Which values does it take? - How many values can it take? - Is it

More information

Sno Projects List IEEE. High - Throughput Finite Field Multipliers Using Redundant Basis For FPGA And ASIC Implementations

Sno Projects List IEEE. High - Throughput Finite Field Multipliers Using Redundant Basis For FPGA And ASIC Implementations Sno Projects List IEEE 1 High - Throughput Finite Field Multipliers Using Redundant Basis For FPGA And ASIC Implementations 2 A Generalized Algorithm And Reconfigurable Architecture For Efficient And Scalable

More information

A HIGH PERFORMANCE HARDWARE ARCHITECTURE FOR HALF-PIXEL ACCURATE H.264 MOTION ESTIMATION

A HIGH PERFORMANCE HARDWARE ARCHITECTURE FOR HALF-PIXEL ACCURATE H.264 MOTION ESTIMATION A HIGH PERFORMANCE HARDWARE ARCHITECTURE FOR HALF-PIXEL ACCURATE H.264 MOTION ESTIMATION Sinan Yalcin and Ilker Hamzaoglu Faculty of Engineering and Natural Sciences, Sabanci University, 34956, Tuzla,

More information

A Survey on Power Reduction Techniques in FIR Filter

A Survey on Power Reduction Techniques in FIR Filter A Survey on Power Reduction Techniques in FIR Filter 1 Pooja Madhumatke, 2 Shubhangi Borkar, 3 Dinesh Katole 1, 2 Department of Computer Science & Engineering, RTMNU, Nagpur Institute of Technology Nagpur,

More information

Energy efficient multi-granular arithmetic in a coarse-grain reconfigurable architecture

Energy efficient multi-granular arithmetic in a coarse-grain reconfigurable architecture Eindhoven University of Technology MASTER Energy efficient multi-granular arithmetic in a coarse-grain reconfigurable architecture Louwers, S.T. Award date: 216 Link to publication Disclaimer This document

More information

A New High Speed Low Power Performance of 8- Bit Parallel Multiplier-Accumulator Using Modified Radix-2 Booth Encoded Algorithm

A New High Speed Low Power Performance of 8- Bit Parallel Multiplier-Accumulator Using Modified Radix-2 Booth Encoded Algorithm A New High Speed Low Power Performance of 8- Bit Parallel Multiplier-Accumulator Using Modified Radix-2 Booth Encoded Algorithm V.Sandeep Kumar Assistant Professor, Indur Institute Of Engineering & Technology,Siddipet

More information

Design of a High Speed FIR Filter on FPGA by Using DA-OBC Algorithm

Design of a High Speed FIR Filter on FPGA by Using DA-OBC Algorithm Design of a High Speed FIR Filter on FPGA by Using DA-OBC Algorithm Vijay Kumar Ch 1, Leelakrishna Muthyala 1, Chitra E 2 1 Research Scholar, VLSI, SRM University, Tamilnadu, India 2 Assistant Professor,

More information

Digital Integrated CircuitDesign

Digital Integrated CircuitDesign Digital Integrated CircuitDesign Lecture 13 Building Blocks (Multipliers) Register Adder Shift Register Adib Abrishamifar EE Department IUST Acknowledgement This lecture note has been summarized and categorized

More information

>>> from numpy import random as r >>> I = r.rand(256,256);

>>> from numpy import random as r >>> I = r.rand(256,256); WHAT IS AN IMAGE? >>> from numpy import random as r >>> I = r.rand(256,256); Think-Pair-Share: - What is this? What does it look like? - Which values does it take? - How many values can it take? - Is it

More information

CPSC 340: Machine Learning and Data Mining. Convolutional Neural Networks Fall 2018

CPSC 340: Machine Learning and Data Mining. Convolutional Neural Networks Fall 2018 CPSC 340: Machine Learning and Data Mining Convolutional Neural Networks Fall 2018 Admin Mike and I finish CNNs on Wednesday. After that, we will cover different topics: Mike will do a demo of training

More information

ISSN Vol.07,Issue.08, July-2015, Pages:

ISSN Vol.07,Issue.08, July-2015, Pages: ISSN 2348 2370 Vol.07,Issue.08, July-2015, Pages:1397-1402 www.ijatir.org Implementation of 64-Bit Modified Wallace MAC Based On Multi-Operand Adders MIDDE SHEKAR 1, M. SWETHA 2 1 PG Scholar, Siddartha

More information

Application of Maxwell Equations to Human Body Modelling

Application of Maxwell Equations to Human Body Modelling Application of Maxwell Equations to Human Body Modelling Fumie Costen Room E, E0c at Sackville Street Building, fc@cs.man.ac.uk The University of Manchester, U.K. February 5, 0 Fumie Costen Room E, E0c

More information

Evolution of DSP Processors. Kartik Kariya EE, IIT Bombay

Evolution of DSP Processors. Kartik Kariya EE, IIT Bombay Evolution of DSP Processors Kartik Kariya EE, IIT Bombay Agenda Expected features of DSPs Brief overview of early DSPs Multi-issue DSPs Case Study: VLIW based Processor (SPXK5) for Mobile Applications

More information

Images and Filters. EE/CSE 576 Linda Shapiro

Images and Filters. EE/CSE 576 Linda Shapiro Images and Filters EE/CSE 576 Linda Shapiro What is an image? 2 3 . We sample the image to get a discrete set of pixels with quantized values. 2. For a gray tone image there is one band F(r,c), with values

More information

Multi-core Platforms for

Multi-core Platforms for 20 JUNE 2011 Multi-core Platforms for Immersive-Audio Applications Course: Advanced Computer Architectures Teacher: Prof. Cristina Silvano Student: Silvio La Blasca 771338 Introduction on Immersive-Audio

More information

Overview. 1 Trends in Microprocessor Architecture. Computer architecture. Computer architecture

Overview. 1 Trends in Microprocessor Architecture. Computer architecture. Computer architecture Overview 1 Trends in Microprocessor Architecture R05 Robert Mullins Computer architecture Scaling performance and CMOS Where have performance gains come from? Modern superscalar processors The limits of

More information

Digital Image Processing. Digital Image Fundamentals II 12 th June, 2017

Digital Image Processing. Digital Image Fundamentals II 12 th June, 2017 Digital Image Processing Digital Image Fundamentals II 12 th June, 2017 Image Enhancement Image Enhancement Types of Image Enhancement Operations Neighborhood Operations on Images Spatial Filtering Filtering

More information

Efficient Construction of SIFT Multi-Scale Image Pyramids for Embedded Robot Vision

Efficient Construction of SIFT Multi-Scale Image Pyramids for Embedded Robot Vision Efficient Construction of SIFT Multi-Scale Image Pyramids for Embedded Robot Vision Peter Andreas Entschev and Hugo Vieira Neto Graduate School of Electrical Engineering and Applied Computer Science Federal

More information

Image Manipulation: Filters and Convolutions

Image Manipulation: Filters and Convolutions Dr. Sarah Abraham University of Texas at Austin Computer Science Department Image Manipulation: Filters and Convolutions Elements of Graphics CS324e Fall 2017 Student Presentation Per-Pixel Manipulation

More information

Vector Arithmetic Logic Unit Amit Kumar Dutta JIS College of Engineering, Kalyani, WB, India

Vector Arithmetic Logic Unit Amit Kumar Dutta JIS College of Engineering, Kalyani, WB, India Vol. 2 Issue 2, December -23, pp: (75-8), Available online at: www.erpublications.com Vector Arithmetic Logic Unit Amit Kumar Dutta JIS College of Engineering, Kalyani, WB, India Abstract: Real time operation

More information

CS4617 Computer Architecture

CS4617 Computer Architecture 1/26 CS4617 Computer Architecture Lecture 2 Dr J Vaughan September 10, 2014 2/26 Amdahl s Law Speedup = Execution time for entire task without using enhancement Execution time for entire task using enhancement

More information

PARALLEL ALGORITHMS FOR HISTOGRAM-BASED IMAGE REGISTRATION. Benjamin Guthier, Stephan Kopf, Matthias Wichtlhuber, Wolfgang Effelsberg

PARALLEL ALGORITHMS FOR HISTOGRAM-BASED IMAGE REGISTRATION. Benjamin Guthier, Stephan Kopf, Matthias Wichtlhuber, Wolfgang Effelsberg This is a preliminary version of an article published by Benjamin Guthier, Stephan Kopf, Matthias Wichtlhuber, and Wolfgang Effelsberg. Parallel algorithms for histogram-based image registration. Proc.

More information

CHAPTER 1 INTRODUCTION

CHAPTER 1 INTRODUCTION CHAPTER 1 INTRODUCTION 1.1 Project Background High speed multiplication is another critical function in a range of very large scale integration (VLSI) applications. Multiplications are expensive and slow

More information

IMAGE PROCESSING: AREA OPERATIONS (FILTERING)

IMAGE PROCESSING: AREA OPERATIONS (FILTERING) IMAGE PROCESSING: AREA OPERATIONS (FILTERING) N. C. State University CSC557 Multimedia Computing and Networking Fall 2001 Lecture # 13 IMAGE PROCESSING: AREA OPERATIONS (FILTERING) N. C. State University

More information

[Devi*, 5(4): April, 2016] ISSN: (I2OR), Publication Impact Factor: 3.785

[Devi*, 5(4): April, 2016] ISSN: (I2OR), Publication Impact Factor: 3.785 IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY DESIGN OF HIGH SPEED FIR FILTER ON FPGA BY USING MULTIPLEXER ARRAY OPTIMIZATION IN DA-OBC ALGORITHM Palepu Mohan Radha Devi, Vijay

More information

Mahendra Engineering College, Namakkal, Tamilnadu, India.

Mahendra Engineering College, Namakkal, Tamilnadu, India. Implementation of Modified Booth Algorithm for Parallel MAC Stephen 1, Ravikumar. M 2 1 PG Scholar, ME (VLSI DESIGN), 2 Assistant Professor, Department ECE Mahendra Engineering College, Namakkal, Tamilnadu,

More information

Prof. Feng Liu. Winter /10/2019

Prof. Feng Liu. Winter /10/2019 Prof. Feng Liu Winter 29 http://www.cs.pdx.edu/~fliu/courses/cs4/ //29 Last Time Course overview Admin. Info Computer Vision Computer Vision at PSU Image representation Color 2 Today Filter 3 Today Filters

More information

Video Enhancement Algorithms on System on Chip

Video Enhancement Algorithms on System on Chip International Journal of Scientific and Research Publications, Volume 2, Issue 4, April 2012 1 Video Enhancement Algorithms on System on Chip Dr.Ch. Ravikumar, Dr. S.K. Srivatsa Abstract- This paper presents

More information

CIS581: Computer Vision and Computational Photography Homework: Cameras and Convolution Due: Sept. 14, 2017 at 3:00 pm

CIS581: Computer Vision and Computational Photography Homework: Cameras and Convolution Due: Sept. 14, 2017 at 3:00 pm CIS58: Computer Vision and Computational Photography Homework: Cameras and Convolution Due: Sept. 4, 207 at 3:00 pm Instructions This is an individual assignment. Individual means each student must hand

More information

Early Adopter : Multiprocessor Programming in the Undergraduate Program. NSF/TCPP Curriculum: Early Adoption at the University of Central Florida

Early Adopter : Multiprocessor Programming in the Undergraduate Program. NSF/TCPP Curriculum: Early Adoption at the University of Central Florida Early Adopter : Multiprocessor Programming in the Undergraduate Program NSF/TCPP Curriculum: Early Adoption at the University of Central Florida Narsingh Deo Damian Dechev Mahadevan Vasudevan Department

More information

SIGNED PIPELINED MULTIPLIER USING HIGH SPEED COMPRESSORS

SIGNED PIPELINED MULTIPLIER USING HIGH SPEED COMPRESSORS INTERNATIONAL JOURNAL OF RESEARCH IN COMPUTER APPLICATIONS AND ROBOTICS ISSN 2320-7345 SIGNED PIPELINED MULTIPLIER USING HIGH SPEED COMPRESSORS 1 T.Thomas Leonid, 2 M.Mary Grace Neela, and 3 Jose Anand

More information

CESEL: Flexible Crypto Acceleration. Kevin Kiningham Dan Boneh, Mark Horowitz, Philip Levis

CESEL: Flexible Crypto Acceleration. Kevin Kiningham Dan Boneh, Mark Horowitz, Philip Levis CESEL: Flexible Crypto Acceleration Kevin Kiningham Dan Boneh, Mark Horowitz, Philip Levis Cryptography Mathematical operations to secure data Fundamental for building secure systems Computationally intensive:

More information

Lecture Perspectives. Administrivia

Lecture Perspectives. Administrivia Lecture 29-30 Perspectives Administrivia Final on Friday May 18 12:30-3:30 pm» Location: 251 Hearst Gym Topics all what was covered in class. Review Session Time and Location TBA Lab and hw scores to be

More information

How a processor can permute n bits in O(1) cycles

How a processor can permute n bits in O(1) cycles How a processor can permute n bits in O(1) cycles Ruby Lee, Zhijie Shi, Xiao Yang Princeton Architecture Lab for Multimedia and Security (PALMS) Department of Electrical Engineering Princeton University

More information

Computational Efficiency of the GF and the RMF Transforms for Quaternary Logic Functions on CPUs and GPUs

Computational Efficiency of the GF and the RMF Transforms for Quaternary Logic Functions on CPUs and GPUs 5 th International Conference on Logic and Application LAP 2016 Dubrovnik, Croatia, September 19-23, 2016 Computational Efficiency of the GF and the RMF Transforms for Quaternary Logic Functions on CPUs

More information

Lecture 30. Perspectives. Digital Integrated Circuits Perspectives

Lecture 30. Perspectives. Digital Integrated Circuits Perspectives Lecture 30 Perspectives Administrivia Final on Friday December 15 8 am Location: 251 Hearst Gym Topics all what was covered in class. Precise reading information will be posted on the web-site Review Session

More information

Using One hot Residue Number System (OHRNS) for Digital Image Processing

Using One hot Residue Number System (OHRNS) for Digital Image Processing Using One hot Residue Number System (OHRNS) for Digital Image Processing Davar Kheirandish Taleshmekaeil*, Parviz Ghorbanzadeh**, Aitak Shaddeli***, and Nahid Kianpour**** *Department of Electronic and

More information

CHAPTER 4 ANALYSIS OF LOW POWER, AREA EFFICIENT AND HIGH SPEED MULTIPLIER TOPOLOGIES

CHAPTER 4 ANALYSIS OF LOW POWER, AREA EFFICIENT AND HIGH SPEED MULTIPLIER TOPOLOGIES 69 CHAPTER 4 ANALYSIS OF LOW POWER, AREA EFFICIENT AND HIGH SPEED MULTIPLIER TOPOLOGIES 4.1 INTRODUCTION Multiplication is one of the basic functions used in digital signal processing. It requires more

More information

CS534 Introduction to Computer Vision. Linear Filters. Ahmed Elgammal Dept. of Computer Science Rutgers University

CS534 Introduction to Computer Vision. Linear Filters. Ahmed Elgammal Dept. of Computer Science Rutgers University CS534 Introduction to Computer Vision Linear Filters Ahmed Elgammal Dept. of Computer Science Rutgers University Outlines What are Filters Linear Filters Convolution operation Properties of Linear Filters

More information

ASIP Solution for Implementation of H.264 Multi Resolution Motion Estimation

ASIP Solution for Implementation of H.264 Multi Resolution Motion Estimation Int. J. Communications, Network and System Sciences, 2010, 3, 453-461 doi:10.4236/ijcns.2010.35060 Published Online May 2010 (http://www.scirp.org/journal/ijcns/) ASIP Solution for Implementation of H.264

More information

Motion illusion, rotating snakes

Motion illusion, rotating snakes Motion illusion, rotating snakes Image Filtering 9/4/2 Computer Vision James Hays, Brown Graphic: unsharp mask Many slides by Derek Hoiem Next three classes: three views of filtering Image filters in spatial

More information

Filters. Materials from Prof. Klaus Mueller

Filters. Materials from Prof. Klaus Mueller Filters Materials from Prof. Klaus Mueller Think More about Pixels What exactly a pixel is in an image or on the screen? Solid square? This cannot be implemented A dot? Yes, but size matters Pixel Dots

More information

ASIC Implementation of High Speed Area Efficient Arithmetic Unit using GDI based Vedic Multiplier

ASIC Implementation of High Speed Area Efficient Arithmetic Unit using GDI based Vedic Multiplier INTERNATIONAL JOURNAL OF APPLIED RESEARCH AND TECHNOLOGY ISSN 2519-5115 RESEARCH ARTICLE ASIC Implementation of High Speed Area Efficient Arithmetic Unit using GDI based Vedic Multiplier 1 M. Sangeetha

More information

Design A Redundant Binary Multiplier Using Dual Logic Level Technique

Design A Redundant Binary Multiplier Using Dual Logic Level Technique Design A Redundant Binary Multiplier Using Dual Logic Level Technique Sreenivasa Rao Assistant Professor, Department of ECE, Santhiram Engineering College, Nandyala, A.P. Jayanthi M.Tech Scholar in VLSI,

More information

ACM Fast Image Convolutions. by: Wojciech Jarosz

ACM Fast Image Convolutions. by: Wojciech Jarosz ACM SIGGRAPH@UIUC Fast Image Convolutions by: Wojciech Jarosz Image Convolution Traditionally, image convolution is performed by what is called the sliding window approach. For each pixel in the image,

More information

A Review on Different Multiplier Techniques

A Review on Different Multiplier Techniques A Review on Different Multiplier Techniques B.Sudharani Research Scholar, Department of ECE S.V.U.College of Engineering Sri Venkateswara University Tirupati, Andhra Pradesh, India Dr.G.Sreenivasulu Professor

More information

ISSN Vol.03,Issue.02, February-2014, Pages:

ISSN Vol.03,Issue.02, February-2014, Pages: www.semargroup.org, www.ijsetr.com ISSN 2319-8885 Vol.03,Issue.02, February-2014, Pages:0239-0244 Design and Implementation of High Speed Radix 8 Multiplier using 8:2 Compressors A.M.SRINIVASA CHARYULU

More information

Real Time Image Denoising using Synchronized Bilateral Filter

Real Time Image Denoising using Synchronized Bilateral Filter Real Time Image Denoising using Synchronized Bilateral Filter Chandni C S 1, Pushpakumari R 2 PG Scholar, Dept of ECE, Prime College of Engineering, Palakkad, Kerala, India 1 Assistant Professor, Dept

More information

CS 4501: Introduction to Computer Vision. Filtering and Edge Detection

CS 4501: Introduction to Computer Vision. Filtering and Edge Detection CS 451: Introduction to Computer Vision Filtering and Edge Detection Connelly Barnes Slides from Jason Lawrence, Fei Fei Li, Juan Carlos Niebles, Misha Kazhdan, Allison Klein, Tom Funkhouser, Adam Finkelstein,

More information

High Performance Imaging Using Large Camera Arrays

High Performance Imaging Using Large Camera Arrays High Performance Imaging Using Large Camera Arrays Presentation of the original paper by Bennett Wilburn, Neel Joshi, Vaibhav Vaish, Eino-Ville Talvala, Emilio Antunez, Adam Barth, Andrew Adams, Mark Horowitz,

More information

A Rotation-based Data Buffering Architecture for Convolution Filtering in a Field Programmable Gate Array

A Rotation-based Data Buffering Architecture for Convolution Filtering in a Field Programmable Gate Array JURNAL CMPUTER, VL 8, N 6, JUNE 2013 1411 A Rotation-based Data Buffering Architecture for Convolution iltering in a ield Programmable Gate Array Zhijian Lu College of Computer cience and Technology Harbin

More information

Design of a Power Optimal Reversible FIR Filter ASIC Speech Signal Processing

Design of a Power Optimal Reversible FIR Filter ASIC Speech Signal Processing Design of a Power Optimal Reversible FIR Filter ASIC Speech Signal Processing Yelle Harika M.Tech, Joginpally B.R.Engineering College. P.N.V.M.Sastry M.S(ECE)(A.U), M.Tech(ECE), (Ph.D)ECE(JNTUH), PG DIP

More information

Midterm Examination CS 534: Computational Photography

Midterm Examination CS 534: Computational Photography Midterm Examination CS 534: Computational Photography November 3, 2015 NAME: SOLUTIONS Problem Score Max Score 1 8 2 8 3 9 4 4 5 3 6 4 7 6 8 13 9 7 10 4 11 7 12 10 13 9 14 8 Total 100 1 1. [8] What are

More information

Matlab (see Homework 1: Intro to Matlab) Linear Filters (Reading: 7.1, ) Correlation. Convolution. Linear Filtering (warm-up slide) R ij

Matlab (see Homework 1: Intro to Matlab) Linear Filters (Reading: 7.1, ) Correlation. Convolution. Linear Filtering (warm-up slide) R ij Matlab (see Homework : Intro to Matlab) Starting Matlab from Unix: matlab & OR matlab nodisplay Image representations in Matlab: Unsigned 8bit values (when first read) Values in range [, 255], = black,

More information

Performance Analysis of Multipliers in VLSI Design

Performance Analysis of Multipliers in VLSI Design Performance Analysis of Multipliers in VLSI Design Lunius Hepsiba P 1, Thangam T 2 P.G. Student (ME - VLSI Design), PSNA College of, Dindigul, Tamilnadu, India 1 Associate Professor, Dept. of ECE, PSNA

More information

The Metrics and Designs of an Arithmetic Logic Function over

The Metrics and Designs of an Arithmetic Logic Function over The Metrics and Designs of an Arithmetic Logic Function over 2002-2015 Jimmy Vallejo Department of Electrical and Computer Engineering University of Central Flida Orlando, FL 32816-2362 Abstract There

More information

FPGA Implementation of Wallace Tree Multiplier using CSLA / CLA

FPGA Implementation of Wallace Tree Multiplier using CSLA / CLA FPGA Implementation of Wallace Tree Multiplier using CSLA / CLA Shruti Dixit 1, Praveen Kumar Pandey 2 1 Suresh Gyan Vihar University, Mahaljagtapura, Jaipur, Rajasthan, India 2 Suresh Gyan Vihar University,

More information

Understanding Neural Networks : Part II

Understanding Neural Networks : Part II TensorFlow Workshop 2018 Understanding Neural Networks Part II : Convolutional Layers and Collaborative Filters Nick Winovich Department of Mathematics Purdue University July 2018 Outline 1 Convolutional

More information

Ben Baker. Sponsored by:

Ben Baker. Sponsored by: Ben Baker Sponsored by: Background Agenda GPU Computing Digital Image Processing at FamilySearch Potential GPU based solutions Performance Testing Results Conclusions and Future Work 2 CPU vs. GPU Architecture

More information

Image Deblurring and Noise Reduction in Python TJHSST Senior Research Project Computer Systems Lab

Image Deblurring and Noise Reduction in Python TJHSST Senior Research Project Computer Systems Lab Image Deblurring and Noise Reduction in Python TJHSST Senior Research Project Computer Systems Lab 2009-2010 Vincent DeVito June 16, 2010 Abstract In the world of photography and machine vision, blurry

More information

A Simple Design and Implementation of Reconfigurable Neural Networks

A Simple Design and Implementation of Reconfigurable Neural Networks A Simple Design and Implementation of Reconfigurable Neural Networks Hazem M. El-Bakry, and Nikos Mastorakis Abstract There are some problems in hardware implementation of digital combinational circuits.

More information

Optimized Image Scaling Processor using VLSI

Optimized Image Scaling Processor using VLSI Optimized Image Scaling Processor using VLSI V.Premchandran 1, Sishir Sasi.P 2, Dr.P.Poongodi 3 1, 2, 3 Department of Electronics and communication Engg, PPG Institute of Technology, Coimbatore-35, India

More information

High Speed Vedic Multiplier Designs Using Novel Carry Select Adder

High Speed Vedic Multiplier Designs Using Novel Carry Select Adder High Speed Vedic Multiplier Designs Using Novel Carry Select Adder 1 chintakrindi Saikumar & 2 sk.sahir 1 (M.Tech) VLSI, Dept. of ECE Priyadarshini Institute of Technology & Management 2 Associate Professor,

More information

INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET) STUDY ON COMPARISON OF VARIOUS MULTIPLIERS

INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET) STUDY ON COMPARISON OF VARIOUS MULTIPLIERS INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET) International Journal of Electronics and Communication Engineering & Technology (IJECET), ISSN 0976 ISSN 0976 6464(Print)

More information

Image Processing Architectures (and their future requirements)

Image Processing Architectures (and their future requirements) Lecture 17: Image Processing Architectures (and their future requirements) Visual Computing Systems Smart phone processing resources Qualcomm snapdragon Image credit: Qualcomm Apple A7 (iphone 5s) Chipworks

More information

A Novel Approach For Designing A Low Power Parallel Prefix Adders

A Novel Approach For Designing A Low Power Parallel Prefix Adders A Novel Approach For Designing A Low Power Parallel Prefix Adders R.Chaitanyakumar M Tech student, Pragati Engineering College, Surampalem (A.P, IND). P.Sunitha Assistant Professor, Dept.of ECE Pragati

More information

Image Processing for feature extraction

Image Processing for feature extraction Image Processing for feature extraction 1 Outline Rationale for image pre-processing Gray-scale transformations Geometric transformations Local preprocessing Reading: Sonka et al 5.1, 5.2, 5.3 2 Image

More information

Design of an Efficient Edge Enhanced Image Scalar for Image Processing Applications

Design of an Efficient Edge Enhanced Image Scalar for Image Processing Applications Design of an Efficient Edge Enhanced Image Scalar for Image Processing Applications 1 Rashmi. H, 2 Suganya. S 1 PG Student [VLSI], Dept. of ECE, CMRIT, Bangalore, Karnataka, India 2 Associate Professor,

More information

02/02/10. Image Filtering. Computer Vision CS 543 / ECE 549 University of Illinois. Derek Hoiem

02/02/10. Image Filtering. Computer Vision CS 543 / ECE 549 University of Illinois. Derek Hoiem 2/2/ Image Filtering Computer Vision CS 543 / ECE 549 University of Illinois Derek Hoiem Questions about HW? Questions about class? Room change starting thursday: Everitt 63, same time Key ideas from last

More information

Detector Implementations Based on Software Defined Radio for Next Generation Wireless Systems Janne Janhunen

Detector Implementations Based on Software Defined Radio for Next Generation Wireless Systems Janne Janhunen GIGA seminar 11.1.2010 Detector Implementations Based on Software Defined Radio for Next Generation Wireless Systems Janne Janhunen janne.janhunen@ee.oulu.fi 2 Outline Introduction Benefits and Challenges

More information

[Krishna, 2(9): September, 2013] ISSN: Impact Factor: INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY

[Krishna, 2(9): September, 2013] ISSN: Impact Factor: INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY Design of Wallace Tree Multiplier using Compressors K.Gopi Krishna *1, B.Santhosh 2, V.Sridhar 3 gopikoleti@gmail.com Abstract

More information

Neural Networks The New Moore s Law

Neural Networks The New Moore s Law Neural Networks The New Moore s Law Chris Rowen, PhD, FIEEE CEO Cognite Ventures December 216 Outline Moore s Law Revisited: Efficiency Drives Productivity Embedded Neural Network Product Segments Efficiency

More information

Image Filtering and Gaussian Pyramids

Image Filtering and Gaussian Pyramids Image Filtering and Gaussian Pyramids CS94: Image Manipulation & Computational Photography Alexei Efros, UC Berkeley, Fall 27 Limitations of Point Processing Q: What happens if I reshuffle all pixels within

More information

International Journal of Scientific & Engineering Research, Volume 7, Issue 3, March-2016 ISSN

International Journal of Scientific & Engineering Research, Volume 7, Issue 3, March-2016 ISSN ISSN 2229-5518 159 EFFICIENT AND ENHANCED CARRY SELECT ADDER FOR MULTIPURPOSE APPLICATIONS A.RAMESH Asst. Professor, E.C.E Department, PSCMRCET, Kothapet, Vijayawada, A.P, India. rameshavula99@gmail.com

More information

Modified Partial Product Generator for Redundant Binary Multiplier with High Modularity and Carry-Free Addition

Modified Partial Product Generator for Redundant Binary Multiplier with High Modularity and Carry-Free Addition Modified Partial Product Generator for Redundant Binary Multiplier with High Modularity and Carry-Free Addition Thoka. Babu Rao 1, G. Kishore Kumar 2 1, M. Tech in VLSI & ES, Student at Velagapudi Ramakrishna

More information

A New RNS 4-moduli Set for the Implementation of FIR Filters. Gayathri Chalivendra

A New RNS 4-moduli Set for the Implementation of FIR Filters. Gayathri Chalivendra A New RNS 4-moduli Set for the Implementation of FIR Filters by Gayathri Chalivendra A Thesis Presented in Partial Fulfillment of the Requirements for the Degree Master of Science Approved April 2011 by

More information

Digital Image Processing

Digital Image Processing Digital Image Processing Lecture # 5 Image Enhancement in Spatial Domain- I ALI JAVED Lecturer SOFTWARE ENGINEERING DEPARTMENT U.E.T TAXILA Email:: ali.javed@uettaxila.edu.pk Office Room #:: 7 Presentation

More information

Design and Characterization of 16 Bit Multiplier Accumulator Based on Radix-2 Modified Booth Algorithm

Design and Characterization of 16 Bit Multiplier Accumulator Based on Radix-2 Modified Booth Algorithm Design and Characterization of 16 Bit Multiplier Accumulator Based on Radix-2 Modified Booth Algorithm Vijay Dhar Maurya 1, Imran Ullah Khan 2 1 M.Tech Scholar, 2 Associate Professor (J), Department of

More information

AN ERROR LIMITED AREA EFFICIENT TRUNCATED MULTIPLIER FOR IMAGE COMPRESSION

AN ERROR LIMITED AREA EFFICIENT TRUNCATED MULTIPLIER FOR IMAGE COMPRESSION AN ERROR LIMITED AREA EFFICIENT TRUNCATED MULTIPLIER FOR IMAGE COMPRESSION K.Mahesh #1, M.Pushpalatha *2 #1 M.Phil.,(Scholar), Padmavani Arts and Science College. *2 Assistant Professor, Padmavani Arts

More information

IJCSIET--International Journal of Computer Science information and Engg., Technologies ISSN

IJCSIET--International Journal of Computer Science information and Engg., Technologies ISSN An efficient add multiplier operator design using modified Booth recoder 1 I.K.RAMANI, 2 V L N PHANI PONNAPALLI 2 Assistant Professor 1,2 PYDAH COLLEGE OF ENGINEERING & TECHNOLOGY, Visakhapatnam,AP, India.

More information

Automatic Kernel Code Generation for Focal-plane Sensor-Processor Devices

Automatic Kernel Code Generation for Focal-plane Sensor-Processor Devices Automatic Kernel Code Generation for Focal-plane Sensor-Processor Devices Thomas Debrunner - MSc Student Imperial College London Paul Kelly - Software Performance Optimisation Group Lead, Imperial College

More information

Low-Power Multipliers with Data Wordlength Reduction

Low-Power Multipliers with Data Wordlength Reduction Low-Power Multipliers with Data Wordlength Reduction Kyungtae Han, Brian L. Evans, and Earl E. Swartzlander, Jr. Dept. of Electrical and Computer Engineering The University of Texas at Austin Austin, TX

More information

AN EFFICIENT ALGORITHM FOR THE REMOVAL OF IMPULSE NOISE IN IMAGES USING BLACKFIN PROCESSOR

AN EFFICIENT ALGORITHM FOR THE REMOVAL OF IMPULSE NOISE IN IMAGES USING BLACKFIN PROCESSOR AN EFFICIENT ALGORITHM FOR THE REMOVAL OF IMPULSE NOISE IN IMAGES USING BLACKFIN PROCESSOR S. Preethi 1, Ms. K. Subhashini 2 1 M.E/Embedded System Technologies, 2 Assistant professor Sri Sai Ram Engineering

More information

Lecture Topics. Announcements. Today: Pipelined Processors (P&H ) Next: continued. Milestone #4 (due 2/23) Milestone #5 (due 3/2)

Lecture Topics. Announcements. Today: Pipelined Processors (P&H ) Next: continued. Milestone #4 (due 2/23) Milestone #5 (due 3/2) Lecture Topics Today: Pipelined Processors (P&H 4.5-4.10) Next: continued 1 Announcements Milestone #4 (due 2/23) Milestone #5 (due 3/2) 2 1 ISA Implementations Three different strategies: single-cycle

More information

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK DESIGN AND IMPLEMENTATION OF TRUNCATED MULTIPLIER FOR DSP APPLICATIONS AKASH D.

More information

High performance Radix-16 Booth Partial Product Generator for 64-bit Binary Multipliers

High performance Radix-16 Booth Partial Product Generator for 64-bit Binary Multipliers High performance Radix-16 Booth Partial Product Generator for 64-bit Binary Multipliers Dharmapuri Ranga Rajini 1 M.Ramana Reddy 2 rangarajini.d@gmail.com 1 ramanareddy055@gmail.com 2 1 PG Scholar, Dept

More information

Multiplier Design and Performance Estimation with Distributed Arithmetic Algorithm

Multiplier Design and Performance Estimation with Distributed Arithmetic Algorithm Multiplier Design and Performance Estimation with Distributed Arithmetic Algorithm M. Suhasini, K. Prabhu Kumar & P. Srinivas Department of Electronics & Comm. Engineering, Nimra College of Engineering

More information

Ajmer, Sikar Road Ajmer,Rajasthan,India. Ajmer, Sikar Road Ajmer,Rajasthan,India.

Ajmer, Sikar Road Ajmer,Rajasthan,India. Ajmer, Sikar Road Ajmer,Rajasthan,India. DESIGN AND IMPLEMENTATION OF MAC UNIT FOR DSP APPLICATIONS USING VERILOG HDL Amit kumar 1 Nidhi Verma 2 amitjaiswalec162icfai@gmail.com 1 verma.nidhi17@gmail.com 2 1 PG Scholar, VLSI, Bhagwant University

More information

A Compact Design of 8X8 Bit Vedic Multiplier Using Reversible Logic Based Compressor

A Compact Design of 8X8 Bit Vedic Multiplier Using Reversible Logic Based Compressor A Compact Design of 8X8 Bit Vedic Multiplier Using Reversible Logic Based Compressor 1 Viswanath Gowthami, 2 B.Govardhana, 3 Madanna, 1 PG Scholar, Dept of VLSI System Design, Geethanajali college of engineering

More information

Multimedia Systems Giorgio Leonardi A.A Lectures 14-16: Raster images processing and filters

Multimedia Systems Giorgio Leonardi A.A Lectures 14-16: Raster images processing and filters Multimedia Systems Giorgio Leonardi A.A.2014-2015 Lectures 14-16: Raster images processing and filters Outline (of the following lectures) Light and color processing/correction Convolution filters: blurring,

More information

Circular averaging filter (pillbox) Approximates the two-dimensional Laplacian operator. Laplacian of Gaussian filter

Circular averaging filter (pillbox) Approximates the two-dimensional Laplacian operator. Laplacian of Gaussian filter Image Processing Toolbox fspecial Create predefined 2-D filter Syntax h = fspecial( type) h = fspecial( type,parameters) Description h = fspecial( type) creates a two-dimensional filter h of the specified

More information

Design and Implementation of Wallace Tree Multiplier Using Kogge Stone Adder and Brent Kung Adder

Design and Implementation of Wallace Tree Multiplier Using Kogge Stone Adder and Brent Kung Adder International Journal of Emerging Engineering Research and Technology Volume 3, Issue 8, August 2015, PP 110-116 ISSN 2349-4395 (Print) & ISSN 2349-4409 (Online) Design and Implementation of Wallace Tree

More information