Convolutional Networks Overview


Sargur Srihari

Topics
- Limitations of conventional neural networks
- The convolution operation
- Convolutional networks
- Pooling
- Convolutional network architecture
- Advantages of CNN architectures

Limitations of Neural Networks
- They need a substantial number of training samples
- Learning is slow (long convergence times)
- Parameter selection techniques are inadequate and can lead to poor minima
- The network should exhibit invariance to translation, scaling and elastic deformations
  - A large training set can take care of this
  - But this approach ignores a key property of images: nearby pixels are more strongly correlated than distant ones
  - Modern computer vision approaches exploit this property
  - Information can be merged at later stages to get higher-order features and information about the whole image

Three Mechanisms of Convolutional Neural Networks
1. Local receptive fields
2. Subsampling
3. Weight sharing

What is Convolution?
One-dimensional continuous case: the input f(t) is convolved with a kernel g(t):
$(f * g)(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau$
Note that (f * g)(t) = (g * f)(t).
1. Express each function in terms of a dummy variable τ
2. Reflect one of the functions: g(τ) → g(-τ)
3. Add a time offset t, which allows g(t - τ) to slide along the τ axis
4. Start t at -∞ and slide it all the way to +∞. Wherever the two functions intersect, find the integral of their product
Source: https://en.wikipedia.org

Convolution in discrete case
Here we have discrete functions f and g:
$(f * g)[t] = \sum_{\tau=-\infty}^{\infty} f[\tau]\, g[t - \tau]$
[Figure: plots of f[t] and g[t - τ]]
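The discrete sum can be written out directly as two nested loops. The following is a minimal sketch, assuming finite sequences that are zero outside their stored range; the function name `conv1d_full` and the sample values are illustrative and not from the slides.

```python
import numpy as np

def conv1d_full(f, g):
    """Full discrete convolution: out[t] = sum over tau of f[tau] * g[t - tau]."""
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    out = np.zeros(len(f) + len(g) - 1)
    for t in range(len(out)):
        for tau in range(len(f)):
            if 0 <= t - tau < len(g):  # g is treated as zero outside its support
                out[t] += f[tau] * g[t - tau]
    return out

f = [1.0, 2.0, 3.0, 4.0]  # illustrative signal
g = [0.5, 0.5]            # illustrative two-tap averaging kernel
y = conv1d_full(f, g)     # agrees with np.convolve(f, g)
```

Swapping the arguments, `conv1d_full(g, f)`, gives the same result, which is the commutativity property (f * g) = (g * f).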

Computation of 1-D discrete convolution
Parameters of convolution:
- Kernel size (F)
- Padding (P)
- Stride (S)
[Figure: f[t], g[t - τ] and the result (f * g)[t]]
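To see how F, P and S interact, here is a small sketch. It assumes zero padding on both sides and uses the standard output-length relation (n - F + 2P) / S + 1 (rounded down) for an input of length n; the function names are hypothetical.

```python
import numpy as np

def conv1d_output_length(n, F, P, S):
    """Number of outputs for input length n, kernel size F, padding P, stride S."""
    return (n - F + 2 * P) // S + 1

def conv1d(f, g, P=0, S=1):
    """1-D convolution with P zeros of padding on each side and stride S."""
    f = np.pad(np.asarray(f, dtype=float), P)  # zero padding
    g = np.asarray(g, dtype=float)[::-1]       # flip the kernel: true convolution
    F = len(g)
    n_out = (len(f) - F) // S + 1
    return np.array([np.dot(f[i * S:i * S + F], g) for i in range(n_out)])

f = np.arange(8.0)            # illustrative input of length 8
g = [1.0, 0.0, -1.0]          # illustrative kernel, F = 3
y_plain = conv1d(f, g)        # P = 0, S = 1
y_strided = conv1d(f, g, P=1, S=2)
```

With P = 0 and S = 1 this reduces to NumPy's `np.convolve(f, g, mode="valid")`.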

Neural network for 1-D convolution
[Figure: network with inputs f[t], kernel g(t) and outputs up to y_8]
The equations for the outputs of this network are written one per output, up to y_8. We can also write the equations in terms of the elements of a general 8 × 8 weight matrix W.
Source: http://colah.github.io/posts/2014-07-understanding-convolutions/
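One way to make the weight-matrix view concrete is to build an 8 × 8 matrix from a kernel and check that multiplying by it reproduces the convolution. This is a sketch under assumptions: an odd-length kernel, zero padding at the borders (NumPy's "same" mode), and a hypothetical helper name `conv_matrix`. The slide's own equations may use a different indexing.

```python
import numpy as np

def conv_matrix(g, n):
    """n x n matrix W such that W @ f equals np.convolve(f, g, mode="same")."""
    g = np.asarray(g, dtype=float)
    c = (len(g) - 1) // 2          # centre offset of an odd-length kernel
    W = np.zeros((n, n))
    for i in range(n):             # output index
        for k in range(n):         # input index
            j = i + c - k          # which kernel element connects input k to output i
            if 0 <= j < len(g):
                W[i, k] = g[j]     # the same few values repeat along the diagonals
    return W

rng = np.random.default_rng(0)
f = rng.random(8)                  # illustrative 8-element input
g = [1.0, 2.0, 3.0]                # illustrative kernel
W = conv_matrix(g, 8)
y = W @ f
```

Most entries of W are zero, and the nonzero entries reuse the same three kernel values, which is how a convolution differs from a general 8 × 8 weight matrix.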

2-D Convolution
- Kernel for blurring: neighborhood average
- Kernel for edge detection: neighborhood difference
- Kernels for line detection
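The kernel values shown on the slide did not survive transcription, so the sketch below uses common textbook choices as stand-ins: a 3 × 3 mean kernel for blurring, a Laplacian-style difference kernel for edges, and a horizontal-line kernel. These specific values and the helper `conv2d_valid` are assumptions for illustration.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """2-D convolution without padding, written as explicit loops."""
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)[::-1, ::-1]  # flip for true convolution
    kh, kw = kernel.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

blur = np.full((3, 3), 1.0 / 9.0)                        # neighborhood average
edge = np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]], dtype=float)  # neighborhood difference
hline = np.array([[-1, -1, -1], [2, 2, 2], [-1, -1, -1]], dtype=float)  # horizontal line

step = np.zeros((6, 6))
step[:, 3:] = 1.0                  # image with a vertical step edge
edges = conv2d_valid(step, edge)   # nonzero only next to the step
flat = conv2d_valid(np.ones((5, 5)), edge)  # constant image: no response
smooth = conv2d_valid(step, blur)  # step becomes a gradual ramp
```

The difference kernel sums to zero, so it responds only where neighboring pixels differ, while the averaging kernel sums to one and preserves overall brightness.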

Sparse connectivity due to Image Convolution
- The input image may have millions of pixels, but we can detect edges with kernels of hundreds of pixels
- If we limit the number of connections for each output to k, we need k × n parameters and O(k × n) runtime for n outputs
- It is possible to get good performance with k much smaller than the number of inputs
- Convolutional networks have sparse interactions, accomplished by making the kernel smaller than the input
- The next slide shows a graphical depiction

Traditional vs Convolutional Networks
- Traditional neural network layers use matrix multiplication by a matrix of parameters, with a separate parameter describing the interaction between each input unit and each output unit: $s = g(W^{T} x)$
- With m inputs and n outputs, matrix multiplication requires m × n parameters and O(m × n) runtime per example
- This means every output unit interacts with every input unit
- Convolutional network layers have sparse interactions
- If we limit the number of connections for each output to k, we need k × n parameters and O(k × n) runtime
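A quick back-of-the-envelope comparison of the two parameter counts is sketched below. The sizes (a million inputs and outputs, k = 100) are hypothetical numbers chosen to match the "millions of pixels, hundreds of kernel pixels" scale; they are not from the slides.

```python
def dense_parameters(m, n):
    """Fully connected layer: one parameter per (input, output) pair."""
    return m * n

def sparse_parameters(k, n):
    """Sparsely connected layer: each of the n outputs sees only k inputs."""
    return k * n

m = 1_000_000   # hypothetical number of inputs (pixels)
n = 1_000_000   # hypothetical number of outputs
k = 100         # hypothetical number of connections per output

dense = dense_parameters(m, n)     # 10**12
sparse = sparse_parameters(k, n)   # 10**8
reduction = dense // sparse        # 10_000, i.e. m / k
```

The runtime per example scales the same way, so the saving is the ratio m / k in both storage and computation.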

Views of sparsity of CNN vs full connectivity
Sparsity viewed from below: highlight one input x_3 and the output units s affected by it.
- Top: when s is formed by convolution with a kernel of width 3, only three outputs are affected by x_3
- Bottom: when s is formed by matrix multiplication, connectivity is no longer sparse, so all outputs are affected by x_3
Sparsity viewed from above: highlight one output s_3 and the inputs x that affect this unit. These units are known as the receptive field of s_3.
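Both views can be read off a connectivity mask: a column gives the outputs affected by one input, and a row gives the receptive field of one output. The sketch below assumes five units per layer and a kernel of width 3 with zero padding at the ends; the layer size is an assumption for illustration.

```python
import numpy as np

n, width = 5, 3                       # assumed layer size; kernel width from the text
idx = np.arange(n)

# mask[i, j] is True when output s_(i+1) reads input x_(j+1)
conv_mask = np.abs(idx[:, None] - idx[None, :]) <= width // 2
dense_mask = np.ones((n, n), dtype=bool)

x3 = 2                                # zero-based position of x_3 and s_3

# View from below: which outputs does x_3 affect? (reported as 1-based unit numbers)
affected_conv = (np.flatnonzero(conv_mask[:, x3]) + 1).tolist()
affected_dense = (np.flatnonzero(dense_mask[:, x3]) + 1).tolist()

# View from above: which inputs form the receptive field of s_3?
receptive_field_s3 = (np.flatnonzero(conv_mask[x3, :]) + 1).tolist()
```

With the convolutional mask, x_3 reaches only s_2, s_3 and s_4, whereas with the dense mask it reaches every output.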

Pooling
- A key aspect of convolutional neural networks is the pooling layer, typically applied after the convolutional layers
- A pooling function replaces the output of the net at a certain location with a summary statistic of the nearby inputs
- Pooling layers subsample their input
- Example on next slide

Pooling functions
Popular pooling functions are:
1. Max pooling, which reports the maximum output within a rectangular neighborhood (in the figure, 6, 8, 3 and 4 are the maximum values in each of the 2 × 2 regions of the same color)
2. Average of a rectangular neighborhood
3. L2 norm of a rectangular neighborhood
4. Weighted average based on the distance from the central pixel
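The first three pooling functions can be sketched with one reshaping trick over non-overlapping 2 × 2 regions. The 4 × 4 input below is an assumption: the slide's figure is not available, so the values were chosen so that the 2 × 2 maxima come out as 6, 8, 3 and 4.

```python
import numpy as np

def pool2x2(x, reduce_fn):
    """Apply reduce_fn to each non-overlapping 2 x 2 block of x."""
    h, w = x.shape
    blocks = x.reshape(h // 2, 2, w // 2, 2)  # axes: block row, row in block, block col, col in block
    return reduce_fn(blocks, axis=(1, 3))

# Assumed input, chosen so the block maxima are 6, 8, 3, 4
x = np.array([[1, 1, 2, 4],
              [5, 6, 7, 8],
              [3, 2, 1, 0],
              [1, 2, 3, 4]], dtype=float)

max_pooled = pool2x2(x, np.max)               # [[6, 8], [3, 4]]
avg_pooled = pool2x2(x, np.mean)              # [[3.25, 5.25], [2, 2]]
l2_pooled = np.sqrt(pool2x2(x ** 2, np.sum))  # L2 norm of each block
```

Each variant halves the height and width of the input, which is the subsampling effect of a pooling layer.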

Why pooling?
- It provides a fixed-size output matrix, which is typically required for classification. E.g., with 1,000 filters and max pooling applied to each, we get a 1000-dimensional output, regardless of the size of the filters or the size of the input
- This allows you to use variable-size sentences and variable-size filters, but always get the same output dimensions to feed into a classifier
- Pooling also provides basic invariance to translation (shifting) and rotation. When pooling over a region, the output stays approximately the same even if you shift or rotate the image by a few pixels, because the max operation picks out the same value regardless
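The fixed-size property is easy to check on 1-D inputs: taking the maximum over each filter's whole response yields one number per filter. This sketch uses four random filters of different sizes instead of 1,000, and random sequences in place of real sentences; all names and sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Four filters of different sizes (stand-ins for the 1,000 filters in the text)
filters = [rng.standard_normal(size) for size in (2, 3, 5, 3)]

def global_max_features(signal, filters):
    """One feature per filter: the maximum of that filter's response over the whole input."""
    return np.array([np.convolve(signal, w, mode="valid").max() for w in filters])

short_input = rng.standard_normal(12)   # inputs of different lengths
long_input = rng.standard_normal(50)

short_features = global_max_features(short_input, filters)
long_features = global_max_features(long_input, filters)
```

Both feature vectors have one entry per filter, so the same classifier can be applied to either input.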

Max pooling introduces invariance to translation
[Figure, top: view of the middle of the output of a convolutional layer, showing the outputs of the nonlinearity and the outputs of max pooling. Bottom: the same network after the input has been shifted by one pixel]
Every input value has changed, but only half the values of the output have changed, because max pooling units are sensitive only to the maximum value in the neighborhood, not its exact location
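The effect can be reproduced on a toy example. The sketch assumes a pooling width of 3 with stride 1 and made-up detector values; `np.roll` is used for the one-position shift, so the last value wraps around to the front.

```python
import numpy as np

def max_pool1d(x, width=3):
    """Maximum over a sliding window of the given width, stride 1."""
    return np.array([x[i:i + width].max() for i in range(len(x) - width + 1)])

detector = np.array([0.1, 1.0, 0.2, 0.1, 0.0, 0.3])  # illustrative nonlinearity outputs
shifted = np.roll(detector, 1)                       # shift right by one position

pooled = max_pool1d(detector)          # [1.0, 1.0, 0.2, 0.3]
pooled_shifted = max_pool1d(shifted)   # [1.0, 1.0, 1.0, 0.2]

inputs_changed = int(np.sum(detector != shifted))        # every position differs
outputs_changed = int(np.sum(pooled != pooled_shifted))  # only some pooled values differ
```

Here all six detector values change under the shift, but only two of the four pooled values do.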

Convolutional Network Architecture
[Figure: convolution with three kernels, pooling that reduces size, then convolution with six kernels]

Convolution and Sub-sampling
Instead of feeding the input directly to a fully connected network, two kinds of layers are used:
1. A layer of convolutional units, which consider overlapping regions
2. A layer of subsampling units, also called pooling
- There are several feature maps and sub-sampling layers
- The gradual reduction of spatial resolution is compensated by an increasing number of features
- The final layer has a softmax output
- The whole network is trained using backpropagation, including the weights for convolution and subsampling
[Figure labels: input image; each pixel patch is 5 × 5 pixels; convolutional layer of 10 × 10 units; 2 × 2 units; subsampling layer of 5 × 5 units]
- Each plane has 10 × 10 = 100 neural network units (called a feature map)
- Weights are the same for all units within a plane, so only 25 weights are needed
- Due to weight sharing, this is equivalent to convolution
- Different features have different feature maps
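The weight count can be checked with a small sketch in which every unit of a 10 × 10 feature map applies the same 5 × 5 weights to its own patch. The 14 × 14 input size is an assumption (it is what gives 10 × 10 units with a 5 × 5 patch and no padding), and plain 2 × 2 averaging stands in for the subsampling units.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((14, 14))            # assumed input size: 14 - 5 + 1 = 10
weights = rng.standard_normal((5, 5))   # the 25 weights shared by all units in the plane
bias = 0.0

# Each unit looks at its own 5 x 5 patch but uses the same weights
feature_map = np.empty((10, 10))
for i in range(10):
    for j in range(10):
        patch = image[i:i + 5, j:j + 5]
        feature_map[i, j] = np.sum(patch * weights) + bias

# Simplified subsampling: average each 2 x 2 group of units, giving 5 x 5 units
subsampled = feature_map.reshape(5, 2, 5, 2).mean(axis=(1, 3))

weights_without_sharing = feature_map.size * weights.size  # 100 units x 25 weights
weights_with_sharing = weights.size                        # 25
```

Because every unit applies the same weights at a different position, the loop is exactly a sliding-window filter over the image, which is why weight sharing is equivalent to convolution.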

Two layers of convolution and sub-sampling
1. The input image is convolved with three trainable filters and biases to produce three feature maps at the C1 level
2. Each group of four pixels in the feature maps is added, weighted, combined with a bias, and passed through a sigmoid to produce the feature maps at S2
3. These are again filtered to produce the C3 level
4. The hierarchy then produces S4 in a manner analogous to S2
5. Finally, the rasterized pixel values are presented as a vector to a conventional neural network
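The five steps can be traced as a forward pass on random data. This is only a sketch under several assumptions not stated in the slides: a 28 × 28 input, 5 × 5 filters, one C3 filter per S2 map, fixed subsampling weights, and random (untrained) parameters. All function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def filter_valid(x, k):
    """Slide kernel k over x without padding (no kernel flip, as is common in CNN code)."""
    kh, kw = k.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def subsample(fmap, weight, bias):
    """Add each group of four pixels, weight the sum, add a bias, apply a sigmoid."""
    h, w = fmap.shape
    sums = fmap.reshape(h // 2, 2, w // 2, 2).sum(axis=(1, 3))
    return sigmoid(weight * sums + bias)

rng = np.random.default_rng(0)
image = rng.random((28, 28))                        # assumed input size
c1_filters = rng.standard_normal((3, 5, 5)) * 0.1   # three filters (untrained here)
c1_biases = np.zeros(3)
c3_filters = rng.standard_normal((3, 5, 5)) * 0.1   # assumption: one filter per S2 map

c1 = [filter_valid(image, k) + b for k, b in zip(c1_filters, c1_biases)]  # step 1
s2 = [subsample(m, 0.25, 0.0) for m in c1]                                # step 2
c3 = [filter_valid(m, k) for m, k in zip(s2, c3_filters)]                 # step 3
s4 = [subsample(m, 0.25, 0.0) for m in c3]                                # step 4
features = np.concatenate([m.ravel() for m in s4])                        # step 5: rasterize
```

Under these assumed sizes the maps shrink from 24 × 24 (C1) to 12 × 12 (S2), 8 × 8 (C3) and 4 × 4 (S4), leaving a 48-element vector for the conventional network.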

Two layers of convolution and sub-sampling
- Weight sharing achieves invariance to small transformations (translation, rotation)
- Regularization
- Similar to biological networks: local receptive fields
- A smart way of reducing dimensionality before applying a full neural network

Advantages of Convolutional Network Architecture
- They minimize computation compared to a regular neural network
- Convolution simplifies computation to a great extent without losing the essence of the data
- They are great at handling image classification
- They use the same knowledge across all image locations