Coursework 2 MLP Lecture 7 Convolutional Networks 1

Coursework 2 - Overview and Objectives Overview: Use a selection of the techniques covered in the course so far to train accurate multi-layer networks for MNIST classification. Objective: Assess your ability to design, implement and run a set of experiments to answer specific research questions about the models and methods covered in MLP. Choose three topics: one simpler, two more complex. Simpler topics include exploration of: early stopping; L1 vs L2 regularization; number of layers; hidden unit transfer functions; preprocessing of input data. More complex topics: data augmentation; convolutional layers; skip connections / ResNets; batch normalisation;... MLP Lecture 7 Convolutional Networks 2

Coursework 2 - What to submit Submit a report (PDF), your notebook, and your Python code. You will primarily be assessed on the report. For each topic: Clear statement of the research question investigated; Clear description of methods and algorithms; Motivation for each experiment completed; Quantitative results including relevant graphs; Discussion of your results and any conclusions you have drawn. Please: Do submit everything online using submit; Don't submit on paper to the ITO; Don't submit everything in your mlpractical directory; Do start running the experiments for this coursework as early as possible, since some of the experiments may take significant compute time. MLP Lecture 7 Convolutional Networks 3

Can we design a network that takes account of the image structure? (And learns invariances...) MLP Lecture 7 Convolutional Networks 4

Convolutional Networks Steve Renals Machine Learning Practical MLP Lecture 7 2 November 2016 MLP Lecture 7 Convolutional Networks 5

Recap: Multi-layer network for MNIST (image from: Michael Nielsen, Neural Networks and Deep Learning, http://neuralnetworksanddeeplearning.com/chap6.html) MLP Lecture 7 Convolutional Networks 6

How can we make this better? On MNIST we can get about 2% error (or even better) using these kinds of networks, but: They ignore the spatial (2-D) structure of the input images, unrolling each 28x28 image into a 784-D vector; Each hidden unit looks at all the units in the layer below, so pixels that are spatially separate are treated the same way as pixels that are adjacent; There is no obvious way for such networks to learn the same features (e.g. edges) at different places in the input image. MLP Lecture 7 Convolutional Networks 7

Convolutional networks Convolutional networks address these issues through: Local receptive fields, in which hidden units are connected to local patches of the layer below; Weight sharing, which enables the construction of feature maps; Pooling, which condenses information from the previous layer. MLP Lecture 7 Convolutional Networks 8

Fully connected hidden layer 576 hidden units Input 28x28 Hidden 24x24 MLP Lecture 7 Convolutional Networks 9

Local receptive fields 24x24 hidden units Input 28x28 Hidden 24x24 MLP Lecture 7 Convolutional Networks 10

Local receptive fields Each hidden unit is connected to a small (m x m) region of the input space (the local receptive field). If we have a d x d input space, then we have a (d - m + 1) x (d - m + 1) space of hidden units. Each hidden unit extracts a feature from its region of input space. Here the receptive field stride length is 1; it could be larger. MLP Lecture 7 Convolutional Networks 11
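As an illustration of the sizes involved, a tiny Python sketch (the function name is just for illustration, not from the slides):

def hidden_grid_size(d, m, stride=1):
    # number of hidden units along one dimension for a d x d input and m x m receptive field
    return (d - m) // stride + 1

print(hidden_grid_size(28, 5))             # 24, matching the 24x24 hidden layer above
print(hidden_grid_size(28, 5, stride=2))   # 12, with a larger stride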

Shared weights Constrain each hidden unit h_{i,j} to extract the same feature by sharing weights across the receptive fields. For hidden unit h_{i,j}: h_{i,j} = \mathrm{sigmoid}\Big( \sum_{k=0}^{m-1} \sum_{l=0}^{m-1} w_{k,l} \, x_{i+k,j+l} + b \Big) where w_{k,l} are elements of the shared m x m weight matrix w, b is the shared bias, and x_{i+k,j+l} is the input at (i+k, j+l). We use k and l to index into the receptive field, whose top-left corner is at x_{i,j}. MLP Lecture 7 Convolutional Networks 12
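The feature map above can be computed directly with two nested loops over output positions. A minimal NumPy sketch (the helper names and random inputs are illustrative, not part of the slides):

import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def feature_map(x, w, b):
    # cross-correlate a d x d input x with a shared m x m kernel w, add the shared bias b
    d, m = x.shape[0], w.shape[0]
    h = np.zeros((d - m + 1, d - m + 1))
    for i in range(d - m + 1):
        for j in range(d - m + 1):
            h[i, j] = np.sum(w * x[i:i + m, j:j + m]) + b
    return sigmoid(h)

x = np.random.rand(28, 28)    # MNIST-sized input
w = np.random.randn(5, 5)     # shared 5x5 kernel
print(feature_map(x, w, 0.0).shape)   # (24, 24)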

Shared weights & Receptive Fields [Diagram: a 5x5 receptive field with corners x(i,j), x(i,j+4), x(i+4,j), x(i+4,j+4) and offsets k, l, feeding hidden unit h(i,j); Input 28x28, 24x24 Feature Map] MLP Lecture 7 Convolutional Networks 13

Feature Maps Local receptive fields with shared weights result in a feature map: a map showing where the feature corresponding to the shared weight matrix (kernel) occurs in the image. The feature map encodes translation invariance: the same feature is extracted irrespective of where it occurs in the input image. Multiple feature maps: a hidden layer can consist of F different feature maps, in this case F x 24 x 24 units in total. MLP Lecture 7 Convolutional Networks 14

Feature Maps Input 28x28 24x24 Feature Map MLP Lecture 7 Convolutional Networks 15

Feature Maps Input 28x28 2x24x24 Feature Maps MLP Lecture 7 Convolutional Networks 15

Feature Maps Input 28x28 3x24x24 Feature Maps MLP Lecture 7 Convolutional Networks 15

Weights and Connections Consider an MNIST hidden layer of feature maps using a 5x5 kernel (resulting in 24x24 feature maps): Number of connections per feature map: 24 x 24 x 5 x 5 = 14,400 connections and 24 x 24 = 576 biases. But since weights are shared within a feature map, we have only 5 x 5 = 25 weights and 1 bias. Consider the case where we have 40 feature maps: we will have 1,000 (25 x 40) weights (+ 40 biases) but 576,000 (+ 23,040) connections. In comparison, a 100 hidden unit MLP from the first coursework has 784 x 100 + 100 = 78,500 input-hidden weights. MLP Lecture 7 Convolutional Networks 16
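These counts are easy to sanity-check in Python (a sketch; the numbers come from the slide above):

connections_per_map = 24 * 24 * 5 * 5    # 14,400 connections per feature map
bias_connections = 24 * 24               # 576 bias connections
shared_weights = 5 * 5                   # 25 shared weights (+ 1 shared bias)

weights_40_maps = 40 * shared_weights            # 1,000 weights (+ 40 biases)
connections_40_maps = 40 * connections_per_map   # 576,000 connections (+ 23,040)

mlp_weights = 784 * 100 + 100            # 78,500 input-hidden weights for a 100-unit MLP
print(weights_40_maps, connections_40_maps, mlp_weights)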

Learning image kernels https://en.wikipedia.org/wiki/Kernel_(image_processing) Image kernels have been designed and used for feature extraction in image processing (e.g. edge detection). However, we can learn multiple kernel functions (feature maps) by optimising the network cost function: automating feature engineering. MLP Lecture 7 Convolutional Networks 17
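As a concrete example of a designed kernel, a sketch applying a Sobel edge-detection kernel with SciPy (the image here is just a random stand-in):

import numpy as np
from scipy.signal import correlate2d

sobel = np.array([[-1, -2, -1],
                  [ 0,  0,  0],
                  [ 1,  2,  1]], dtype=float)   # hand-designed horizontal-edge kernel

image = np.random.rand(28, 28)                  # stand-in for a real image
edges = correlate2d(image, sobel, mode='valid')
print(edges.shape)                              # (26, 26)

In a convolutional layer the kernel values are learned by gradient descent rather than designed by hand.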

Convolutional Layer This type of feature map is often called a convolutional layer. We can write the feature map hidden unit equation as h_{i,j} = \mathrm{sigmoid}\Big( \sum_{k=1}^{m} \sum_{l=1}^{m} w_{k,l} \, x_{i+k,j+l} + b \Big), i.e. h = \mathrm{sigmoid}(w \star x + b), where \star is a cross-correlation and is closely related to a convolution. In signal processing a 2D convolution is written as H_{i,j} = \mathrm{sigmoid}\Big( \sum_{k=1}^{m} \sum_{l=1}^{m} v_{k,l} \, x_{i-k,j-l} + b \Big), i.e. H = \mathrm{sigmoid}(v * x + b). If we flip (reflect horizontally and vertically) w (cross-correlation) then we obtain v (convolution). MLP Lecture 7 Convolutional Networks 18

Convolution vs Cross-correlation Cross-correlation is often referred to as convolution in deep learning... This is not problematic, since the properties that convolution has but cross-correlation does not (commutativity and associativity) are rarely (if ever) required for deep learning. In machine learning the network learns whatever kernel orientation the implementation uses: if convolution is implemented with a flipped kernel, the network simply learns a correspondingly flipped kernel. So it is OK to use an efficient (flipped) implementation of convolution for convolutional layers. MLP Lecture 7 Convolutional Networks 19
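The relationship can be checked numerically (a sketch using SciPy on arbitrary arrays):

import numpy as np
from scipy.signal import correlate2d, convolve2d

x = np.random.rand(28, 28)
w = np.random.randn(5, 5)

# cross-correlation with w equals convolution with the flipped kernel
cross_corr = correlate2d(x, w, mode='valid')
conv_flipped = convolve2d(x, w[::-1, ::-1], mode='valid')
print(np.allclose(cross_corr, conv_flipped))    # True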

Pooling (subsampling) 12x12 Pooling Layer 24x24 Feature Map MLP Lecture 7 Convolutional Networks 20

Pooling Pooling or subsampling takes a feature map and reduces it in size, e.g. by transforming each 2x2 region to a single unit. Pooling functions: Max-pooling takes the maximum value of the units in the region (c.f. maxout); L_p-pooling takes the L_p norm of the units in the region, h = \big( \sum_{i \in \mathrm{region}} h_i^p \big)^{1/p}; Average- / Sum-pooling takes the average / sum value of the pool. Information reduction: pooling removes precise location information for a feature. Apply pooling to each feature map separately. MLP Lecture 7 Convolutional Networks 21
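Non-overlapping 2x2 max-pooling can be written compactly with a reshape. A minimal sketch (assumes the feature map dimensions are divisible by 2):

import numpy as np

def max_pool_2x2(fmap):
    # reduce a (2h, 2w) feature map to (h, w), taking the max of each 2x2 region
    h, w = fmap.shape[0] // 2, fmap.shape[1] // 2
    return fmap.reshape(h, 2, w, 2).max(axis=(1, 3))

fmap = np.random.rand(24, 24)        # one 24x24 feature map
print(max_pool_2x2(fmap).shape)      # (12, 12)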

Putting it together convolutional+pooling layer 3x12x12 Pooling Layers Input 28x28 3x24x24 Feature Maps MLP Lecture 7 Convolutional Networks 22

ConvNet Convolutional Network 3x12x12 Pooling Layers Hidden Layer Softmax Output Layer Simple ConvNet: Input 28x28 3x24x24 Feature Maps Convolutional layer with max-pooling Final fully connected hidden layer (no weight sharing) Softmax output layer With 20 feature maps and a final hidden layer of 100 hidden units: 20 x (5 x 5 + 1) + 20 x 12 x 12 x 100 + 100 + 100 x 10 + 10 = 289,630 weights MLP Lecture 7 Convolutional Networks 23
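The weight count on the slide can be reproduced term by term (a sketch mirroring the architecture described above):

conv = 20 * (5 * 5 + 1)             # 520: 20 shared 5x5 kernels plus biases
hidden = 20 * 12 * 12 * 100 + 100   # 288,100: pooled maps -> 100 hidden units (no sharing)
output = 100 * 10 + 10              # 1,010: hidden -> 10-way softmax
print(conv + hidden + output)       # 289,630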

Multiple input images If we have a colour image, each pixel is defined by 3 RGB values, so our input is in fact 3 images (one R, one G, and one B). If we want to stack convolutional layers, then the second layer needs to take input from all the feature maps in the first layer: local receptive fields across multiple input images. In a second convolutional layer (C2) on top of 20 12x12 feature maps, each unit will look at 20 x 5 x 5 input units (combining 20 receptive fields, each in the same spatial location). Typically we do not tie weights across feature maps, so each unit in C2 has 20 x 5 x 5 = 500 weights, plus a bias (assuming a 5x5 kernel size). MLP Lecture 7 Convolutional Networks 24
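A unit in such a second layer sums its receptive field over all input feature maps. A minimal multi-channel sketch (names and random inputs are illustrative):

import numpy as np

def multi_channel_unit_map(x, w, b):
    # x: (C, d, d) stack of input feature maps; w: (C, m, m) kernel, one slice per input map
    C, d, m = x.shape[0], x.shape[1], w.shape[1]
    out = np.zeros((d - m + 1, d - m + 1))
    for i in range(d - m + 1):
        for j in range(d - m + 1):
            # sum over all C input maps at the same spatial location
            out[i, j] = np.sum(w * x[:, i:i + m, j:j + m]) + b
    return out

x = np.random.rand(20, 12, 12)    # 20 pooled 12x12 feature maps from the layer below
w = np.random.randn(20, 5, 5)     # 20 x 5 x 5 = 500 weights per output unit, plus one bias
print(multi_channel_unit_map(x, w, 0.0).shape)   # (8, 8)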

Stacking convolutional layers 6x4x4 Pooling Layers 6x8x8 Feature Maps 3x12x12 Pooling Layers Input 28x28 3x24x24 Feature Maps MLP Lecture 7 Convolutional Networks 25

Example: LeNet5 (LeCun et al, 1997) MLP Lecture 7 Convolutional Networks 26

MNIST Results (1997) Fig. 9. Error rate on the test set (%) for various classification methods; [deslant] indicates that the... MLP Lecture 7 Convolutional Networks 27

Training Convolutional Networks Train convolutional networks with a straightforward but careful application of backprop / SGD. Exercise: prior to the next lecture, write down the gradients for the weights and biases of the feature maps in a convolutional network; remember to take account of weight sharing. Next lecture: implementing convolutional networks: how to deal with local receptive fields and tied weights, computing the required gradients... MLP Lecture 7 Convolutional Networks 28

Summary Convolutional networks include local receptive fields, weight sharing, and pooling, leading to: modelling of the spatial structure; translation invariance; local feature detection. Reading: Michael Nielsen, Neural Networks and Deep Learning (ch 6) http://neuralnetworksanddeeplearning.com/chap6.html Yann LeCun et al, Gradient-Based Learning Applied to Document Recognition, Proc IEEE, 1998. http://dx.doi.org/10.1109/5.726791 Ian Goodfellow, Yoshua Bengio & Aaron Courville, Deep Learning (ch 9) http://www.deeplearningbook.org/contents/convnets.html MLP Lecture 7 Convolutional Networks 29