Lecture 23 Deep Learning: Segmentation

1 Lecture 23 Deep Learning: Segmentation. COS 429: Computer Vision. Thanks: most of these slides shamelessly adapted from Stanford CS231n: Convolutional Neural Networks for Visual Recognition (Fei-Fei Li, Andrej Karpathy, Justin Johnson). COS429 : Andras Ferencz

3 Szegedy et al, Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, arXiv

4 Computer Vision Tasks
- Single object: Classification (CAT); Classification + Localization (CAT)
- Multiple objects: Object Detection (CAT, DOG, DUCK); Instance Segmentation (CAT, DOG, DUCK)

5 Simple Recipe for Classification + Localization
Step 2: Attach a new fully-connected regression head to the network
- Image -> Convolution and Pooling -> Final conv feature map
- Classification head: Fully-connected layers -> Class scores
- Regression head: Fully-connected layers -> Box coordinates

6 Sliding Window: Overfeat. Network input: 3 x 221 x 221. Larger image: 3 x 257 x. Classification scores: P(cat) = 0.5, 0.75

7 Sliding Window: Overfeat. Network input: 3 x 221 x 221. Larger image: 3 x 257 x. Classification scores: P(cat)

8 Sliding Window: Overfeat. Network input: 3 x 221 x 221. Larger image: 3 x 257 x. Classification scores: P(cat)

9 Sliding Window: Overfeat. Greedily merge boxes and scores (details in paper). Network input: 3 x 221 x 221. Larger image: 3 x 257 x. Classification score: P(cat) = 0.8

10 Sliding Window: Overfeat. In practice use many sliding window locations and multiple scales. Window positions + score maps; box regression outputs; final predictions. Sermanet et al, Integrated Recognition, Localization and Detection using Convolutional Networks, ICLR

11 Efficient Sliding Window: Overfeat
Efficient sliding window by converting fully-connected layers into convolutions
- Image: 3 x 221 x 221 -> Convolution + pooling -> Feature map: 1024 x 5 x 5
- Classification branch: 5 x 5 conv -> 4096 x 1 x 1 -> 1 x 1 conv -> 1024 x 1 x 1 -> 1 x 1 conv -> Class scores: 1000 x 1 x 1
- Regression branch: 5 x 5 conv -> 4096 x 1 x 1 -> 1 x 1 conv -> 1024 x 1 x 1 -> 1 x 1 conv -> Box coordinates: (4 x 1000) x 1 x 1

12 Efficient Sliding Window: Overfeat
- Training time: small image, 1 x 1 classifier output
- Test time: larger image, 2 x 2 classifier output, only extra compute at yellow regions
Sermanet et al, Integrated Recognition, Localization and Detection using Convolutional Networks, ICLR
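
The fully-connected-to-convolution trick can be checked numerically. The following is a minimal NumPy sketch, not Overfeat's actual code: the channel counts (8 and 16) are toy stand-ins for the 1024 and 4096 on the slide, the 6 x 6 test feature map is an assumed size chosen so that a 2 x 2 output appears, and `conv2d_valid` is a hypothetical helper written for this example.

```python
import numpy as np

def conv2d_valid(x, w):
    """Valid cross-correlation. x: (C, H, W), w: (K, C, kh, kw) -> (K, H-kh+1, W-kw+1)."""
    C, H, W = x.shape
    K, _, kh, kw = w.shape
    out = np.zeros((K, H - kh + 1, W - kw + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = x[:, i:i + kh, j:j + kw]
            # Dot product of every filter with the current patch.
            out[:, i, j] = np.tensordot(w, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out

rng = np.random.default_rng(0)
C, K = 8, 16  # toy channel counts (assumption; the slide uses 1024 and 4096)

# A fully-connected layer over a flattened C x 5 x 5 feature map ...
fc_weight = rng.standard_normal((K, C * 5 * 5))
# ... is the same set of numbers viewed as K filters of size C x 5 x 5.
conv_weight = fc_weight.reshape(K, C, 5, 5)

# Training-size input: the 5 x 5 feature map collapses to a 1 x 1 output.
train_map = rng.standard_normal((C, 5, 5))
fc_out = fc_weight @ train_map.reshape(-1)
conv_out = conv2d_valid(train_map, conv_weight)  # shape (K, 1, 1)

# Larger test-time input: a 6 x 6 feature map gives a 2 x 2 grid of outputs,
# one per sliding-window position, while sharing the overlapping computation.
test_map = rng.standard_normal((C, 6, 6))
test_out = conv2d_valid(test_map, conv_weight)   # shape (K, 2, 2)

print(np.allclose(conv_out[:, 0, 0], fc_out), test_out.shape)
```

Each of the four test-time outputs equals what the fully-connected layer would produce on the corresponding 5 x 5 window, which is why the convolutional form behaves like a sliding window.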

13 Computer Vision Tasks: Classification, Classification + Localization, Object Detection, Instance Segmentation

14 Region Proposals. Find blobby image regions that are likely to contain objects. Class-agnostic object detector. Look for blob-like regions.

15 Region Proposals: Selective Search. Bottom-up segmentation, merging regions at multiple scales. Convert regions to boxes. Uijlings et al, Selective Search for Object Recognition, IJCV

16 R-CNN. Girshick et al, Rich feature hierarchies for accurate object detection and semantic segmentation, CVPR 2014. Slide credit: Ross Girshick

17 Fast R-CNN
R-CNN Problem: Slow at test-time due to independent forward passes of the CNN
Solution: Share computation of convolutional layers between proposals for an image
R-CNN Problems:
- Post-hoc training: CNN not updated in response to final classifiers and regressors
- Complex training pipeline
Solution: Just train the whole system end-to-end all at once!

18 Fast R-CNN: Region of Interest Pooling
- Hi-res input image: 3 x 800 x 600 with region proposal -> Convolution and Pooling -> Hi-res conv features: C x H x W with region proposal
- Fully-connected layers expect low-res conv features: C x h x w
- RoI pooling yields RoI conv features: C x h x w for the region proposal, which feed the fully-connected layers
- Can back propagate similar to max pooling
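
A rough sketch of the RoI pooling step, assuming the region proposal has already been projected onto the conv feature map and is given in whole feature-map cells. The function name `roi_max_pool`, the end-exclusive box convention, and the bin-rounding scheme are choices made for this example rather than the exact Fast R-CNN implementation.

```python
import numpy as np

def roi_max_pool(features, roi, out_h, out_w):
    """Max-pool one region of a (C, H, W) feature map to a fixed (C, out_h, out_w) grid.

    roi = (y0, x0, y1, x1) in feature-map cells, end-exclusive (assumed convention).
    """
    y0, x0, y1, x1 = roi
    C = features.shape[0]
    # Bin edges that split the RoI into an out_h x out_w grid of roughly equal cells.
    ys = np.linspace(y0, y1, out_h + 1)
    xs = np.linspace(x0, x1, out_w + 1)
    out = np.empty((C, out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Round outward so every bin covers at least one feature-map cell.
            ya, yb = int(np.floor(ys[i])), int(np.ceil(ys[i + 1]))
            xa, xb = int(np.floor(xs[j])), int(np.ceil(xs[j + 1]))
            out[:, i, j] = features[:, ya:yb, xa:xb].max(axis=(1, 2))
    return out

# Toy feature map with C=2, H=W=8; a 6 x 6 proposal is pooled to 3 x 3.
features = np.arange(2 * 8 * 8, dtype=float).reshape(2, 8, 8)
pooled = roi_max_pool(features, roi=(1, 2, 7, 8), out_h=3, out_w=3)
print(pooled.shape)
```

Because each output cell is a max over a fixed set of inputs, the gradient flows back to the arg-max location in each bin, which is the sense in which the slide says it back propagates like max pooling.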

19 Faster R-CNN: Training
In the paper: Ugly pipeline
- Use alternating optimization to train RPN, then Fast R-CNN with RPN proposals, etc.
- More complex than it has to be
Since publication: Joint training! One network, four losses
- RPN classification (anchor good / bad)
- RPN regression (anchor -> proposal)
- Fast R-CNN classification (over classes)
- Fast R-CNN regression (proposal -> box)
Slide credit: Ross Girshick

20 Faster R-CNN: Results
                       R-CNN        Fast R-CNN   Faster R-CNN
Test time per image    50 seconds   2 seconds    0.2 seconds
(with proposals)
(Speedup)              1x           25x          250x
mAP (VOC 2007)

21 Object Detection State-of-the-art: ResNet Faster R-CNN + some extras. He et al, Deep Residual Learning for Image Recognition, arXiv

22 ImageNet Detection

23 YOLO: You Only Look Once. Detection as Regression
- Divide image into S x S grid
- Within each grid cell predict B boxes (4 coordinates + confidence each) and class scores (C numbers)
- Regression from image to 7 x 7 x (5 * B + C) tensor
- Direct prediction using a CNN
Redmon et al, You Only Look Once: Unified, Real-Time Object Detection, arXiv
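
The size of the YOLO output tensor follows directly from the grid and per-cell predictions. In this small sketch S = 7 comes from the slide, while B = 2 and C = 20 are illustrative values assumed here; the memory layout used to split the tensor (boxes first, then class scores) is also an assumption for the example.

```python
import numpy as np

S = 7    # grid size, from the slide
B = 2    # boxes per cell (assumed for illustration)
C = 20   # number of classes (assumed for illustration)

per_cell = 5 * B + C            # each box: 4 coordinates + 1 confidence
output_shape = (S, S, per_cell)
total_outputs = S * S * per_cell

# A stand-in for the network's regression output.
pred = np.random.default_rng(0).random(output_shape)

# Split each cell's vector into its boxes and class scores (assumed layout).
boxes = pred[..., :5 * B].reshape(S, S, B, 5)
class_scores = pred[..., 5 * B:]

print(output_shape, total_outputs, boxes.shape, class_scores.shape)
```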

24 YOLO: You Only Look Once. Detection as Regression. Faster than Faster R-CNN, but not as good. Redmon et al, You Only Look Once: Unified, Real-Time Object Detection, arXiv

25 Computer Vision Tasks
- Single object: Classification (CAT); Classification + Localization (CAT)
- Multiple objects: Object Detection (CAT, DOG, DUCK); Segmentation (CAT, DOG, DUCK)

26 Today: Classification, Classification + Localization, Object Detection, Segmentation (today)

27 Semantic Segmentation. Label every pixel! Don't differentiate instances (cows). Classic computer vision problem. Figure credit: Shotton et al, TextonBoost for Image Understanding: Multi-Class Object Recognition and Segmentation by Jointly Modeling Texture, Layout, and Context, IJCV

28 Instance Segmentation. Detect instances, give category, label pixels: simultaneous detection and segmentation (SDS). Lots of recent work (MS-COCO). Figure credit: Dai et al, Instance-aware Semantic Segmentation via Multi-task Network Cascades, arXiv

29 Semantic Segmentation. Extract patch.

30 Semantic Segmentation. Extract patch. Run through a CNN.

31 Semantic Segmentation. Extract patch. Run through a CNN. Classify center pixel (COW).

32 Semantic Segmentation. Extract patch. Run through a CNN. Classify center pixel (COW). Repeat for every pixel.

33 Semantic Segmentation. Run fully convolutional network to get all pixels at once. Smaller output due to pooling.

34 Semantic Segmentation: Multi-Scale. Farabet et al, Learning Hierarchical Features for Scene Labeling, TPAMI

35 Semantic Segmentation: Multi-Scale. Resize image to multiple scales. Farabet et al, Learning Hierarchical Features for Scene Labeling, TPAMI

36 Semantic Segmentation: Multi-Scale. Resize image to multiple scales. Run one CNN per scale. Farabet et al, Learning Hierarchical Features for Scene Labeling, TPAMI

37 Semantic Segmentation: Multi-Scale. Resize image to multiple scales. Run one CNN per scale. Upscale outputs and concatenate. Farabet et al, Learning Hierarchical Features for Scene Labeling, TPAMI

38 Semantic Segmentation: Multi-Scale. Resize image to multiple scales. Run one CNN per scale. Upscale outputs and concatenate. External bottom-up segmentation. Farabet et al, Learning Hierarchical Features for Scene Labeling, TPAMI

39 Semantic Segmentation: Multi-Scale. Resize image to multiple scales. Run one CNN per scale. Upscale outputs and concatenate. External bottom-up segmentation. Combine everything for final outputs. Farabet et al, Learning Hierarchical Features for Scene Labeling, TPAMI

40 Semantic Segmentation: Refinement. Apply CNN once to get labels. Pinheiro and Collobert, Recurrent Convolutional Neural Networks for Scene Labeling, ICML

41 Semantic Segmentation: Refinement. Apply CNN once to get labels. Apply AGAIN to refine labels. Pinheiro and Collobert, Recurrent Convolutional Neural Networks for Scene Labeling, ICML

42 Semantic Segmentation: Refinement. Apply CNN once to get labels. Apply AGAIN to refine labels. And again! Same CNN weights: recurrent convolutional network. More iterations improve results. Pinheiro and Collobert, Recurrent Convolutional Neural Networks for Scene Labeling, ICML

43 Semantic Segmentation: Upsampling. Learnable upsampling! Long, Shelhamer, and Darrell, Fully Convolutional Networks for Semantic Segmentation, CVPR

44 Semantic Segmentation: Upsampling. Long, Shelhamer, and Darrell, Fully Convolutional Networks for Semantic Segmentation, CVPR

45 Semantic Segmentation: Upsampling. Skip connections. Long, Shelhamer, and Darrell, Fully Convolutional Networks for Semantic Segmentation, CVPR

46 Semantic Segmentation: Upsampling. Skip connections = better results. Long, Shelhamer, and Darrell, Fully Convolutional Networks for Semantic Segmentation, CVPR

47 Learnable Upsampling: Deconvolution. Typical 3 x 3 convolution, stride 1 pad 1. Input: 4 x 4. Output: 4 x 4.

48 Learnable Upsampling: Deconvolution. Typical 3 x 3 convolution, stride 1 pad 1. Dot product between filter and input. Input: 4 x 4. Output: 4 x 4.

49 Learnable Upsampling: Deconvolution. Typical 3 x 3 convolution, stride 1 pad 1. Dot product between filter and input. Input: 4 x 4. Output: 4 x 4.

50 Learnable Upsampling: Deconvolution. Typical 3 x 3 convolution, stride 2 pad 1. Input: 4 x 4. Output: 2 x 2.

51 Learnable Upsampling: Deconvolution. Typical 3 x 3 convolution, stride 2 pad 1. Dot product between filter and input. Input: 4 x 4. Output: 2 x 2.

52 Learnable Upsampling: Deconvolution. Typical 3 x 3 convolution, stride 2 pad 1. Dot product between filter and input. Input: 4 x 4. Output: 2 x 2.
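
The output sizes on these slides follow from the usual formula floor((n + 2 * pad - k) / stride) + 1. A minimal single-channel NumPy sketch, assuming square inputs and filters; `conv_output_size` and `conv2d` are helper names made up for this example.

```python
import numpy as np

def conv_output_size(n, k, stride, pad):
    """Spatial output size of a convolution on an n x n input with a k x k filter."""
    return (n + 2 * pad - k) // stride + 1

def conv2d(x, w, stride, pad):
    """Single-channel convolution (cross-correlation) with zero padding."""
    k = w.shape[0]
    xp = np.pad(x, pad)  # zero-pad all four sides
    n_out = conv_output_size(x.shape[0], k, stride, pad)
    out = np.zeros((n_out, n_out))
    for i in range(n_out):
        for j in range(n_out):
            patch = xp[i * stride:i * stride + k, j * stride:j * stride + k]
            out[i, j] = np.sum(patch * w)  # dot product between filter and input
    return out

x = np.ones((4, 4))
w = np.ones((3, 3))
same_size = conv2d(x, w, stride=1, pad=1)  # 4 x 4 -> 4 x 4
halved = conv2d(x, w, stride=2, pad=1)     # 4 x 4 -> 2 x 2
print(same_size.shape, halved.shape)
```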

53 Learnable Upsampling: Deconvolution. 3 x 3 deconvolution, stride 2 pad 1. Input: 2 x 2. Output: 4 x 4.

54 Learnable Upsampling: Deconvolution. 3 x 3 deconvolution, stride 2 pad 1. Input gives weight for filter. Input: 2 x 2. Output: 4 x 4.

55 Learnable Upsampling: Deconvolution. 3 x 3 deconvolution, stride 2 pad 1. Input: 2 x 2. Output: 4 x 4.
- Input gives weight for filter
- Sum where output overlaps
- Same as backward pass for normal convolution!
- Deconvolution is a bad name: already defined as inverse of convolution
- Better names: convolution transpose, backward strided convolution, 1/2 strided convolution, upconvolution
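
The "input gives weight for filter, sum where output overlaps" recipe can be written out directly. This is a single-channel NumPy sketch under stated assumptions: `conv_transpose2d` and `conv2d` are helper names invented here, and the `output_pad` argument is an assumption added because a 3 x 3, stride 2, pad 1 transposed convolution of a 2 x 2 input yields 3 x 3 unless one extra row and column are kept to reach the 4 x 4 shown on the slide.

```python
import numpy as np

def conv2d(x, w, stride, pad):
    """Single-channel strided convolution with zero padding (square input and filter)."""
    k = w.shape[0]
    xp = np.pad(x, pad)
    n_out = (x.shape[0] + 2 * pad - k) // stride + 1
    out = np.zeros((n_out, n_out))
    for i in range(n_out):
        for j in range(n_out):
            out[i, j] = np.sum(xp[i * stride:i * stride + k, j * stride:j * stride + k] * w)
    return out

def conv_transpose2d(x, w, stride, pad, output_pad=0):
    """Transposed convolution: each input value scales a copy of the filter,
    copies are placed `stride` apart, and overlapping contributions are summed."""
    n, k = x.shape[0], w.shape[0]
    full = (n - 1) * stride + k
    canvas = np.zeros((full + output_pad, full + output_pad))
    for i in range(n):
        for j in range(n):
            canvas[i * stride:i * stride + k, j * stride:j * stride + k] += x[i, j] * w
    end = full + output_pad - pad
    return canvas[pad:end, pad:end]  # crop the padding border

x = np.array([[1.0, 2.0], [3.0, 4.0]])
w = np.ones((3, 3))
up = conv_transpose2d(x, w, stride=2, pad=1, output_pad=1)  # 2 x 2 -> 4 x 4

# "Same as backward pass for normal convolution": the transposed convolution is
# the adjoint of the strided convolution, so <conv(a), b> == <a, conv_transpose(b)>.
rng = np.random.default_rng(0)
a, b, f = rng.standard_normal((4, 4)), rng.standard_normal((2, 2)), rng.standard_normal((3, 3))
lhs = np.sum(conv2d(a, f, stride=2, pad=1) * b)
rhs = np.sum(a * conv_transpose2d(b, f, stride=2, pad=1, output_pad=1))
print(up.shape, np.isclose(lhs, rhs))
```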

56 Semantic Segmentation: Upsampling. Normal VGG followed by upside down VGG. 6 days of training on Titan X. Noh et al, Learning Deconvolution Network for Semantic Segmentation, ICCV

57 Instance Segmentation. Detect instances, give category, label pixels: simultaneous detection and segmentation (SDS). Lots of recent work (MS-COCO). Figure credit: Dai et al, Instance-aware Semantic Segmentation via Multi-task Network Cascades, arXiv

58 Instance Segmentation. Similar to R-CNN, but with segments. Hariharan et al, Simultaneous Detection and Segmentation, ECCV

59 Instance Segmentation. Similar to R-CNN, but with segments. External segment proposals. Hariharan et al, Simultaneous Detection and Segmentation, ECCV

60 Instance Segmentation. Similar to R-CNN. External segment proposals. Hariharan et al, Simultaneous Detection and Segmentation, ECCV

61 Instance Segmentation. Similar to R-CNN, but with segments. External segment proposals. Mask out background with mean image. Hariharan et al, Simultaneous Detection and Segmentation, ECCV

62 Instance Segmentation. Similar to R-CNN, but with segments. External segment proposals. Mask out background with mean image. Hariharan et al, Simultaneous Detection and Segmentation, ECCV

63 Instance Segmentation. Similar to R-CNN, but with segments. External segment proposals. Mask out background with mean image. Hariharan et al, Simultaneous Detection and Segmentation, ECCV
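
The "mask out background with mean image" step can be illustrated in a couple of lines. This is only a sketch of the idea as stated on the slide, assuming a boolean segment mask and a mean image of the same size as the crop; `mask_background` is a hypothetical name and the arrays are toy data.

```python
import numpy as np

def mask_background(crop, mask, mean_image):
    """Keep pixels inside the segment proposal; replace the rest with the mean image.

    crop, mean_image: (H, W, 3) arrays; mask: (H, W) boolean, True on the segment.
    """
    return np.where(mask[..., None], crop, mean_image)

rng = np.random.default_rng(0)
crop = rng.random((4, 4, 3))
mean_image = np.full((4, 4, 3), 0.5)   # toy stand-in for a dataset mean image
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True                  # toy segment proposal

masked = mask_background(crop, mask, mean_image)
print(masked.shape)
```

After mean subtraction the background pixels become zero, so the network fed with the masked crop sees only the proposed segment.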

64 Instance Segmentation: Cascades. Similar to Faster R-CNN. Won COCO 2015 challenge (with ResNet). Dai et al, Instance-aware Semantic Segmentation via Multi-task Network Cascades, arXiv

65 Instance Segmentation: Cascades. Similar to Faster R-CNN. Region proposal network (RPN). Won COCO 2015 challenge (with ResNet). Dai et al, Instance-aware Semantic Segmentation via Multi-task Network Cascades, arXiv

66 Instance Segmentation: Cascades. Similar to Faster R-CNN
- Region proposal network (RPN)
- Reshape boxes to fixed size, figure / ground logistic regression
- Mask out background, predict object class
- Learn entire model end-to-end!
Won COCO 2015 challenge (with ResNet). Dai et al, Instance-aware Semantic Segmentation via Multi-task Network Cascades, arXiv

67 Instance Segmentation: Cascades. Predictions vs. ground truth. Dai et al, Instance-aware Semantic Segmentation via Multi-task Network Cascades, arXiv 2015

68 Segmentation Overview
Semantic segmentation
- Classify all pixels
- Fully convolutional models, downsample then upsample
- Learnable upsampling: fractionally strided convolution
- Skip connections can help
Instance Segmentation
- Detect instance, generate mask
- Similar pipelines to object detection

69 Quick overview of Other Topics

70 Recurrent Neural Networks (RNN). Vanilla Neural Networks

71 Recurrent Neural Networks (RNN). e.g. Image Captioning: image -> sequence of words

72 Recurrent Neural Networks (RNN). e.g. Sentiment Classification: sequence of words -> sentiment

73 Recurrent Neural Networks (RNN). e.g. Machine Translation: seq of words -> seq of words

74 Recurrent Neural Networks (RNN). e.g. Video classification on frame level

75 RNN: input x, output y

76 Character RNN during training: train more, train more, train more

78 Generated C code

79 Searching for interpretable cells: quote detection cell

80 Sequential Processing of fixed inputs. Multiple Object Recognition with Visual Attention, Ba et al.

81 Sequential Processing of fixed outputs. DRAW: A Recurrent Neural Network For Image Generation, Gregor et al.

82 Image Captioning
- Explain Images with Multimodal Recurrent Neural Networks, Mao et al.
- Deep Visual-Semantic Alignments for Generating Image Descriptions, Karpathy and Fei-Fei
- Show and Tell: A Neural Image Caption Generator, Vinyals et al.
- Long-term Recurrent Convolutional Networks for Visual Recognition and Description, Donahue et al.
- Learning a Recurrent Visual Representation for Image Caption Generation, Chen and Zitnick

83 Recurrent Neural Network + Convolutional Neural Network

84 Soft Attention for Captioning
- Image: H x W x 3 -> CNN -> Features: L x D
- h0 -> a1: distribution over L locations
- a1 and features -> z1: weighted combination of features (weighted features: D)
- z1 and y1 (first word) -> h1 -> a2 and d1 (distribution over vocab)
- z2 and y2 -> h2
Xu et al, Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, ICML 2015

85 Soft Attention for Captioning. Soft attention vs. hard attention. Xu et al, Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, ICML

86 Soft Attention for Captioning. Xu et al, Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, ICML
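
One soft-attention step, as a minimal NumPy sketch. It assumes the unnormalized location scores are already available (in the paper they are computed from the hidden state and the features; that part is omitted here), and the names `softmax` and `soft_attention` as well as the sizes L = 6 and D = 4 are choices made for the example.

```python
import numpy as np

def softmax(s):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(s - s.max())
    return e / e.sum()

def soft_attention(features, scores):
    """features: (L, D) grid features; scores: (L,) unnormalized location scores.

    Returns a: (L,) distribution over locations and z: (D,) weighted features.
    """
    a = softmax(scores)
    z = a @ features  # weighted combination of the L feature vectors
    return a, z

rng = np.random.default_rng(0)
L, D = 6, 4                        # toy sizes (assumption)
features = rng.standard_normal((L, D))
scores = rng.standard_normal(L)    # stand-in for scores derived from the hidden state

a, z = soft_attention(features, scores)
print(a.sum(), z.shape)
```

Because z is a smooth function of the scores, gradients flow through the attention weights, which is what distinguishes soft attention from hard attention that samples a single location.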

87 Spatial Transformer Networks. Can we make this function differentiable? Input image: H x W x 3. Box coordinates: (xc, yc, w, h). Cropped and rescaled image: X x Y x 3. Jaderberg et al, Spatial Transformer Networks, NIPS

88 Spatial Transformer Networks. Can we make this function differentiable?
- Input image: H x W x 3. Box coordinates: (xc, yc, w, h). Cropped and rescaled image: X x Y x 3
- Idea: function mapping pixel coordinates (xt, yt) of output to pixel coordinates (xs, ys) of input
- Repeat for all pixels in output to get a sampling grid
- Then use bilinear interpolation to compute output
- Network attends to input by predicting θ
Jaderberg et al, Spatial Transformer Networks, NIPS

89 Spatial Transformer Networks
- Input: full image. Output: region of interest from input
- A small localization network predicts transform θ
- Grid generator uses θ to compute sampling grid
- Sampler uses bilinear interpolation to produce output

90 Spatial Transformer Networks. Insert spatial transformers into a classification network and it learns to attend and transform the input. Differentiable attention / transformation module.
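
The grid generator and sampler can be sketched in NumPy as below. This is an illustration under stated assumptions rather than the paper's implementation: coordinates are normalized to [-1, 1], the transform is a 2 x 3 affine matrix (here called `theta`), out-of-range samples are clamped to the border, and the function names are made up for the example. In a real spatial transformer, `theta` would come from the localization network.

```python
import numpy as np

def affine_grid(theta, out_h, out_w):
    """Map every output pixel (xt, yt) to a source location (xs, ys).

    theta: (2, 3) affine matrix acting on normalized coordinates in [-1, 1].
    """
    yt, xt = np.meshgrid(np.linspace(-1, 1, out_h), np.linspace(-1, 1, out_w), indexing="ij")
    coords = np.stack([xt.ravel(), yt.ravel(), np.ones(out_h * out_w)])  # (3, N)
    xs, ys = theta @ coords                                              # each (N,)
    return xs.reshape(out_h, out_w), ys.reshape(out_h, out_w)

def bilinear_sample(image, xs, ys):
    """Sample image (H, W, C) at normalized locations using bilinear interpolation."""
    H, W = image.shape[:2]
    x = np.clip((xs + 1) * (W - 1) / 2, 0, W - 1)  # normalized -> pixel coordinates
    y = np.clip((ys + 1) * (H - 1) / 2, 0, H - 1)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1, y1 = np.minimum(x0 + 1, W - 1), np.minimum(y0 + 1, H - 1)
    wx, wy = (x - x0)[..., None], (y - y0)[..., None]
    top = (1 - wx) * image[y0, x0] + wx * image[y0, x1]
    bottom = (1 - wx) * image[y1, x0] + wx * image[y1, x1]
    return (1 - wy) * top + wy * bottom

image = np.arange(25, dtype=float).reshape(5, 5, 1)

# Identity transform reproduces the input.
identity = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
same = bilinear_sample(image, *affine_grid(identity, 5, 5))

# A box (xc, yc, w, h) = (0, 0, 1, 1) in normalized units: a centered half-size crop.
crop_theta = np.array([[0.5, 0.0, 0.0], [0.0, 0.5, 0.0]])
crop = bilinear_sample(image, *affine_grid(crop_theta, 3, 3))
print(same.shape, crop.shape)
```

Every operation here is differentiable with respect to `theta` (through the interpolation weights), which is the property the slides ask for.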


More information

Automatic understanding of the visual world

Automatic understanding of the visual world Automatic understanding of the visual world 1 Machine visual perception Artificial capacity to see, understand the visual world Object recognition Image or sequence of images Action recognition 2 Machine

More information

DSNet: An Efficient CNN for Road Scene Segmentation

DSNet: An Efficient CNN for Road Scene Segmentation DSNet: An Efficient CNN for Road Scene Segmentation Ping-Rong Chen 1 Hsueh-Ming Hang 1 1 National Chiao Tung University {james50120.ee05g, hmhang}@nctu.edu.tw Sheng-Wei Chan 2 Jing-Jhih Lin 2 2 Industrial

More information

Neural Networks The New Moore s Law

Neural Networks The New Moore s Law Neural Networks The New Moore s Law Chris Rowen, PhD, FIEEE CEO Cognite Ventures December 216 Outline Moore s Law Revisited: Efficiency Drives Productivity Embedded Neural Network Product Segments Efficiency

More information

Can you tell a face from a HEVC bitstream?

Can you tell a face from a HEVC bitstream? Can you tell a face from a HEVC bitstream? Saeed Ranjbar Alvar, Hyomin Choi and Ivan V. Bajić School of Engineering Science, Simon Fraser University, Burnaby, BC, Canada Email: {saeedr,chyomin, ibajic}@sfu.ca

More information

arxiv: v2 [cs.cv] 8 Mar 2018

arxiv: v2 [cs.cv] 8 Mar 2018 Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation Liang-Chieh Chen Yukun Zhu George Papandreou Florian Schroff Hartwig Adam Google Inc. {lcchen, yukun, gpapan, fschroff,

More information

Evaluation of Image Segmentation Based on Histograms

Evaluation of Image Segmentation Based on Histograms Evaluation of Image Segmentation Based on Histograms Andrej FOGELTON Slovak University of Technology in Bratislava Faculty of Informatics and Information Technologies Ilkovičova 3, 842 16 Bratislava, Slovakia

More information

Hand Gesture Recognition by Means of Region- Based Convolutional Neural Networks

Hand Gesture Recognition by Means of Region- Based Convolutional Neural Networks Contemporary Engineering Sciences, Vol. 10, 2017, no. 27, 1329-1342 HIKARI Ltd, www.m-hikari.com https://doi.org/10.12988/ces.2017.710154 Hand Gesture Recognition by Means of Region- Based Convolutional

More information

Neural Network Part 4: Recurrent Neural Networks

Neural Network Part 4: Recurrent Neural Networks Neural Network Part 4: Recurrent Neural Networks Yingyu Liang Computer Sciences 760 Fall 2017 http://pages.cs.wisc.edu/~yliang/cs760/ Some of the slides in these lectures have been adapted/borrowed from

More information

arxiv: v3 [cs.cv] 22 Aug 2018

arxiv: v3 [cs.cv] 22 Aug 2018 Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, and Hartwig Adam ariv:1802.02611v3 [cs.cv] 22 Aug 2018

More information

Driving Using End-to-End Deep Learning

Driving Using End-to-End Deep Learning Driving Using End-to-End Deep Learning Farzain Majeed farza@knights.ucf.edu Kishan Athrey kishan.athrey@knights.ucf.edu Dr. Mubarak Shah shah@crcv.ucf.edu Abstract This work explores the problem of autonomously

More information

TRANSFORMING PHOTOS TO COMICS USING CONVOLUTIONAL NEURAL NETWORKS. Tsinghua University, China Cardiff University, UK

TRANSFORMING PHOTOS TO COMICS USING CONVOLUTIONAL NEURAL NETWORKS. Tsinghua University, China Cardiff University, UK TRANSFORMING PHOTOS TO COMICS USING CONVOUTIONA NEURA NETWORKS Yang Chen Yu-Kun ai Yong-Jin iu Tsinghua University, China Cardiff University, UK ABSTRACT In this paper, inspired by Gatys s recent work,

More information

arxiv: v3 [cs.cv] 5 Dec 2017

arxiv: v3 [cs.cv] 5 Dec 2017 Rethinking Atrous Convolution for Semantic Image Segmentation Liang-Chieh Chen George Papandreou Florian Schroff Hartwig Adam Google Inc. {lcchen, gpapan, fschroff, hadam}@google.com arxiv:1706.05587v3

More information

Multispectral Pedestrian Detection using Deep Fusion Convolutional Neural Networks

Multispectral Pedestrian Detection using Deep Fusion Convolutional Neural Networks Multispectral Pedestrian Detection using Deep Fusion Convolutional Neural Networks Jo rg Wagner1,2, Volker Fischer1, Michael Herman1 and Sven Behnke2 1- Robert Bosch GmbH - 70442 Stuttgart - Germany 2-

More information

Understanding Convolution for Semantic Segmentation

Understanding Convolution for Semantic Segmentation Understanding Convolution for Semantic Segmentation Panqu Wang 1, Pengfei Chen 1, Ye Yuan 2, Ding Liu 3, Zehua Huang 1, Xiaodi Hou 1, Garrison Cottrell 4 1 TuSimple, 2 Carnegie Mellon University, 3 University

More information

Impact of Automatic Feature Extraction in Deep Learning Architecture

Impact of Automatic Feature Extraction in Deep Learning Architecture Impact of Automatic Feature Extraction in Deep Learning Architecture Fatma Shaheen, Brijesh Verma and Md Asafuddoula Centre for Intelligent Systems Central Queensland University, Brisbane, Australia {f.shaheen,

More information

University of Bristol - Explore Bristol Research. Peer reviewed version. Link to publication record in Explore Bristol Research PDF-document

University of Bristol - Explore Bristol Research. Peer reviewed version. Link to publication record in Explore Bristol Research PDF-document Hepburn, A., McConville, R., & Santos-Rodriguez, R. (2017). Album cover generation from genre tags. Paper presented at 10th International Workshop on Machine Learning and Music, Barcelona, Spain. Peer

More information

Video Object Segmentation with Re-identification

Video Object Segmentation with Re-identification Video Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen Change Loy, Xiaoou Tang The Chinese University of Hong Kong, SenseTime

More information

Understanding Convolution for Semantic Segmentation

Understanding Convolution for Semantic Segmentation Understanding Convolution for Semantic Segmentation Panqu Wang 1, Pengfei Chen 1, Ye Yuan 2, Ding Liu 3, Zehua Huang 1, Xiaodi Hou 1, Garrison Cottrell 4 1 TuSimple, 2 Carnegie Mellon University, 3 University

More information

Mobile Cognitive Indoor Assistive Navigation for the Visually Impaired

Mobile Cognitive Indoor Assistive Navigation for the Visually Impaired 1 Mobile Cognitive Indoor Assistive Navigation for the Visually Impaired Bing Li 1, Manjekar Budhai 2, Bowen Xiao 3, Liang Yang 1, Jizhong Xiao 1 1 Department of Electrical Engineering, The City College,

More information

arxiv: v1 [cs.cv] 19 Jun 2017

arxiv: v1 [cs.cv] 19 Jun 2017 Satellite Imagery Feature Detection using Deep Convolutional Neural Network: A Kaggle Competition Vladimir Iglovikov True Accord iglovikov@gmail.com Sergey Mushinskiy Open Data Science cepera.ang@gmail.com

More information

An energy-efficient coarse grained spatial architecture for convolutional neural networks AlexNet

An energy-efficient coarse grained spatial architecture for convolutional neural networks AlexNet LETTER IEICE Electronics Express, Vol.14, No.15, 1 12 An energy-efficient coarse grained spatial architecture for convolutional neural networks AlexNet Boya Zhao a), Mingjiang Wang b), and Ming Liu Harbin

More information

Automatic tumor segmentation in breast ultrasound images using a dilated fully convolutional network combined with an active contour model

Automatic tumor segmentation in breast ultrasound images using a dilated fully convolutional network combined with an active contour model Automatic tumor segmentation in breast ultrasound images using a dilated fully convolutional network combined with an active contour model Yuzhou Hu Departmentof Electronic Engineering, Fudan University,

More information

Tracking transmission of details in paintings

Tracking transmission of details in paintings Tracking transmission of details in paintings Benoit Seguin benoit.seguin@epfl.ch Isabella di Lenardo isabella.dilenardo@epfl.ch Frédéric Kaplan frederic.kaplan@epfl.ch Introduction In previous articles

More information

Wadehra Kartik, Kathpalia Mukul, Bahl Vasudha, International Journal of Advance Research, Ideas and Innovations in Technology

Wadehra Kartik, Kathpalia Mukul, Bahl Vasudha, International Journal of Advance Research, Ideas and Innovations in Technology ISSN: 2454-132X Impact factor: 4.295 (Volume 4, Issue 1) Available online at www.ijariit.com Hand Detection and Gesture Recognition in Real-Time Using Haar-Classification and Convolutional Neural Networks

More information

Improving a real-time object detector with compact temporal information

Improving a real-time object detector with compact temporal information Improving a real-time object detector with compact temporal information Martin Ahrnbom Lund University martin.ahrnbom@math.lth.se Morten Bornø Jensen Aalborg University mboj@create.aau.dk Håkan Ardö Lund

More information

Learning to Predict Indoor Illumination from a Single Image. Chih-Hui Ho

Learning to Predict Indoor Illumination from a Single Image. Chih-Hui Ho Learning to Predict Indoor Illumination from a Single Image Chih-Hui Ho 1 Outline Introduction Method Overview LDR Panorama Light Source Detection Panorama Recentering Warp Learning From LDR Panoramas

More information

Fully Convolutional Network with dilated convolutions for Handwritten

Fully Convolutional Network with dilated convolutions for Handwritten International Journal on Document Analysis and Recognition manuscript No. (will be inserted by the editor) Fully Convolutional Network with dilated convolutions for Handwritten text line segmentation Guillaume

More information

RAPID: Rating Pictorial Aesthetics using Deep Learning

RAPID: Rating Pictorial Aesthetics using Deep Learning RAPID: Rating Pictorial Aesthetics using Deep Learning Xin Lu 1 Zhe Lin 2 Hailin Jin 2 Jianchao Yang 2 James Z. Wang 1 1 The Pennsylvania State University 2 Adobe Research {xinlu, jwang}@psu.edu, {zlin,

More information

Compositing-aware Image Search

Compositing-aware Image Search Compositing-aware Image Search Hengshuang Zhao 1, Xiaohui Shen 2, Zhe Lin 3, Kalyan Sunkavalli 3, Brian Price 3, Jiaya Jia 1,4 1 The Chinese University of Hong Kong, 2 ByteDance AI Lab, 3 Adobe Research,

More information

arxiv: v1 [cs.cv] 28 Nov 2017 Abstract

arxiv: v1 [cs.cv] 28 Nov 2017 Abstract Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks Zhaofan Qiu, Ting Yao, and Tao Mei University of Science and Technology of China, Hefei, China Microsoft Research, Beijing, China

More information

Scene Text Eraser. arxiv: v1 [cs.cv] 8 May 2017

Scene Text Eraser. arxiv: v1 [cs.cv] 8 May 2017 Scene Text Eraser Toshiki Nakamura, Anna Zhu, Keiji Yanai,and Seiichi Uchida Human Interface Laboratory, Kyushu University, Fukuoka, Japan. Email: {nakamura,uchida}@human.ait.kyushu-u.ac.jp School of Computer,

More information

arxiv: v1 [cs.cv] 9 Nov 2015 Abstract

arxiv: v1 [cs.cv] 9 Nov 2015 Abstract Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding Alex Kendall Vijay Badrinarayanan University of Cambridge agk34, vb292, rc10001 @cam.ac.uk

More information

Automatic point-of-interest image cropping via ensembled convolutionalization

Automatic point-of-interest image cropping via ensembled convolutionalization 1 Automatic point-of-interest image cropping via ensembled convolutionalization Andrea Asperti and Pietro Battilana University of Bologna Department of informatics: Science and Engineering (DISI) Abstract

More information

Lesson 08. Convolutional Neural Network. Ing. Marek Hrúz, Ph.D. Katedra Kybernetiky Fakulta aplikovaných věd Západočeská univerzita v Plzni.

Lesson 08. Convolutional Neural Network. Ing. Marek Hrúz, Ph.D. Katedra Kybernetiky Fakulta aplikovaných věd Západočeská univerzita v Plzni. Lesson 08 Convolutional Neural Network Ing. Marek Hrúz, Ph.D. Katedra Kybernetiky Fakulta aplikovaných věd Západočeská univerzita v Plzni Lesson 08 Convolution we will consider 2D convolution the result

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning Deep Learning Barnabás Póczos Credits Many of the pictures, results, and other materials are taken from: Ruslan Salakhutdinov Joshua Bengio Geoffrey Hinton Yann LeCun 2

More information

arxiv: v3 [cs.cv] 18 Dec 2018

arxiv: v3 [cs.cv] 18 Dec 2018 Video Colorization using CNNs and Keyframes extraction: An application in saving bandwidth Ankur Singh 1 Anurag Chanani 2 Harish Karnick 3 arxiv:1812.03858v3 [cs.cv] 18 Dec 2018 Abstract In this paper,

More information

GPU ACCELERATED DEEP LEARNING WITH CUDNN

GPU ACCELERATED DEEP LEARNING WITH CUDNN GPU ACCELERATED DEEP LEARNING WITH CUDNN Larry Brown Ph.D. March 2015 AGENDA 1 Introducing cudnn and GPUs 2 Deep Learning Context 3 cudnn V2 4 Using cudnn 2 Introducing cudnn and GPUs 3 HOW GPU ACCELERATION

More information

Research on Hand Gesture Recognition Using Convolutional Neural Network

Research on Hand Gesture Recognition Using Convolutional Neural Network Research on Hand Gesture Recognition Using Convolutional Neural Network Tian Zhaoyang a, Cheng Lee Lung b a Department of Electronic Engineering, City University of Hong Kong, Hong Kong, China E-mail address:

More information

Pelee: A Real-Time Object Detection System on Mobile Devices

Pelee: A Real-Time Object Detection System on Mobile Devices Pelee: A Real-Time Object Detection System on Mobile Devices Robert J. Wang, Xiang Li, Shuang Ao & Charles X. Ling Department of Computer Science University of Western Ontario London, Ontario, Canada,

More information

Object Recognition with and without Objects

Object Recognition with and without Objects Object Recognition with and without Objects Zhuotun Zhu, Lingxi Xie, Alan Yuille Johns Hopkins University, Baltimore, MD, USA {zhuotun, 198808xc, alan.l.yuille}@gmail.com Abstract While recent deep neural

More information

EE-559 Deep learning 7.2. Networks for image classification

EE-559 Deep learning 7.2. Networks for image classification EE-559 Deep learning 7.2. Networks for image classification François Fleuret https://fleuret.org/ee559/ Fri Nov 16 22:58:34 UTC 2018 ÉCOLE POLYTECHNIQUE FÉDÉRALE DE LAUSANNE Image classification, standard

More information

On the Robustness of Deep Neural Networks

On the Robustness of Deep Neural Networks On the Robustness of Deep Neural Networks Manuel Günther, Andras Rozsa, and Terrance E. Boult Vision and Security Technology Lab, University of Colorado Colorado Springs {mgunther,arozsa,tboult}@vast.uccs.edu

More information

Going Deeper into First-Person Activity Recognition

Going Deeper into First-Person Activity Recognition Going Deeper into First-Person Activity Recognition Minghuang Ma, Haoqi Fan and Kris M. Kitani Carnegie Mellon University Pittsburgh, PA 15213, USA minghuam@andrew.cmu.edu haoqif@andrew.cmu.edu kkitani@cs.cmu.edu

More information

MSR Asia MSM at ActivityNet Challenge 2017: Trimmed Action Recognition, Temporal Action Proposals and Dense-Captioning Events in Videos

MSR Asia MSM at ActivityNet Challenge 2017: Trimmed Action Recognition, Temporal Action Proposals and Dense-Captioning Events in Videos MSR Asia MSM at ActivityNet Challenge 2017: Trimmed Action Recognition, Temporal Action Proposals and Dense-Captioning Events in Videos Ting Yao, Yehao Li, Zhaofan Qiu, Fuchen Long, Yingwei Pan, Dong Li,

More information

AUGMENTED CONVOLUTIONAL FEATURE MAPS FOR ROBUST CNN-BASED CAMERA MODEL IDENTIFICATION. Belhassen Bayar and Matthew C. Stamm

AUGMENTED CONVOLUTIONAL FEATURE MAPS FOR ROBUST CNN-BASED CAMERA MODEL IDENTIFICATION. Belhassen Bayar and Matthew C. Stamm AUGMENTED CONVOLUTIONAL FEATURE MAPS FOR ROBUST CNN-BASED CAMERA MODEL IDENTIFICATION Belhassen Bayar and Matthew C. Stamm Department of Electrical and Computer Engineering, Drexel University, Philadelphia,

More information

Sketch-a-Net that Beats Humans

Sketch-a-Net that Beats Humans Sketch-a-Net that Beats Humans Qian Yu SketchLab@QMUL Queen Mary University of London 1 Authors Qian Yu Yongxin Yang Yi-Zhe Song Tao Xiang Timothy Hospedales 2 Let s play a game! Round 1 Easy fish face

More information

arxiv: v5 [cs.cv] 23 Aug 2017

arxiv: v5 [cs.cv] 23 Aug 2017 DelugeNets: Deep Networks with Efficient and Flexible Cross-layer Information Inflows arxiv:111.555v5 [cs.cv] 3 Aug 17 Jason Kuen 1 jkuen1@ntu.edu.sg Xiangfei Kong 1 xfkong@ntu.edu.sg Gang Wang gangwang@gmail.com

More information

Scene Perception based on Boosting over Multimodal Channel Features

Scene Perception based on Boosting over Multimodal Channel Features Scene Perception based on Boosting over Multimodal Channel Features Arthur Costea Image Processing and Pattern Recognition Research Center Technical University of Cluj-Napoca Research Group Technical University

More information

Image Manipulation Detection using Convolutional Neural Network

Image Manipulation Detection using Convolutional Neural Network Image Manipulation Detection using Convolutional Neural Network Dong-Hyun Kim 1 and Hae-Yeoun Lee 2,* 1 Graduate Student, 2 PhD, Professor 1,2 Department of Computer Software Engineering, Kumoh National

More information

Computer Vision Seminar

Computer Vision Seminar Computer Vision Seminar 236815 Spring 2017 Instructor: Micha Lindenbaum (Taub 600, Tel: 4331, email: mic@cs) Student in this seminar should be those interested in high level, learning based, computer vision.

More information

Semantic Segmented Style Transfer Kevin Yang* Jihyeon Lee* Julia Wang* Stanford University kyang6

Semantic Segmented Style Transfer Kevin Yang* Jihyeon Lee* Julia Wang* Stanford University kyang6 Semantic Segmented Style Transfer Kevin Yang* Jihyeon Lee* Julia Wang* Stanford University kyang6 Stanford University jlee24 Stanford University jwang22 Abstract Inspired by previous style transfer techniques

More information

On Emerging Technologies

On Emerging Technologies On Emerging Technologies 9.11. 2018. Prof. David Hyunchul Shim Director, Korea Civil RPAS Research Center KAIST, Republic of Korea hcshim@kaist.ac.kr 1 I. Overview Recent emerging technologies in civil

More information

Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks

Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks Zhaofan Qiu, Ting Yao, and Tao Mei University of Science and Technology of China, Hefei, China Microsoft Research, Beijing, China

More information

Learning Rich Features for Image Manipulation Detection

Learning Rich Features for Image Manipulation Detection Learning Rich Features for Image Manipulation Detection Peng Zhou Xintong Han Vlad I. Morariu Larry S. Davis University of Maryland, College Park Adobe Research pengzhou@umd.edu {xintong,lsd}@umiacs.umd.edu

More information

Artificial Intelligence Machine learning and Deep Learning: Trends and Tools. Dr. Shaona

Artificial Intelligence Machine learning and Deep Learning: Trends and Tools. Dr. Shaona Artificial Intelligence Machine learning and Deep Learning: Trends and Tools Dr. Shaona Ghosh @shaonaghosh What is Machine Learning? Computer algorithms that learn patterns in data automatically from large

More information

Attention-based Multi-Encoder-Decoder Recurrent Neural Networks

Attention-based Multi-Encoder-Decoder Recurrent Neural Networks Attention-based Multi-Encoder-Decoder Recurrent Neural Networks Stephan Baier 1, Sigurd Spieckermann 2 and Volker Tresp 1,2 1- Ludwig Maximilian University Oettingenstr. 67, Munich, Germany 2- Siemens

More information

PROJECT REPORT. Using Deep Learning to Classify Malignancy Associated Changes

PROJECT REPORT. Using Deep Learning to Classify Malignancy Associated Changes Using Deep Learning to Classify Malignancy Associated Changes Hakan Wieslander, Gustav Forslid Project in Computational Science: Report January 2017 PROJECT REPORT Department of Information Technology

More information

arxiv: v1 [cs.lg] 2 Jan 2018

arxiv: v1 [cs.lg] 2 Jan 2018 Deep Learning for Identifying Potential Conceptual Shifts for Co-creative Drawing arxiv:1801.00723v1 [cs.lg] 2 Jan 2018 Pegah Karimi pkarimi@uncc.edu Kazjon Grace The University of Sydney Sydney, NSW 2006

More information