Landmark Recognition with Deep Learning
PROJECT LABORATORY
submitted by Filippo Galli

NEUROSCIENTIFIC SYSTEM THEORY
Technische Universität München
Prof. Dr Jörg Conradt

Supervisor: Marcello Mulas, PhD
Final Submission:
Technische Universität München
Neurowissenschaftliche Systemtheorie

Project Practical
Filippo Galli
Landmark recognition with deep learning
05-Oct-2015

Problem description:
Autonomous robotic navigation is based on the development of simultaneous localization and mapping (SLAM) strategies. In general, SLAM algorithms require the integration of odometric information with location-specific sensory information. In comparison with human performance, the recognition of visual landmarks is a problem that does not yet have a satisfying solution. However, recent developments in machine learning [1] seem promising for improving robotic recognition skills. In fact, deep learning techniques are currently applied successfully to several problems of visual classification.

Task:
The primary goal of the students is to use an existing deep learning toolbox [2,3] to recognize landmarks in an indoor environment. The images will be recorded by an on-board camera mounted on top of a mobile robot. In order to reach this goal the students shall:
- identify potential landmarks to recognize
- record a training set of images of the landmarks
- train a deep neural network using an available toolbox (Caffe [2] or Theano [3])
- classify visible landmarks in a video recorded by a mobile robot while exploring the indoor environment
- compare and evaluate the performance of different deep learning networks
- write a report that includes a description of the work done and a summary of the most relevant results obtained.

Bibliography:
[1] Hinton, G. E., Osindero, S. & Teh, Y.-W. A fast learning algorithm for deep belief nets. Neural Comp. 18, (2006)
[2] Caffe
[3] Theano

Supervisor: Marcello Mulas
(Jörg Conradt) Professor
Abstract

In autonomous robot navigation, Simultaneous Localization and Mapping (SLAM) is achieved by integrating odometric and location-specific information. For the latter, the recognition of specific landmarks is a promising choice, given the latest advances in machine learning. This report presents a possible solution for visual landmark recognition, obtained by applying deep learning techniques to video-recorded images. In particular, convolutional networks, built with open source machine learning libraries, have been used for the detection and classification of images containing multiple objects.
Contents

1 Introduction
2 Theory
  2.1 Convolutional Neural Networks
  2.2 Logistic Regression Layer
3 Implementation
  3.1 Network architecture
  3.2 Database
  3.3 Image pre-processing
4 Results
  4.1 Single object testing
  4.2 Multiple objects with manual framing
  4.3 Multiple objects with automatic framing
5 Conclusion
List of Figures
Bibliography
Chapter 1 Introduction

The ability to recognize landmarks in space could be a way of retrieving location-specific information, thus opening new possibilities for Simultaneous Localization and Mapping (SLAM) in the field of autonomous robot navigation. Given recent progress in deep learning, a promising approach is to use neural networks for image recognition, in particular Convolutional Neural Networks [3]. The idea is to train a network to classify the objects depicted in images. In this project, this is done by analysing frames streamed by a webcam mounted on top of a mobile robot. The following chapters tackle the problem as follows:

1. Theory: an overview of what Convolutional Neural Networks are, how they work, and how they differ from other kinds of neural networks.
2. Implementation: the available toolboxes for machine learning algorithms, the chosen network architecture, how the database was built and how images were pre-processed to make the project work efficiently.
3. Results: what was achieved in terms of image recognition.
4. Conclusions: an analysis of what did not work and where to concentrate future efforts.
Chapter 2 Theory

This chapter briefly introduces the theory behind the adopted techniques. As mentioned, one of the best-performing network architectures for image recognition and classification is the convolutional network. As a member of the family of deep learning algorithms, its main strength lies in its ability to abstract characteristic features of its inputs. Unlike the Single or Multi Layer Perceptron (MLP), Convolutional Neural Networks (CNN) exploit spatial correlation: units are tiled in such a way that a neuron does not depend on all the units of the previous layer, but only on a subset of them. This choice more closely resembles how animal brains process images. In fact, cells in the visual cortex are sensitive to small sub-regions of the visual field, called receptive fields. They act like filters over the input space, i.e. what we actually see, and exploit the strong local spatial correlation present in natural images.

2.1 Convolutional Neural Networks

The key point of CNNs is that information is retrieved not only from the value of a single unit, but also from its position among the others. This is achieved by building a network based on the following concepts:

- Local receptive fields: In MLPs, the units of every layer are fully connected with the units of the previous layer. In a CNN every unit is connected only to a subset of them, its local receptive field, as shown in Fig. 2.1.
- Shared weights: In a CNN all the neurons of a layer share the same set of weights and biases, meaning that every neuron applies the same weights to the units in its receptive field. One interpretation is that every neuron in such a layer detects the same feature, each at a different position of the input. A layer of this kind is therefore also called a feature map, and feature maps constitute the convolutional layer. Usually many feature maps work on the same input, in order to detect different features.
- Pooling: Convolutional layers are usually followed by pooling layers. One of the most common kinds is the max-pooling layer, which condenses the information of every feature map into a smaller layer. For instance, a max-pooling unit may keep only the maximum activation in a 2x2 subregion of the feature map and output that single value. This strongly reduces the computational effort without a great loss of information, since the strongest activation in each region, and hence the approximate position of the most evident feature, is preserved.

Figure 2.1: Local receptive field: How neurons are connected in different layers [4]

2.2 Logistic Regression Layer

All the information coming from the convolutional layers, condensed by the max-pooling layers, needs to be processed to produce an output, i.e. the class to which the object in the input image belongs. A convenient way to do that is to add a Logistic Regression layer. The logistic regressor is a probabilistic linear classifier in which all units are fully connected with the units of the previous layer: there are no local receptive fields or subregions, and every output unit is a function of the outputs of all the units of the previous layer and of the corresponding weights. In detail, the probability that an input vector x is a member of class i, an outcome of a stochastic variable Y, can be written as:

$$P(Y = i \mid x, W, b) = \mathrm{softmax}_i(Wx + b) = \frac{\exp(W_i x + b_i)}{\sum_j \exp(W_j x + b_j)} \tag{2.1}$$

where W is the weight matrix, W_i is its i-th row, and b is the bias vector. Hence the model prediction is the class whose probability is maximal:

$$y_{pred} = \operatorname{argmax}_i \, P(Y = i \mid x, W, b) \tag{2.2}$$
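As an illustration of Eq. (2.1) and Eq. (2.2), the following is a minimal sketch in plain Python, not the code used in the project. The weight matrix W, the bias vector b and the input x are hypothetical toy values chosen only to make the example runnable (3 classes, 2 input features).

```python
import math


def softmax(scores):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    # Subtracting the maximum leaves the result unchanged but avoids overflow.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(W, b, x):
    """Return (class probabilities, predicted class) for the input vector x."""
    # scores[i] = W_i . x + b_i, as in Eq. (2.1)
    scores = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
    probs = softmax(scores)
    # Eq. (2.2): the prediction is the class with the highest probability.
    y_pred = max(range(len(probs)), key=probs.__getitem__)
    return probs, y_pred


# Hypothetical toy values, not taken from the trained network.
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.5, 0.0]
x = [2.0, 1.0]

probs, y_pred = predict(W, b, x)
print("probabilities:", probs)
print("predicted class:", y_pred)
```

With these toy values the scores are 2.0, 1.5 and -3.0, so the first class (index 0) is predicted.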
Chapter 3 Implementation

Of the two toolboxes suggested for the implementation, Theano ([1] and [2]) and Caffe, the former was chosen. Caffe is reportedly the fastest library for implementing machine learning algorithms and is written in C++. Theano, on the other hand, is a Python library that allows faster prototyping and testing at the cost of lower computational speed, which was in any case not the main concern of the project. Moreover, Theano was used through Lasagne, a lightweight Python library for building and training neural networks.

3.1 Network architecture

The tested architecture for the convolutional network was built as follows:

- Input Layer: The input layer is a square of 80x80 units, which is the size of the input image in pixels. Moreover, the images are coded as one-channel input, meaning that they are described by greyscale levels instead of RGB values. These choices are motivated in the following sections.
- Hidden Convolutional Layer: This convolutional layer defines 32 feature maps. Their local receptive field is a square of 5x5 pixels. Right after the feature maps comes a max-pooling layer with a pool size of 2x2 pixels.
- Hidden Convolutional Layer: Next, another convolutional layer is added with the same parameters: 32 feature maps with a 5x5 local receptive field, followed by a max-pooling layer with a 2x2 pool size.
- Fully-connected Hidden Layer: A layer with 256 units, fully connected with the previous layer.
- Logistic Regression Output Layer: A 6-unit fully connected logistic regressor that produces the output of the network, one unit for each class.

3.2 Database

Since the network was tested in the corridor of an academic department, it should ultimately recognize landmarks typical of that environment. Although the choice of landmarks was arbitrary, some technological constraints arose:

- Camera constraints: The video camera placed on the mobile robot produced images of 640x480 pixels, taken from almost ground level, thus restricting the choice of landmarks to small, floor-level objects.
- Class information content: Choosing, for instance, a white wall as a landmark would have led to a class with very limited information content, since in most cases a white wall was also the background of the images of the other objects.

The choice then fell on the following object classes: table, fire extinguisher, bookcase, watering can, plant and trash bin. Fig. 3.1 shows some examples of the raw images.

Figure 3.1: Four classes. From top to bottom, from left to right: watering can, plant, bookcase, table

In order to train a network able to recognize objects in conditions different from those of the database, images were taken with different backgrounds and different object orientations. Fig. 3.2 shows how the attempt to train the network to focus only on the invariances of an object is reflected in the database. The database, containing 3000 images, is then subdivided into 3 sub-datasets for training, validation and testing. Some more images containing multiple objects were taken for demonstration purposes and are shown in Fig. 3.3.

Figure 3.2: Different images of the same object: trash bin

Figure 3.3: Multiple object images

3.3 Image pre-processing

Images recorded by the camera on the mobile robot are RGB files of 640x480 pixels. Pre-processing was needed to avoid:

- Memory overload: In the adopted Theano and Lasagne setup all data have to be loaded at the same time during training. With a database of 3000 images, the machine on which the project was carried out ran out of memory, making it impossible to complete the training of the network.
- Time consumption: With raw images the CNN would have required far more time to complete training, slowing down the testing of the code with no actual benefit, since much of the information contained in the images does not help the classification problem.

The proposed solution to these problems requires two steps of image manipulation. First, the object contained in the image was cropped, so that only a subset of pixels was kept for training: from the original image a square of variable size (400x400 for the table class, 300x300 for the others) was extracted and then resized to 80x80. Secondly, the image was converted from RGB to greyscale, reducing the amount of input data by a factor of three while still keeping most of the information. In fact, as shown in Fig. 3.4, most of it is contained in the luminance and not in the chrominance. The result of the two steps of image manipulation can be seen in Fig. 3.5.

Figure 3.4: Comparison of information contained in the luminance plane (top right), and in the chrominance plane (bottom row) with respect to the original image (top left).
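The two pre-processing steps can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the project code: the report does not say where the crop was placed, which resizing method was used or how greyscale values were computed, so the centred crop, the nearest-neighbour resize and the ITU-R BT.601 luminance weights below are assumptions, and the input frame is synthetic.

```python
def crop_square(img, side, top, left):
    """Keep a side x side square whose top-left corner is at (left, top)."""
    return [row[left:left + side] for row in img[top:top + side]]


def to_greyscale(rgb):
    """Convert rows of (r, g, b) tuples into rows of luminance values."""
    # Assumption: ITU-R BT.601 luma weights; the report does not name the conversion.
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]


def resize_nearest(img, out_side):
    """Resize a square image to out_side x out_side by nearest-neighbour sampling."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[(y * in_h) // out_side][(x * in_w) // out_side] for x in range(out_side)]
        for y in range(out_side)
    ]


def preprocess(rgb, side=300, out_side=80):
    """Crop, convert to greyscale and resize, in that order."""
    height, width = len(rgb), len(rgb[0])
    # Assumption: the crop is centred in the frame.
    top, left = (height - side) // 2, (width - side) // 2
    return resize_nearest(to_greyscale(crop_square(rgb, side, top, left)), out_side)


# Synthetic 640x480 RGB frame standing in for a camera image.
frame = [[(x % 256, y % 256, 128) for x in range(640)] for y in range(480)]
small = preprocess(frame)

raw_values = 640 * 480 * 3   # values per raw RGB frame
small_values = 80 * 80       # values per pre-processed image
print(len(small), "x", len(small[0]), "pixels")
print("reduction factor:", raw_values // small_values)
```

Per image, the number of stored values drops from 921600 to 6400, a factor of 144, which indicates why pre-processing helps with both the memory and the training time issues described above.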
Figure 3.5: Fire extinguisher after applying cropping, resizing, and greyscaling
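To give a feeling for the size of the architecture of Section 3.1, the sketch below computes how the 80x80 input shrinks through the two convolution and pooling stages and how many trainable parameters each layer would have. It assumes convolutions without padding and with stride 1, and non-overlapping 2x2 pooling; the report does not state these settings, so the resulting numbers are indicative only.

```python
def conv_out(size, kernel):
    """Output side of a stride-1 convolution without padding (assumption)."""
    return size - kernel + 1


def pool_out(size, pool):
    """Output side of non-overlapping max-pooling (assumption)."""
    return size // pool


size, channels = 80, 1   # 80x80 greyscale input, one channel
n_params = 0

# Two identical stages: 32 feature maps, 5x5 receptive field, 2x2 pooling.
for n_maps, kernel, pool in [(32, 5, 2), (32, 5, 2)]:
    # Shared weights: one kernel per (input channel, feature map) pair plus one bias per map.
    n_params += n_maps * channels * kernel * kernel + n_maps
    size = pool_out(conv_out(size, kernel), pool)
    channels = n_maps
    print("after conv + pool:", channels, "maps of", size, "x", size)

n_features = channels * size * size   # inputs of the fully connected layer
fan_in = n_features
for n_units in (256, 6):              # hidden layer, then 6-class output layer
    n_params += fan_in * n_units + n_units
    fan_in = n_units

print("features entering the dense layer:", n_features)
print("trainable parameters:", n_params)
```

Under these assumptions the feature maps shrink from 80x80 to 38x38 and then to 17x17, and almost all parameters belong to the 256-unit fully connected layer, while the convolutional layers stay small thanks to weight sharing.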
Chapter 4 Results

Training the network for 100 epochs produced the following results. Fig. 4.1 shows the trend of the validation accuracy, i.e. the percentage of samples in the validation set assigned to the correct class. It rises rapidly and then saturates, never reaching 100 %.

Figure 4.1: Validation accuracy (%) for every epoch during training

Further tests were carried out to assess the network performance:

- Single object testing
- Multiple objects with manual framing
- Multiple objects with automatic framing

4.1 Single object testing

A manual test of the actual performance of the CNN consisted in asking the model for predictions on single images, after training and saving the model parameters. The input images are shown in Fig. 4.2 and the results in Fig. 4.3.
Figure 4.2: Images fed as input for manual testing. From left to right, top to bottom: 405.jpg, 966.jpg, 1426.jpg, 1910.jpg, 2436.jpg, 2593.jpg

Figure 4.3: Results of classification for images represented in Fig. 4.2
Figure 4.4: Single objects cropped from the top row images of Fig. 3.3: 1.jpg, 2.jpg, 3.jpg, 4.jpg

Figure 4.5: Classification results of images in Fig. 4.4

4.2 Multiple objects with manual framing

From the top row images of Fig. 3.3, which contain multiple objects, single object images were obtained; they are shown in Fig. 4.4. The results are shown in Fig. 4.5. Three out of four images were classified correctly. The CNN failed on the image of a plant, where most of the plant is actually cut off from the photo, so that the prediction was basically made from the vase alone, which was mistaken for the trash bin.

4.3 Multiple objects with automatic framing

This test required the trained CNN to scan an image containing multiple objects in order to find and label them. The task was approached by running a square frame across the image and saving, for each class, the highest confidence level found among all square frames. After this process, if a class showed a confidence level higher than a threshold set a priori to 99.4%, a frame is drawn at the corresponding object coordinates and labeled accordingly. Since tuning this heuristic by hand efficiently would have required a lot of time, this test did not achieve great results. Moreover, objects standing at the sides of the image clearly introduce a further complication, since they cannot be centered in the frame as the network would expect from the training dataset.
Figure 4.6: Multiple object on-image-labeling with automatic framing: detected objects are framed and labeled on the top left corner of the square

Nevertheless, on the bottom row image of Fig. 3.3 a satisfying result was obtained, as shown in Fig. 4.6.
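The frame-scanning procedure of Section 4.3 can be sketched as follows. This is a simplified illustration, not the project code: the frame size, the step and the toy image are assumptions, and `toy_classify` is a hypothetical stand-in for the trained CNN that returns two class probabilities based on the brightness of the patch. Only the 99.4% threshold comes from the report.

```python
def sliding_windows(width, height, side, step):
    """Yield the top-left corner (left, top) of every square frame."""
    for top in range(0, height - side + 1, step):
        for left in range(0, width - side + 1, step):
            yield left, top


def detect(image, classify, n_classes, side, step, threshold=0.994):
    """Keep, for each class, the most confident frame, then apply the threshold."""
    height, width = len(image), len(image[0])
    best = [(0.0, None)] * n_classes          # (confidence, position) per class
    for left, top in sliding_windows(width, height, side, step):
        patch = [row[left:left + side] for row in image[top:top + side]]
        for c, p in enumerate(classify(patch)):
            if p > best[c][0]:
                best[c] = (p, (left, top))
    # Only classes whose best confidence exceeds the threshold are reported.
    return {c: (p, pos) for c, (p, pos) in enumerate(best) if p > threshold}


def toy_classify(patch):
    """Hypothetical classifier: class 1 is 'bright object', class 0 is 'background'."""
    values = [v for row in patch for v in row]
    mean = sum(values) / len(values)
    return [1.0 - mean, mean]


# Toy 64x48 image with a bright 30x30 object whose top-left corner is at (20, 10).
image = [[1.0 if 20 <= x < 50 and 10 <= y < 40 else 0.0 for x in range(64)]
         for y in range(48)]

detections = detect(image, toy_classify, n_classes=2, side=30, step=5)
print(detections)
```

The sketch also makes the limitations discussed in the text visible: the frame has a single fixed size, and an object is only matched well when some frame position happens to centre it.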
Chapter 5 Conclusion

As seen in Chapter 4, object classification through a CNN produced some interesting results. Once the model is trained, making predictions on new images does not require much computational power and, under some constraints, shows good performance. One of the trickiest issues proved to be that, when dealing with multiple objects in the same image, which is the actual real-world scenario, automatic framing can visibly lower performance, for the following reasons:

- Objects cannot always be centered in a square frame, for instance if they stand at the sides of the picture.
- As it stands, the algorithm used for automatic framing is not scale invariant, meaning that the size of the object matters and may lead to failed classification or recognition.

Moreover, for some objects, such as the table, it is difficult to extract features that are invariant to orientation, and this can complicate the task of recognizing the object from different points of view. Hence, it is evident that great emphasis should be put on building the database and on pre-processing, in order to solve the current issues and enhance performance. Finally, since the vast majority of the time spent on the project was dedicated to getting the CNN to work, there was no chance to try different architectures.
List of Figures

2.1 Local receptive field: How neurons are connected in different layers
3.1 Four classes. From top to bottom, from left to right: watering can, plant, bookcase, table
3.2 Different images of the same object: trash bin
3.3 Multiple object images
3.4 Comparison of information contained in the luminance plane (top right), and in the chrominance plane (bottom row) with respect to the original image (top left)
3.5 Fire extinguisher after applying cropping, resizing, and greyscaling
4.1 Validation accuracy (%) for every epoch during training
4.2 Images fed as input for manual testing. From left to right, top to bottom: 405.jpg, 966.jpg, 1426.jpg, 1910.jpg, 2436.jpg, 2593.jpg
4.3 Results of classification for images represented in Fig. 4.2
4.4 Single objects cropped from the top row images of Fig. 3.3: 1.jpg, 2.jpg, 3.jpg, 4.jpg
4.5 Classification results of images in Fig. 4.4
4.6 Multiple object on-image-labeling with automatic framing: detected objects are framed and labeled on the top left corner of the square
Bibliography

[1] Frédéric Bastien, Pascal Lamblin, Razvan Pascanu, James Bergstra, Ian J. Goodfellow, Arnaud Bergeron, Nicolas Bouchard, and Yoshua Bengio. Theano: new features and speed improvements. Deep Learning and Unsupervised Feature Learning NIPS 2012 Workshop.
[2] James Bergstra, Olivier Breuleux, Frédéric Bastien, Pascal Lamblin, Razvan Pascanu, Guillaume Desjardins, Joseph Turian, David Warde-Farley, and Yoshua Bengio. Theano: a CPU and GPU math expression compiler, June. Oral Presentation.
[3] Yann LeCun and Yoshua Bengio. Convolutional networks for images, speech, and time series. The handbook of brain theory and neural networks, 3361(10).
[4] Michael A. Nielsen. Neural Networks and Deep Learning. Determination Press, 2015.
License

This work is licensed under the Creative Commons Attribution 3.0 Germany License. To view a copy of this license, send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California 94105, USA.
Adversarial Examples and Adversarial Training Ian Goodfellow, OpenAI Research Scientist Presentation at Quora, 2016-08-04 In this presentation Intriguing Properties of Neural Networks Szegedy et al, 2013
More informationLecture 17 Convolutional Neural Networks
Lecture 17 Convolutional Neural Networks 30 March 2016 Taylor B. Arnold Yale Statistics STAT 365/665 1/22 Notes: Problem set 6 is online and due next Friday, April 8th Problem sets 7,8, and 9 will be due
More informationINFORMATION about image authenticity can be used in
1 Constrained Convolutional Neural Networs: A New Approach Towards General Purpose Image Manipulation Detection Belhassen Bayar, Student Member, IEEE, and Matthew C. Stamm, Member, IEEE Abstract Identifying
More informationGESTURE RECOGNITION FOR ROBOTIC CONTROL USING DEEP LEARNING
2017 NDIA GROUND VEHICLE SYSTEMS ENGINEERING AND TECHNOLOGY SYMPOSIUM AUTONOMOUS GROUND SYSTEMS (AGS) TECHNICAL SESSION AUGUST 8-10, 2017 - NOVI, MICHIGAN GESTURE RECOGNITION FOR ROBOTIC CONTROL USING
More informationEMERGENCE OF FOVEAL IMAGE SAMPLING FROM
EMERGENCE OF FOVEAL IMAGE SAMPLING FROM LEARNING TO ATTEND IN VISUAL SCENES Brian Cheung, Eric Weiss, Bruno Olshausen Redwood Center UC Berkeley {bcheung,eaweiss,baolshausen}@berkeley.edu ABSTRACT We describe
More informationINTRODUCTION TO DEEP LEARNING. Steve Tjoa June 2013
INTRODUCTION TO DEEP LEARNING Steve Tjoa kiemyang@gmail.com June 2013 Acknowledgements http://ufldl.stanford.edu/wiki/index.php/ UFLDL_Tutorial http://youtu.be/ayzoubkuf3m http://youtu.be/zmnoatzigik 2
More informationCHAPTER 8: EXTENDED TETRACHORD CLASSIFICATION
CHAPTER 8: EXTENDED TETRACHORD CLASSIFICATION Chapter 7 introduced the notion of strange circles: using various circles of musical intervals as equivalence classes to which input pitch-classes are assigned.
More informationIntelligent Non-Player Character with Deep Learning. Intelligent Non-Player Character with Deep Learning 1
Intelligent Non-Player Character with Deep Learning Meng Zhixiang, Zhang Haoze Supervised by Prof. Michael Lyu CUHK CSE FYP Term 1 Intelligent Non-Player Character with Deep Learning 1 Intelligent Non-Player
More informationAdversarial Examples and Adversarial Training. Ian Goodfellow, OpenAI Research Scientist Presentation at HORSE 2016 London,
Adversarial Examples and Adversarial Training Ian Goodfellow, OpenAI Research Scientist Presentation at HORSE 2016 London, 2016-09-19 In this presentation Intriguing Properties of Neural Networks Szegedy
More informationArtificial Intelligence: Using Neural Networks for Image Recognition
Kankanahalli 1 Sri Kankanahalli Natalie Kelly Independent Research 12 February 2010 Artificial Intelligence: Using Neural Networks for Image Recognition Abstract: The engineering goals of this experiment
More informationVehicle Color Recognition using Convolutional Neural Network
Vehicle Color Recognition using Convolutional Neural Network Reza Fuad Rachmadi and I Ketut Eddy Purnama Multimedia and Network Engineering Department, Institut Teknologi Sepuluh Nopember, Keputih Sukolilo,
More informationConvolutional neural networks
Convolutional neural networks Themes Curriculum: Ch 9.1, 9.2 and http://cs231n.github.io/convolutionalnetworks/ The simple motivation and idea How it s done Receptive field Pooling Dilated convolutions
More informationRobust Hand Gesture Recognition for Robotic Hand Control
Robust Hand Gesture Recognition for Robotic Hand Control Ankit Chaudhary Robust Hand Gesture Recognition for Robotic Hand Control 123 Ankit Chaudhary Department of Computer Science Northwest Missouri State
More informationDecoding Brainwave Data using Regression
Decoding Brainwave Data using Regression Justin Kilmarx: The University of Tennessee, Knoxville David Saffo: Loyola University Chicago Lucien Ng: The Chinese University of Hong Kong Mentor: Dr. Xiaopeng
More informationPlaying Atari Games with Deep Reinforcement Learning
Playing Atari Games with Deep Reinforcement Learning 1 Playing Atari Games with Deep Reinforcement Learning Varsha Lalwani (varshajn@iitk.ac.in) Masare Akshay Sunil (amasare@iitk.ac.in) IIT Kanpur CS365A
More informationOn Intelligence Jeff Hawkins
On Intelligence Jeff Hawkins Chapter 8: The Future of Intelligence April 27, 2006 Presented by: Melanie Swan, Futurist MS Futures Group 650-681-9482 m@melanieswan.com http://www.melanieswan.com Building
More informationPreprocessing and Segregating Offline Gujarati Handwritten Datasheet for Character Recognition
Preprocessing and Segregating Offline Gujarati Handwritten Datasheet for Character Recognition Hetal R. Thaker Atmiya Institute of Technology & science, Kalawad Road, Rajkot Gujarat, India C. K. Kumbharana,
More informationCOMPARATIVE PERFORMANCE ANALYSIS OF HAND GESTURE RECOGNITION TECHNIQUES
International Journal of Advanced Research in Engineering and Technology (IJARET) Volume 9, Issue 3, May - June 2018, pp. 177 185, Article ID: IJARET_09_03_023 Available online at http://www.iaeme.com/ijaret/issues.asp?jtype=ijaret&vtype=9&itype=3
More informationTransactions on Information and Communications Technologies vol 1, 1993 WIT Press, ISSN
Combining multi-layer perceptrons with heuristics for reliable control chart pattern classification D.T. Pham & E. Oztemel Intelligent Systems Research Laboratory, School of Electrical, Electronic and
More informationDistance Estimation and Localization of Sound Sources in Reverberant Conditions using Deep Neural Networks
Distance Estimation and Localization of Sound Sources in Reverberant Conditions using Deep Neural Networks Mariam Yiwere 1 and Eun Joo Rhee 2 1 Department of Computer Engineering, Hanbat National University,
More informationDerek Allman a, Austin Reiter b, and Muyinatu Bell a,c
Exploring the effects of transducer models when training convolutional neural networks to eliminate reflection artifacts in experimental photoacoustic images Derek Allman a, Austin Reiter b, and Muyinatu
More informationHandling Emotions in Human-Computer Dialogues
Handling Emotions in Human-Computer Dialogues Johannes Pittermann Angela Pittermann Wolfgang Minker Handling Emotions in Human-Computer Dialogues ABC Johannes Pittermann Universität Ulm Inst. Informationstechnik
More informationCHAPTER 6 BACK PROPAGATED ARTIFICIAL NEURAL NETWORK TRAINED ARHF
95 CHAPTER 6 BACK PROPAGATED ARTIFICIAL NEURAL NETWORK TRAINED ARHF 6.1 INTRODUCTION An artificial neural network (ANN) is an information processing model that is inspired by biological nervous systems
More informationApplication of Multi Layer Perceptron (MLP) for Shower Size Prediction
Chapter 3 Application of Multi Layer Perceptron (MLP) for Shower Size Prediction 3.1 Basic considerations of the ANN Artificial Neural Network (ANN)s are non- parametric prediction tools that can be used
More informationarxiv: v1 [cs.lg] 30 May 2016
Deep Reinforcement Learning Radio Control and Signal Detection with KeRLym, a Gym RL Agent Timothy J O Shea and T. Charles Clancy Virginia Polytechnic Institute and State University arxiv:1605.09221v1
More informationThe Automatic Classification Problem. Perceptrons, SVMs, and Friends: Some Discriminative Models for Classification
Perceptrons, SVMs, and Friends: Some Discriminative Models for Classification Parallel to AIMA 8., 8., 8.6.3, 8.9 The Automatic Classification Problem Assign object/event or sequence of objects/events
More informationProposers Day Workshop
Proposers Day Workshop Monday, January 23, 2017 @srcjump, #JUMPpdw Cognitive Computing Vertical Research Center Mandy Pant Academic Research Director Intel Corporation Center Motivation Today s deep learning
More informationLibyan Licenses Plate Recognition Using Template Matching Method
Journal of Computer and Communications, 2016, 4, 62-71 Published Online May 2016 in SciRes. http://www.scirp.org/journal/jcc http://dx.doi.org/10.4236/jcc.2016.47009 Libyan Licenses Plate Recognition Using
More informationA Robust Neural Robot Navigation Using a Combination of Deliberative and Reactive Control Architectures
A Robust Neural Robot Navigation Using a Combination of Deliberative and Reactive Control Architectures D.M. Rojas Castro, A. Revel and M. Ménard * Laboratory of Informatics, Image and Interaction (L3I)
More informationScrabble Board Automatic Detector for Third Party Applications
Scrabble Board Automatic Detector for Third Party Applications David Hirschberg Computer Science Department University of California, Irvine hirschbd@uci.edu Abstract Abstract Scrabble is a well-known
More informationComparison of Google Image Search and ResNet Image Classification Using Image Similarity Metrics
University of Arkansas, Fayetteville ScholarWorks@UARK Computer Science and Computer Engineering Undergraduate Honors Theses Computer Science and Computer Engineering 5-2018 Comparison of Google Image
More informationEnhancing Symmetry in GAN Generated Fashion Images
Enhancing Symmetry in GAN Generated Fashion Images Vishnu Makkapati 1 and Arun Patro 2 1 Myntra Designs Pvt. Ltd., Bengaluru - 560068, India vishnu.makkapati@myntra.com 2 Department of Electrical Engineering,
More informationTOOLS AND PROCESSORS FOR COMPUTER VISION. Selected Results from the Embedded Vision Alliance s Spring 2017 Computer Vision Developer Survey
TOOLS AND PROCESSORS FOR COMPUTER VISION Selected Results from the Embedded Vision Alliance s Spring 2017 Computer Vision Developer Survey 1 EXECUTIVE SUMMARY Since 2015, the Embedded Vision Alliance has
More informationMultimedia Forensics
Multimedia Forensics Using Mathematics and Machine Learning to Determine an Image's Source and Authenticity Matthew C. Stamm Multimedia & Information Security Lab (MISL) Department of Electrical and Computer
More informationCreating an Agent of Doom: A Visual Reinforcement Learning Approach
Creating an Agent of Doom: A Visual Reinforcement Learning Approach Michael Lowney Department of Electrical Engineering Stanford University mlowney@stanford.edu Robert Mahieu Department of Electrical Engineering
More informationDESIGN & DEVELOPMENT OF COLOR MATCHING ALGORITHM FOR IMAGE RETRIEVAL USING HISTOGRAM AND SEGMENTATION TECHNIQUES
International Journal of Information Technology and Knowledge Management July-December 2011, Volume 4, No. 2, pp. 585-589 DESIGN & DEVELOPMENT OF COLOR MATCHING ALGORITHM FOR IMAGE RETRIEVAL USING HISTOGRAM
More informationNorsk Regnesentral (NR) Norwegian Computing Center
Norsk Regnesentral (NR) Norwegian Computing Center Petter Abrahamsen Joining Forces 2018 www.nr.no NUSSE: - 512 9-digit numbers - 200 additions/second Our latest servers: - Four Titan X GPUs - 14 336 cores
More informationChapter 2 Transformation Invariant Image Recognition Using Multilayer Perceptron 2.1 Introduction
Chapter 2 Transformation Invariant Image Recognition Using Multilayer Perceptron 2.1 Introduction A multilayer perceptron (MLP) [52, 53] comprises an input layer, any number of hidden layers and an output
More informationResearch on Application of Conjoint Neural Networks in Vehicle License Plate Recognition
International Journal of Engineering Research and Technology. ISSN 0974-3154 Volume 11, Number 10 (2018), pp. 1499-1510 International Research Publication House http://www.irphouse.com Research on Application
More informationENHANCED HUMAN-AGENT INTERACTION: AUGMENTING INTERACTION MODELS WITH EMBODIED AGENTS BY SERAFIN BENTO. MASTER OF SCIENCE in INFORMATION SYSTEMS
BY SERAFIN BENTO MASTER OF SCIENCE in INFORMATION SYSTEMS Edmonton, Alberta September, 2015 ABSTRACT The popularity of software agents demands for more comprehensive HAI design processes. The outcome of
More informationA.I in Automotive? Why and When.
A.I in Automotive? Why and When. AGENDA 01 02 03 04 Definitions A.I? A.I in automotive Now? Next big A.I breakthrough in Automotive 01 DEFINITIONS DEFINITIONS Artificial Intelligence Artificial Intelligence:
More informationES 492: SCIENCE IN THE MOVIES
UNIVERSITY OF SOUTH ALABAMA ES 492: SCIENCE IN THE MOVIES LECTURE 5: ROBOTICS AND AI PRESENTER: HANNAH BECTON TODAY'S AGENDA 1. Robotics and Real-Time Systems 2. Reacting to the environment around them
More informationCS 7643: Deep Learning
CS 7643: Deep Learning Topics: Toeplitz matrices and convolutions = matrix-mult Dilated/a-trous convolutions Backprop in conv layers Transposed convolutions Dhruv Batra Georgia Tech HW1 extension 09/22
More informationCSC321 Lecture 11: Convolutional Networks
CSC321 Lecture 11: Convolutional Networks Roger Grosse Roger Grosse CSC321 Lecture 11: Convolutional Networks 1 / 35 Overview What makes vision hard? Vison needs to be robust to a lot of transformations
More informationFACE RECOGNITION USING NEURAL NETWORKS
Int. J. Elec&Electr.Eng&Telecoms. 2014 Vinoda Yaragatti and Bhaskar B, 2014 Research Paper ISSN 2319 2518 www.ijeetc.com Vol. 3, No. 3, July 2014 2014 IJEETC. All Rights Reserved FACE RECOGNITION USING
More informationClassification for Motion Game Based on EEG Sensing
Classification for Motion Game Based on EEG Sensing Ran WEI 1,3,4, Xing-Hua ZHANG 1,4, Xin DANG 2,3,4,a and Guo-Hui LI 3 1 School of Electronics and Information Engineering, Tianjin Polytechnic University,
More informationReinforcement Learning Agent for Scrolling Shooter Game
Reinforcement Learning Agent for Scrolling Shooter Game Peng Yuan (pengy@stanford.edu) Yangxin Zhong (yangxin@stanford.edu) Zibo Gong (zibo@stanford.edu) 1 Introduction and Task Definition 1.1 Game Agent
More informationDiet Networks: Thin Parameters for Fat Genomics
Institut des algorithmes d apprentissage de Montréal Diet Networks: Thin Parameters for Fat Genomics Adriana Romero, Pierre Luc Carrier, Akram Erraqabi, Tristan Sylvain, Alex Auvolat, Etienne Dejoie, Marc-André
More informationFSI Machine Vision Training Programs
FSI Machine Vision Training Programs Table of Contents Introduction to Machine Vision (Course # MVC-101) Machine Vision and NeuroCheck overview (Seminar # MVC-102) Machine Vision, EyeVision and EyeSpector
More information