Landmark Recognition with Deep Learning

Size: px
Start display at page:

Download "Landmark Recognition with Deep Learning"

Transcription

1 Landmark Recognition with Deep Learning PROJECT LABORATORY submitted by Filippo Galli NEUROSCIENTIFIC SYSTEM THEORY Technische Universität München Prof. Dr Jörg Conradt Supervisor: Marcello Mulas, PhD Final Submission:

2

3 Technische Universität München Neurowissenschaftliche Systemtheorie Project Practical Filippo Galli Landmark recognition with deep learning 05-Oct-2015 Problem description: Autonomous robotic navigation is based on the development of simultaneous localization and mapping (SLAM) strategies. In general, SLAM algorithms require the integration of odometric information with location specific sensory information. In comparison with human performance, the recognition of visual landmarks is still a problem that does not have a satisfying solution yet. However, recent developments in machine learning [1] seem promising in order to improve robotic recognition skills. In fact, deep learning techniques are currently successfully applied to several problems of visual classification. Task: The primary goal of the students is to use an existing deep learning toolbox [2,3] to recognize landmarks in an indoor environment. The images will be recorded by an on-board camera mounted on top of a mobile robot. In order to reach this goal the students shall: identify potential landmarks to recognize record a training set of images of the landmarks train a deep neural network using an available toolbox (Caffe [2] or Theano [3]) classify visible landmarks in a video recorded by a mobile robot while exploring the indoor environment compare and evaluate the performance of different deep learning networks write a report that includes a description of the work done and a summary of the most relevant obtained results. Bibliography: [1] Hinton, G. E., Osindero, S. & Teh, Y.-W. A fast learning algorithm for deep belief nets. Neural Comp. 18, (2006) [2] Caffe, [3] Theano, Supervisor: Marcello Mulas (Jörg Conradt) Professor

4

5 Abstract In autonomous robot navigation Simultaneous Localization and Mapping is achieved through integration of odometric and location specific informations. Among the latters, recognition of specific landmarks could be a promising choice, given latest advancements in machine learning techniques. In this report is presented a possible solution for visual landmark recognition, achieved by application of deep learning techniques to video recorded images. In particular, convolutional networks have been used for detection and classification of images containing multiple objects, thanks to open source machine learning libraries.

6 2

7 CONTENTS 3 Contents 1 Introduction 4 2 Theory Convolutional Neural Networks Logistic Regression Layer Implementation Network architecture Database Image pre-processing Results Single object testing Multiple objects with manual framing Multiple objects with automatic framing Conclusion 17 List of Figures 18 Bibliography 19

8 4 CHAPTER 1. INTRODUCTION Chapter 1 Introduction The ability to recognize landmarks in space could be a solution for retrieving local specific informations, thus allowing new possibilities for Simultaneous Localization and Mapping (SLAM) in the field of autonomous robot navigation. Given recent progresses in deep learning, a promising approach consists in the use of Neural Networks for image recognition, in particular adopting Convolutional Neural Networks [3]. The idea is to train a network for classifying objects depicted on images. In this project, this is achieved by the analysis of frames streamed by a webcam mounted on top of a mobile robot. In the following chapters the problem is tackled as follows: 1. Theory: an overview on what Convolutional Neural Networks are, how they work, and what differs from other kinds of Neural networks 2. Implementation: available toolboxes for machine learning algorithms, chosen network architecture, how the database has been built and how images have been pre-processed to make the project working efficiently 3. Results: what was achieved in terms of image recognition 4. Conclusions: Analysis on what didn t work and where to concentrate future efforts.

9 5 Chapter 2 Theory This chapter will shortly introduce the reader to the theory behind the adopted techniques. As mentioned, one of the most high performance network architecture for image recognition and classification is the convolutional network. Being part of the family of deep learning algorithms, its brightest side lays on the ability of abstracting characteristic features of inputs. Differently from the Single or Multi Layer Perceptron (MLP), Convolutional Neural Networks (CNN) exploit spatial correlation, with units being tiled in such a way that in adjacent layers, neurons react not depending on all the units of the previous layer, but only on a subset of them. This choice allows a more deep resemblance with how animal brain processes images. In fact, cells in our visual cortex are sensitive to small sub-regions of the visual field, called receptive fields, and they act like filters, processing the input space, i.e., what we actually see, exploiting the strong spacial local correlation present in natural images. 2.1 Convolutional Neural Networks The keypoint in CNN is then that information is not retrieved only from the very unit, but also by its position among the others. This is achieved by bulding a network based on the following concepts: Local receptive fields: In MLPs, units in every layer are fully connected with the units in the previous layer. In CNN to every unit is assigned a subset of them, the local receptive field, as shown in 2.1. Shared weights In CNN weights and biases in a layer are the same set for all neurons, meaning that for a single layer every neuron assigns to the units in his receptive field

10 6 CHAPTER 2. THEORY Figure 2.1: Local receptive field: How neurons are connected in different layers [4] the same set of weights. A possible interpretation of this could be that every neuron in a single layer is excited and detects the same feature. This kind of layer is then also called feature map and constitute the convolutional layer. Usually many feature maps are organized in the network, working on the same input, in order to detect different features. Pooling Usually, after convolutional layers are present pooling layers. One of the most common kind is the max-pooling layer, in which for every feature map, information is condensed in smaller layers. For instance a max-pooling unit may keep the maximum activation in a 2x2 subregion of the feature map and output that one only value. By doing that, computational efforts are strongly minimized without a great loss of information, since the position of the most evident feature in that region is preserved. 2.2 Logistic Regression Layer All the information coming from convolutional layers, condensed by the max-pooling layers, need to be elaborated to get an output, i.e., the class which the object represented in the input image belongs to. A clever way to do that is to add a Logistic Regression layer. The Logistic regressor is a probabilistic linear classifier where all units are fully connected with the units of previous layer, meaning that there s no local receptive field or subregions, but

11 2.2. LOGISTIC REGRESSION LAYER 7 every unit output is function of all the adjacent layers units output and weight. In detail, the probability that an input vector x is a member of a class i, an outcome of a stochastic variable Y, can be written as: P (Y = i x, W, b) = softmax i (W x + b) = exp(w ix + b i ) exp(w j x + b j ) j (2.1) Where W is the weight matrix and b is the weight vector. Hence the model prediction is the class whose probability is maximal: y p red = argmax i P (Y = i x, W, b) (2.2)

12 8 CHAPTER 3. IMPLEMENTATION Chapter 3 Implementation Of the two available toolboxes suggested for the implementation, i.e. Theano ([1] and [2]) 1 and Caffé 2, the former was chosen. On the one hand Caffé is allegedly the fastest library for implementing machine learning algorithms and is written in C. Theano, on the other hand is a Python library, allowing fastest prototyping and testing, in spite of less computational speed, which was anyway not the main topic of the project. Moreover, the Theano framework was handled through the use of Lasagne, a lightweight Python library for building and training neural networks. 3.1 Network architecture The tested architecture for the convolutional network was built in the following way: Input Layer The input layer has been built as a square of 80x80 units. In fact, this will be the size of the input image in terms of pixels. Moreover, this images are coded as one-channel input, meaning that are described by greyscale levels, instead of RGB values. These choices will be motivated in the following paragraphs. Hidden Convolutional Layer This convolutional layer defines 32 feature maps. Their local receptive field is a square of 5x5 pixels. Right after the feature maps a max-poolinng layer is set, with a pool size of 2x2 pixels. Hidden Convolutional Layer Consequently, another convolutional layer is added with the same parameters: 32 feature maps, with a 5x5 local receptive field followed by a max-pooling layer with 2x2 pool size

13 3.2. DATABASE 9 Figure 3.1: Four classes. From top to bottom, from left to right: watering can, plant, bookcase, table Fully-connected Hidden Layer A layer with 256 units fully connected with the the previous layer. Logistic Regression Output Layer A 6 unit fully connected Logistic regressor for obtaining the output of the network, one unit for each class. 3.2 Database Being tested in the corridor of an academic department, the trained network should at last recognize some typical landmarks characteric of the ambient. Though the choice was arbitrary, some technological constraints rose: Camera contraints: The video-recording camera placed on the mobile robot was able to get images of size 640x480 pixels, taken from almost ground level, thus restricting the choice of landmarks to small and floor-level objects. Class information content: The choice, for instance, of a white wall as landmark would have lead to a class with a very limited amount of information, since a white wall in most cases was also the background of the other object images. The choice then fell on the following object classes: table, fire extinguisher, bookcase, watering can, plant and trash bin. In Fig. 3.1 are shown some examples of the raw images. In order to train a network able to recognize objects in different conditions with respect to the database, images have been taken with different background condition

14 10 CHAPTER 3. IMPLEMENTATION Figure 3.2: Different images of the same object: trash bin Figure 3.3: Multiple object images and different object orientation. In Fig. 3.2 is shown how the attempt to train the network to focus only on the invariances of an object reflects on the database. The database, containig 3000 images is then subdivided in 3 sub-datasets for training, validation and testing. Some more images containing multiple objects have been taken for demonstration purposes and are represented in Fig Image pre-processing Images recorded by the camera on the mobile robot are RGB files of dimension 640x480 pixels. Pre-processing was need for avoiding: Memory overload: Since in Theano and Lasagne all data have to be loaded

15 3.3. IMAGE PRE-PROCESSING 11 Figure 3.4: Comparison of informations contained in the luminance plane (top right), and in the chrominance plane (bottom row) with respect to the original image (top left). at the same time during training, and the database consisting of 3000 images, the machine on which the project was carried out ran out of memory, making it impossible to complete the training of the network. Time consuption: By using raw images the CNN would have required way more time to complete training, thus slowing the testing of the code, with no actual benefit, since lots of informations contained in the images are no help for the classification problem. The proposed solution to these problems requires two steps of image manipulation. First the object contained in the image was cropped, and only a subset of pixels was kept for training. From the original image only a square of variable size (400x400 for the table class, 300x300 for the others) was obtained and a following resizing to 80x80 was applied. Secondly, the image was turned from a RGB to a greyscale file, factorizing by 3 the efforts required by the network for training, yet keeping a vast amount of information. In fact, as shown in Fig. 3.4, the most of it is contained in luminance and not in chrominance. The result of the two steps of image manipulation can be seen in Fig. 3.5.

16 12 CHAPTER 3. IMPLEMENTATION Figure 3.5: Fire extinguisher after applying cropping, resizing, and greyscaling

17 13 Chapter 4 Results Training the network for 100 epochs produced the following results. In Fig. 4.1 is shown the trend of validation accuracy, i.e. the percentage of samples in the validation set assigned to the correct class. It rises rapidly to %, and then saturates, never reaching 100 %. Other tests have been tried for clearing network performances: Single object testing Multiple object with manual framing Multiple object with automatic framing 4.1 Single object testing A manual testing of the actual performace of the CNN consisted in asking the model for prediction on single images, after training and saving model parameters. Fed images are shown in Fig. 4.2 and the results in Fig. 4.3 Figure 4.1: Validation accuracy (%) for every epoch during training

18 14 CHAPTER 4. RESULTS Figure 4.2: Images fed as input for manual testing. From left to right, top to bottom: 405.jpg, 966.jpg, 1426.jpg, 1910.jpg, 2436.jpg, 2593.jpg Figure 4.3: Results of classification for images represented in Fig. 4.2

19 4.2. MULTIPLE OBJECTS WITH MANUAL FRAMING 15 Figure 4.4: Single objects cropped from the top row images of Fig. 2.jpg, 3.jpg, 4.jpg 3.3: 1.jpg, Figure 4.5: Classification results of images in Fig Multiple objects with manual framing From the top row images showed in Fig.3.3 containing multiple objects, single object images have been obtained and are represented in Fig.4.4.Results are shown in Fig Three out of four images have been classified correctly. The CNN failed on the image of a plant, where the great part of the plant is actually cut off the photo, and basically the prediction has been made only from the vase, mistaken for the trash bin. 4.3 Multiple objects with automatic framing This test required the trained CNN to check a multiple objects image for finding and labeling them. This task was approched by making a squared frame run through the image, and by saving for each class the highest confidence level retrieved among all squared frames. After this process, if a class showed a confidence level higher than an a-priori value set to 99.4%, the corresponding object coordinates are framed and labeled accordingly. Since this euristic would have required a lot of time for a manual efficient determination, this test did not reached great results. Moreover, objects standing on the sides of the image clearly introduce a further complication, since they cannot be centered in the frame, as the network would espect from the training dataset.

20 16 CHAPTER 4. RESULTS Figure 4.6: Multiple object on-image-labeling with automatic framing: detected objects are framed and labeled on the the top left corner of the square Nevertheless, on the bottom row image of Fig 3.3 a satisfying result was obtained, and showed on Fig.4.6.

21 17 Chapter 5 Conclusion As seen in 4, object classification through CNN has had some interesting results. In fact, once the model is trained, making predictions on new images does not require a lot of computational power and at the same time shows, under some constraints, good performances. One of the most tricky issues showed to be the fact that when dealing with multiple objects contained in the same image, which is actually a real world scenario, automatic framing can visibly lower performance, due to the following reasons: Objects cannot always be centered in a squared frame, for instance if they stand by the sides of the picture. As it is now, the algorithm used for automatic framing lacks of the characteristic to be scaling invariant, meaning that the dimension of the object matters and may lead to failed classification or recognition. Moreover, from some objects, like for instance the table, it is difficult to extract features invariant from orientation, and this can complicate the task of recognizing the object itself from different points of view. Hence, results evident that a great enphases should be put on the building of the database and pre-processing, for solving current issues and enhancing preformances. At last, since the vast majority of the time spent on the project was dedicated to get the CNN working, no chance to adopt and try different architectures was possible.

22 18 LIST OF FIGURES List of Figures 2.1 Local receptive field: How neurons are connected in different layers Four classes. From top to bottom, from left to right: watering can, plant, bookcase, table Different images of the same object: trash bin Multiple object images Comparison of informations contained in the luminance plane (top right), and in the chrominance plane (bottom row) with respect to the original image (top left) Fire extinguisher after applying cropping, resizing, and greyscaling Validation accuracy (%) for every epoch during training Images fed as input for manual testing. From left to right, top to bottom: 405.jpg, 966.jpg, 1426.jpg, 1910.jpg, 2436.jpg, 2593.jpg Results of classification for images represented in Fig Single objects cropped from the top row images of Fig. 3.3: 1.jpg, 2.jpg, 3.jpg, 4.jpg Classification results of images in Fig Multiple object on-image-labeling with automatic framing: detected objects are framed and labeled on the the top left corner of the square 16

23 BIBLIOGRAPHY 19 Bibliography [1] Frédéric Bastien, Pascal Lamblin, Razvan Pascanu, James Bergstra, Ian J. Goodfellow, Arnaud Bergeron, Nicolas Bouchard, and Yoshua Bengio. Theano: new features and speed improvements. Deep Learning and Unsupervised Feature Learning NIPS 2012 Workshop, [2] James Bergstra, Olivier Breuleux, Frédéric Bastien, Pascal Lamblin, Razvan Pascanu, Guillaume Desjardins, Joseph Turian, David Warde-Farley, and Yoshua Bengio. Theano: a CPU and GPU math expression compiler, June Oral Presentation. [3] Yann LeCun and Yoshua Bengio. Convolutional networks for images, speech, and time series. The handbook of brain theory and neural networks, 3361(10), [4] Michael A. Nielsen. Neural Networks and Deep Learning. Determination Press, 2005.

24 20 BIBLIOGRAPHY License This work is licensed under the Creative Commons Attribution 3.0 Germany License. To view a copy of this license, visit or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California 94105, USA.

Coursework 2. MLP Lecture 7 Convolutional Networks 1

Coursework 2. MLP Lecture 7 Convolutional Networks 1 Coursework 2 MLP Lecture 7 Convolutional Networks 1 Coursework 2 - Overview and Objectives Overview: Use a selection of the techniques covered in the course so far to train accurate multi-layer networks

More information

Attention-based Multi-Encoder-Decoder Recurrent Neural Networks

Attention-based Multi-Encoder-Decoder Recurrent Neural Networks Attention-based Multi-Encoder-Decoder Recurrent Neural Networks Stephan Baier 1, Sigurd Spieckermann 2 and Volker Tresp 1,2 1- Ludwig Maximilian University Oettingenstr. 67, Munich, Germany 2- Siemens

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning Deep Learning Barnabás Póczos Credits Many of the pictures, results, and other materials are taken from: Ruslan Salakhutdinov Joshua Bengio Geoffrey Hinton Yann LeCun 2

More information

Generating an appropriate sound for a video using WaveNet.

Generating an appropriate sound for a video using WaveNet. Australian National University College of Engineering and Computer Science Master of Computing Generating an appropriate sound for a video using WaveNet. COMP 8715 Individual Computing Project Taku Ueki

More information

Number Plate Detection with a Multi-Convolutional Neural Network Approach with Optical Character Recognition for Mobile Devices

Number Plate Detection with a Multi-Convolutional Neural Network Approach with Optical Character Recognition for Mobile Devices J Inf Process Syst, Vol.12, No.1, pp.100~108, March 2016 http://dx.doi.org/10.3745/jips.04.0022 ISSN 1976-913X (Print) ISSN 2092-805X (Electronic) Number Plate Detection with a Multi-Convolutional Neural

More information

Research on Hand Gesture Recognition Using Convolutional Neural Network

Research on Hand Gesture Recognition Using Convolutional Neural Network Research on Hand Gesture Recognition Using Convolutional Neural Network Tian Zhaoyang a, Cheng Lee Lung b a Department of Electronic Engineering, City University of Hong Kong, Hong Kong, China E-mail address:

More information

arxiv: v1 [cs.ce] 9 Jan 2018

arxiv: v1 [cs.ce] 9 Jan 2018 Predict Forex Trend via Convolutional Neural Networks Yun-Cheng Tsai, 1 Jun-Hao Chen, 2 Jun-Jie Wang 3 arxiv:1801.03018v1 [cs.ce] 9 Jan 2018 1 Center for General Education 2,3 Department of Computer Science

More information

Attention-based Information Fusion using Multi-Encoder-Decoder Recurrent Neural Networks

Attention-based Information Fusion using Multi-Encoder-Decoder Recurrent Neural Networks Attention-based Information Fusion using Multi-Encoder-Decoder Recurrent Neural Networks Stephan Baier1, Sigurd Spieckermann2 and Volker Tresp1,2 1- Ludwig Maximilian University Oettingenstr. 67, Munich,

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning Perceptron Barnabás Póczos Contents History of Artificial Neural Networks Definitions: Perceptron, Multi-Layer Perceptron Perceptron algorithm 2 Short History of Artificial

More information

Biologically Inspired Computation

Biologically Inspired Computation Biologically Inspired Computation Deep Learning & Convolutional Neural Networks Joe Marino biologically inspired computation biological intelligence flexible capable of detecting/ executing/reasoning about

More information

Image Extraction using Image Mining Technique

Image Extraction using Image Mining Technique IOSR Journal of Engineering (IOSRJEN) e-issn: 2250-3021, p-issn: 2278-8719 Vol. 3, Issue 9 (September. 2013), V2 PP 36-42 Image Extraction using Image Mining Technique Prof. Samir Kumar Bandyopadhyay,

More information

CSC 578 Neural Networks and Deep Learning

CSC 578 Neural Networks and Deep Learning CSC 578 Neural Networks and Deep Learning Fall 2018/19 6. Convolutional Neural Networks (Some figures adapted from NNDL book) 1 Convolution Neural Networks 1. Convolutional Neural Networks Convolution,

More information

Convolutional Neural Networks: Real Time Emotion Recognition

Convolutional Neural Networks: Real Time Emotion Recognition Convolutional Neural Networks: Real Time Emotion Recognition Bruce Nguyen, William Truong, Harsha Yeddanapudy Motivation: Machine emotion recognition has long been a challenge and popular topic in the

More information

GPU ACCELERATED DEEP LEARNING WITH CUDNN

GPU ACCELERATED DEEP LEARNING WITH CUDNN GPU ACCELERATED DEEP LEARNING WITH CUDNN Larry Brown Ph.D. March 2015 AGENDA 1 Introducing cudnn and GPUs 2 Deep Learning Context 3 cudnn V2 4 Using cudnn 2 Introducing cudnn and GPUs 3 HOW GPU ACCELERATION

More information

Lesson 08. Convolutional Neural Network. Ing. Marek Hrúz, Ph.D. Katedra Kybernetiky Fakulta aplikovaných věd Západočeská univerzita v Plzni.

Lesson 08. Convolutional Neural Network. Ing. Marek Hrúz, Ph.D. Katedra Kybernetiky Fakulta aplikovaných věd Západočeská univerzita v Plzni. Lesson 08 Convolutional Neural Network Ing. Marek Hrúz, Ph.D. Katedra Kybernetiky Fakulta aplikovaných věd Západočeská univerzita v Plzni Lesson 08 Convolution we will consider 2D convolution the result

More information

An Introduction to Convolutional Neural Networks. Alessandro Giusti Dalle Molle Institute for Artificial Intelligence Lugano, Switzerland

An Introduction to Convolutional Neural Networks. Alessandro Giusti Dalle Molle Institute for Artificial Intelligence Lugano, Switzerland An Introduction to Convolutional Neural Networks Alessandro Giusti Dalle Molle Institute for Artificial Intelligence Lugano, Switzerland Sources & Resources - Andrej Karpathy, CS231n http://cs231n.github.io/convolutional-networks/

More information

Radio Deep Learning Efforts Showcase Presentation

Radio Deep Learning Efforts Showcase Presentation Radio Deep Learning Efforts Showcase Presentation November 2016 hume@vt.edu www.hume.vt.edu Tim O Shea Senior Research Associate Program Overview Program Objective: Rethink fundamental approaches to how

More information

SIMULATION-BASED MODEL CONTROL USING STATIC HAND GESTURES IN MATLAB

SIMULATION-BASED MODEL CONTROL USING STATIC HAND GESTURES IN MATLAB SIMULATION-BASED MODEL CONTROL USING STATIC HAND GESTURES IN MATLAB S. Kajan, J. Goga Institute of Robotics and Cybernetics, Faculty of Electrical Engineering and Information Technology, Slovak University

More information

CLASSLESS ASSOCIATION USING NEURAL NETWORKS

CLASSLESS ASSOCIATION USING NEURAL NETWORKS Workshop track - ICLR 1 CLASSLESS ASSOCIATION USING NEURAL NETWORKS Federico Raue 1,, Sebastian Palacio, Andreas Dengel 1,, Marcus Liwicki 1 1 University of Kaiserslautern, Germany German Research Center

More information

Deep Learning. Dr. Johan Hagelbäck.

Deep Learning. Dr. Johan Hagelbäck. Deep Learning Dr. Johan Hagelbäck johan.hagelback@lnu.se http://aiguy.org Image Classification Image classification can be a difficult task Some of the challenges we have to face are: Viewpoint variation:

More information

Image Manipulation Detection using Convolutional Neural Network

Image Manipulation Detection using Convolutional Neural Network Image Manipulation Detection using Convolutional Neural Network Dong-Hyun Kim 1 and Hae-Yeoun Lee 2,* 1 Graduate Student, 2 PhD, Professor 1,2 Department of Computer Software Engineering, Kumoh National

More information

CROSS-LAYER FEATURES IN CONVOLUTIONAL NEURAL NETWORKS FOR GENERIC CLASSIFICATION TASKS. Kuan-Chuan Peng and Tsuhan Chen

CROSS-LAYER FEATURES IN CONVOLUTIONAL NEURAL NETWORKS FOR GENERIC CLASSIFICATION TASKS. Kuan-Chuan Peng and Tsuhan Chen CROSS-LAYER FEATURES IN CONVOLUTIONAL NEURAL NETWORKS FOR GENERIC CLASSIFICATION TASKS Kuan-Chuan Peng and Tsuhan Chen Cornell University School of Electrical and Computer Engineering Ithaca, NY 14850

More information

11/13/18. Introduction to RNNs for NLP. About Me. Overview SHANG GAO

11/13/18. Introduction to RNNs for NLP. About Me. Overview SHANG GAO Introduction to RNNs for NLP SHANG GAO About Me PhD student in the Data Science and Engineering program Took Deep Learning last year Work in the Biomedical Sciences, Engineering, and Computing group at

More information

DETECTION AND RECOGNITION OF HAND GESTURES TO CONTROL THE SYSTEM APPLICATIONS BY NEURAL NETWORKS. P.Suganya, R.Sathya, K.

DETECTION AND RECOGNITION OF HAND GESTURES TO CONTROL THE SYSTEM APPLICATIONS BY NEURAL NETWORKS. P.Suganya, R.Sathya, K. Volume 118 No. 10 2018, 399-405 ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version) url: http://www.ijpam.eu doi: 10.12732/ijpam.v118i10.40 ijpam.eu DETECTION AND RECOGNITION OF HAND GESTURES

More information

Image Finder Mobile Application Based on Neural Networks

Image Finder Mobile Application Based on Neural Networks Image Finder Mobile Application Based on Neural Networks Nabil M. Hewahi Department of Computer Science, College of Information Technology, University of Bahrain, Sakheer P.O. Box 32038, Kingdom of Bahrain

More information

Discriminative Enhancement for Single Channel Audio Source Separation using Deep Neural Networks

Discriminative Enhancement for Single Channel Audio Source Separation using Deep Neural Networks Discriminative Enhancement for Single Channel Audio Source Separation using Deep Neural Networks Emad M. Grais, Gerard Roma, Andrew J.R. Simpson, and Mark D. Plumbley Centre for Vision, Speech and Signal

More information

University of Bristol - Explore Bristol Research. Peer reviewed version. Link to publication record in Explore Bristol Research PDF-document

University of Bristol - Explore Bristol Research. Peer reviewed version. Link to publication record in Explore Bristol Research PDF-document Hepburn, A., McConville, R., & Santos-Rodriguez, R. (2017). Album cover generation from genre tags. Paper presented at 10th International Workshop on Machine Learning and Music, Barcelona, Spain. Peer

More information

Convolutional Networks Overview

Convolutional Networks Overview Convolutional Networks Overview Sargur Srihari 1 Topics Limitations of Conventional Neural Networks The convolution operation Convolutional Networks Pooling Convolutional Network Architecture Advantages

More information

Classifying the Brain's Motor Activity via Deep Learning

Classifying the Brain's Motor Activity via Deep Learning Final Report Classifying the Brain's Motor Activity via Deep Learning Tania Morimoto & Sean Sketch Motivation Over 50 million Americans suffer from mobility or dexterity impairments. Over the past few

More information

Improving reverberant speech separation with binaural cues using temporal context and convolutional neural networks

Improving reverberant speech separation with binaural cues using temporal context and convolutional neural networks Improving reverberant speech separation with binaural cues using temporal context and convolutional neural networks Alfredo Zermini, Qiuqiang Kong, Yong Xu, Mark D. Plumbley, Wenwu Wang Centre for Vision,

More information

arxiv: v3 [cs.cv] 18 Dec 2018

arxiv: v3 [cs.cv] 18 Dec 2018 Video Colorization using CNNs and Keyframes extraction: An application in saving bandwidth Ankur Singh 1 Anurag Chanani 2 Harish Karnick 3 arxiv:1812.03858v3 [cs.cv] 18 Dec 2018 Abstract In this paper,

More information

Convolutional Networks for Image Segmentation: U-Net 1, DeconvNet 2, and SegNet 3

Convolutional Networks for Image Segmentation: U-Net 1, DeconvNet 2, and SegNet 3 Convolutional Networks for Image Segmentation: U-Net 1, DeconvNet 2, and SegNet 3 1 Olaf Ronneberger, Philipp Fischer, Thomas Brox (Freiburg, Germany) 2 Hyeonwoo Noh, Seunghoon Hong, Bohyung Han (POSTECH,

More information

MINE 432 Industrial Automation and Robotics

MINE 432 Industrial Automation and Robotics MINE 432 Industrial Automation and Robotics Part 3, Lecture 5 Overview of Artificial Neural Networks A. Farzanegan (Visiting Associate Professor) Fall 2014 Norman B. Keevil Institute of Mining Engineering

More information

Comparing Time and Frequency Domain for Audio Event Recognition Using Deep Learning

Comparing Time and Frequency Domain for Audio Event Recognition Using Deep Learning Comparing Time and Frequency Domain for Audio Event Recognition Using Deep Learning Lars Hertel, Huy Phan and Alfred Mertins Institute for Signal Processing, University of Luebeck, Germany Graduate School

More information

Automated Planetary Terrain Mapping of Mars Using Image Pattern Recognition

Automated Planetary Terrain Mapping of Mars Using Image Pattern Recognition Automated Planetary Terrain Mapping of Mars Using Image Pattern Recognition Design Document Version 2.0 Team Strata: Sean Baquiro Matthew Enright Jorge Felix Tsosie Schneider 2 Table of Contents 1 Introduction.3

More information

A Neural Algorithm of Artistic Style (2015)

A Neural Algorithm of Artistic Style (2015) A Neural Algorithm of Artistic Style (2015) Leon A. Gatys, Alexander S. Ecker, Matthias Bethge Nancy Iskander (niskander@dgp.toronto.edu) Overview of Method Content: Global structure. Style: Colours; local

More information

Colorful Image Colorizations Supplementary Material

Colorful Image Colorizations Supplementary Material Colorful Image Colorizations Supplementary Material Richard Zhang, Phillip Isola, Alexei A. Efros {rich.zhang, isola, efros}@eecs.berkeley.edu University of California, Berkeley 1 Overview This document

More information

ECE 599/692 Deep Learning Lecture 19 Beyond BP and CNN

ECE 599/692 Deep Learning Lecture 19 Beyond BP and CNN ECE 599/692 Deep Learning Lecture 19 Beyond BP and CNN Hairong Qi, Gonzalez Family Professor Electrical Engineering and Computer Science University of Tennessee, Knoxville http://www.eecs.utk.edu/faculty/qi

More information

Wadehra Kartik, Kathpalia Mukul, Bahl Vasudha, International Journal of Advance Research, Ideas and Innovations in Technology

Wadehra Kartik, Kathpalia Mukul, Bahl Vasudha, International Journal of Advance Research, Ideas and Innovations in Technology ISSN: 2454-132X Impact factor: 4.295 (Volume 4, Issue 1) Available online at www.ijariit.com Hand Detection and Gesture Recognition in Real-Time Using Haar-Classification and Convolutional Neural Networks

More information

Mobile Cognitive Indoor Assistive Navigation for the Visually Impaired

Mobile Cognitive Indoor Assistive Navigation for the Visually Impaired 1 Mobile Cognitive Indoor Assistive Navigation for the Visually Impaired Bing Li 1, Manjekar Budhai 2, Bowen Xiao 3, Liang Yang 1, Jizhong Xiao 1 1 Department of Electrical Engineering, The City College,

More information

Deep Neural Network Architectures for Modulation Classification

Deep Neural Network Architectures for Modulation Classification Deep Neural Network Architectures for Modulation Classification Xiaoyu Liu, Diyu Yang, and Aly El Gamal School of Electrical and Computer Engineering Purdue University Email: {liu1962, yang1467, elgamala}@purdue.edu

More information

MLP for Adaptive Postprocessing Block-Coded Images

MLP for Adaptive Postprocessing Block-Coded Images 1450 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 10, NO. 8, DECEMBER 2000 MLP for Adaptive Postprocessing Block-Coded Images Guoping Qiu, Member, IEEE Abstract A new technique

More information

Tiny ImageNet Challenge Investigating the Scaling of Inception Layers for Reduced Scale Classification Problems

Tiny ImageNet Challenge Investigating the Scaling of Inception Layers for Reduced Scale Classification Problems Tiny ImageNet Challenge Investigating the Scaling of Inception Layers for Reduced Scale Classification Problems Emeric Stéphane Boigné eboigne@stanford.edu Jan Felix Heyse heyse@stanford.edu Abstract Scaling

More information

CONVOLUTIONAL NEURAL NETWORKS: MOTIVATION, CONVOLUTION OPERATION, ALEXNET

CONVOLUTIONAL NEURAL NETWORKS: MOTIVATION, CONVOLUTION OPERATION, ALEXNET CONVOLUTIONAL NEURAL NETWORKS: MOTIVATION, CONVOLUTION OPERATION, ALEXNET MOTIVATION Fully connected neural network Example 1000x1000 image 1M hidden units 10 12 (= 10 6 10 6 ) parameters! Observation

More information

An Hybrid MLP-SVM Handwritten Digit Recognizer

An Hybrid MLP-SVM Handwritten Digit Recognizer An Hybrid MLP-SVM Handwritten Digit Recognizer A. Bellili ½ ¾ M. Gilloux ¾ P. Gallinari ½ ½ LIP6, Université Pierre et Marie Curie ¾ La Poste 4, Place Jussieu 10, rue de l Ile Mabon, BP 86334 75252 Paris

More information

arxiv: v1 [cs.lg] 2 Jan 2018

arxiv: v1 [cs.lg] 2 Jan 2018 Deep Learning for Identifying Potential Conceptual Shifts for Co-creative Drawing arxiv:1801.00723v1 [cs.lg] 2 Jan 2018 Pegah Karimi pkarimi@uncc.edu Kazjon Grace The University of Sydney Sydney, NSW 2006

More information

Automated hand recognition as a human-computer interface

Automated hand recognition as a human-computer interface Automated hand recognition as a human-computer interface Sergii Shelpuk SoftServe, Inc. sergii.shelpuk@gmail.com Abstract This paper investigates applying Machine Learning to the problem of turning a regular

More information

Convolutional Neural Network-based Steganalysis on Spatial Domain

Convolutional Neural Network-based Steganalysis on Spatial Domain Convolutional Neural Network-based Steganalysis on Spatial Domain Dong-Hyun Kim, and Hae-Yeoun Lee Abstract Steganalysis has been studied to detect the existence of hidden messages by steganography. However,

More information

Recognition: Overview. Sanja Fidler CSC420: Intro to Image Understanding 1/ 83

Recognition: Overview. Sanja Fidler CSC420: Intro to Image Understanding 1/ 83 Recognition: Overview Sanja Fidler CSC420: Intro to Image Understanding 1/ 83 Textbook This book has a lot of material: K. Grauman and B. Leibe Visual Object Recognition Synthesis Lectures On Computer

More information

Embedding Artificial Intelligence into Our Lives

Embedding Artificial Intelligence into Our Lives Embedding Artificial Intelligence into Our Lives Michael Thompson, Synopsys D&R IP-SOC DAYS Santa Clara April 2018 1 Agenda Introduction What AI is and is Not Where AI is being used Rapid Advance of AI

More information

Impact of Automatic Feature Extraction in Deep Learning Architecture

Impact of Automatic Feature Extraction in Deep Learning Architecture Impact of Automatic Feature Extraction in Deep Learning Architecture Fatma Shaheen, Brijesh Verma and Md Asafuddoula Centre for Intelligent Systems Central Queensland University, Brisbane, Australia {f.shaheen,

More information

Adversarial Examples and Adversarial Training. Ian Goodfellow, OpenAI Research Scientist Presentation at Quora,

Adversarial Examples and Adversarial Training. Ian Goodfellow, OpenAI Research Scientist Presentation at Quora, Adversarial Examples and Adversarial Training Ian Goodfellow, OpenAI Research Scientist Presentation at Quora, 2016-08-04 In this presentation Intriguing Properties of Neural Networks Szegedy et al, 2013

More information

Lecture 17 Convolutional Neural Networks

Lecture 17 Convolutional Neural Networks Lecture 17 Convolutional Neural Networks 30 March 2016 Taylor B. Arnold Yale Statistics STAT 365/665 1/22 Notes: Problem set 6 is online and due next Friday, April 8th Problem sets 7,8, and 9 will be due

More information

INFORMATION about image authenticity can be used in

INFORMATION about image authenticity can be used in 1 Constrained Convolutional Neural Networs: A New Approach Towards General Purpose Image Manipulation Detection Belhassen Bayar, Student Member, IEEE, and Matthew C. Stamm, Member, IEEE Abstract Identifying

More information

GESTURE RECOGNITION FOR ROBOTIC CONTROL USING DEEP LEARNING

GESTURE RECOGNITION FOR ROBOTIC CONTROL USING DEEP LEARNING 2017 NDIA GROUND VEHICLE SYSTEMS ENGINEERING AND TECHNOLOGY SYMPOSIUM AUTONOMOUS GROUND SYSTEMS (AGS) TECHNICAL SESSION AUGUST 8-10, 2017 - NOVI, MICHIGAN GESTURE RECOGNITION FOR ROBOTIC CONTROL USING

More information

EMERGENCE OF FOVEAL IMAGE SAMPLING FROM

EMERGENCE OF FOVEAL IMAGE SAMPLING FROM EMERGENCE OF FOVEAL IMAGE SAMPLING FROM LEARNING TO ATTEND IN VISUAL SCENES Brian Cheung, Eric Weiss, Bruno Olshausen Redwood Center UC Berkeley {bcheung,eaweiss,baolshausen}@berkeley.edu ABSTRACT We describe

More information

INTRODUCTION TO DEEP LEARNING. Steve Tjoa June 2013

INTRODUCTION TO DEEP LEARNING. Steve Tjoa June 2013 INTRODUCTION TO DEEP LEARNING Steve Tjoa kiemyang@gmail.com June 2013 Acknowledgements http://ufldl.stanford.edu/wiki/index.php/ UFLDL_Tutorial http://youtu.be/ayzoubkuf3m http://youtu.be/zmnoatzigik 2

More information

CHAPTER 8: EXTENDED TETRACHORD CLASSIFICATION

CHAPTER 8: EXTENDED TETRACHORD CLASSIFICATION CHAPTER 8: EXTENDED TETRACHORD CLASSIFICATION Chapter 7 introduced the notion of strange circles: using various circles of musical intervals as equivalence classes to which input pitch-classes are assigned.

More information

Intelligent Non-Player Character with Deep Learning. Intelligent Non-Player Character with Deep Learning 1

Intelligent Non-Player Character with Deep Learning. Intelligent Non-Player Character with Deep Learning 1 Intelligent Non-Player Character with Deep Learning Meng Zhixiang, Zhang Haoze Supervised by Prof. Michael Lyu CUHK CSE FYP Term 1 Intelligent Non-Player Character with Deep Learning 1 Intelligent Non-Player

More information

Adversarial Examples and Adversarial Training. Ian Goodfellow, OpenAI Research Scientist Presentation at HORSE 2016 London,

Adversarial Examples and Adversarial Training. Ian Goodfellow, OpenAI Research Scientist Presentation at HORSE 2016 London, Adversarial Examples and Adversarial Training Ian Goodfellow, OpenAI Research Scientist Presentation at HORSE 2016 London, 2016-09-19 In this presentation Intriguing Properties of Neural Networks Szegedy

More information

Artificial Intelligence: Using Neural Networks for Image Recognition

Artificial Intelligence: Using Neural Networks for Image Recognition Kankanahalli 1 Sri Kankanahalli Natalie Kelly Independent Research 12 February 2010 Artificial Intelligence: Using Neural Networks for Image Recognition Abstract: The engineering goals of this experiment

More information

Vehicle Color Recognition using Convolutional Neural Network

Vehicle Color Recognition using Convolutional Neural Network Vehicle Color Recognition using Convolutional Neural Network Reza Fuad Rachmadi and I Ketut Eddy Purnama Multimedia and Network Engineering Department, Institut Teknologi Sepuluh Nopember, Keputih Sukolilo,

More information

Convolutional neural networks

Convolutional neural networks Convolutional neural networks Themes Curriculum: Ch 9.1, 9.2 and http://cs231n.github.io/convolutionalnetworks/ The simple motivation and idea How it s done Receptive field Pooling Dilated convolutions

More information

Robust Hand Gesture Recognition for Robotic Hand Control

Robust Hand Gesture Recognition for Robotic Hand Control Robust Hand Gesture Recognition for Robotic Hand Control Ankit Chaudhary Robust Hand Gesture Recognition for Robotic Hand Control 123 Ankit Chaudhary Department of Computer Science Northwest Missouri State

More information

Decoding Brainwave Data using Regression

Decoding Brainwave Data using Regression Decoding Brainwave Data using Regression Justin Kilmarx: The University of Tennessee, Knoxville David Saffo: Loyola University Chicago Lucien Ng: The Chinese University of Hong Kong Mentor: Dr. Xiaopeng

More information

Playing Atari Games with Deep Reinforcement Learning

Playing Atari Games with Deep Reinforcement Learning Playing Atari Games with Deep Reinforcement Learning 1 Playing Atari Games with Deep Reinforcement Learning Varsha Lalwani (varshajn@iitk.ac.in) Masare Akshay Sunil (amasare@iitk.ac.in) IIT Kanpur CS365A

More information

On Intelligence Jeff Hawkins

On Intelligence Jeff Hawkins On Intelligence Jeff Hawkins Chapter 8: The Future of Intelligence April 27, 2006 Presented by: Melanie Swan, Futurist MS Futures Group 650-681-9482 m@melanieswan.com http://www.melanieswan.com Building

More information

Preprocessing and Segregating Offline Gujarati Handwritten Datasheet for Character Recognition

Preprocessing and Segregating Offline Gujarati Handwritten Datasheet for Character Recognition Preprocessing and Segregating Offline Gujarati Handwritten Datasheet for Character Recognition Hetal R. Thaker Atmiya Institute of Technology & science, Kalawad Road, Rajkot Gujarat, India C. K. Kumbharana,

More information

COMPARATIVE PERFORMANCE ANALYSIS OF HAND GESTURE RECOGNITION TECHNIQUES

COMPARATIVE PERFORMANCE ANALYSIS OF HAND GESTURE RECOGNITION TECHNIQUES International Journal of Advanced Research in Engineering and Technology (IJARET) Volume 9, Issue 3, May - June 2018, pp. 177 185, Article ID: IJARET_09_03_023 Available online at http://www.iaeme.com/ijaret/issues.asp?jtype=ijaret&vtype=9&itype=3

More information

Transactions on Information and Communications Technologies vol 1, 1993 WIT Press, ISSN

Transactions on Information and Communications Technologies vol 1, 1993 WIT Press,   ISSN Combining multi-layer perceptrons with heuristics for reliable control chart pattern classification D.T. Pham & E. Oztemel Intelligent Systems Research Laboratory, School of Electrical, Electronic and

More information

Distance Estimation and Localization of Sound Sources in Reverberant Conditions using Deep Neural Networks

Distance Estimation and Localization of Sound Sources in Reverberant Conditions using Deep Neural Networks Distance Estimation and Localization of Sound Sources in Reverberant Conditions using Deep Neural Networks Mariam Yiwere 1 and Eun Joo Rhee 2 1 Department of Computer Engineering, Hanbat National University,

More information

Derek Allman a, Austin Reiter b, and Muyinatu Bell a,c

Derek Allman a, Austin Reiter b, and Muyinatu Bell a,c Exploring the effects of transducer models when training convolutional neural networks to eliminate reflection artifacts in experimental photoacoustic images Derek Allman a, Austin Reiter b, and Muyinatu

More information

Handling Emotions in Human-Computer Dialogues

Handling Emotions in Human-Computer Dialogues Handling Emotions in Human-Computer Dialogues Johannes Pittermann Angela Pittermann Wolfgang Minker Handling Emotions in Human-Computer Dialogues ABC Johannes Pittermann Universität Ulm Inst. Informationstechnik

More information

CHAPTER 6 BACK PROPAGATED ARTIFICIAL NEURAL NETWORK TRAINED ARHF

CHAPTER 6 BACK PROPAGATED ARTIFICIAL NEURAL NETWORK TRAINED ARHF 95 CHAPTER 6 BACK PROPAGATED ARTIFICIAL NEURAL NETWORK TRAINED ARHF 6.1 INTRODUCTION An artificial neural network (ANN) is an information processing model that is inspired by biological nervous systems

More information

Application of Multi Layer Perceptron (MLP) for Shower Size Prediction

Application of Multi Layer Perceptron (MLP) for Shower Size Prediction Chapter 3 Application of Multi Layer Perceptron (MLP) for Shower Size Prediction 3.1 Basic considerations of the ANN Artificial Neural Network (ANN)s are non- parametric prediction tools that can be used

More information

arxiv: v1 [cs.lg] 30 May 2016

arxiv: v1 [cs.lg] 30 May 2016 Deep Reinforcement Learning Radio Control and Signal Detection with KeRLym, a Gym RL Agent Timothy J O Shea and T. Charles Clancy Virginia Polytechnic Institute and State University arxiv:1605.09221v1

More information

The Automatic Classification Problem. Perceptrons, SVMs, and Friends: Some Discriminative Models for Classification

The Automatic Classification Problem. Perceptrons, SVMs, and Friends: Some Discriminative Models for Classification Perceptrons, SVMs, and Friends: Some Discriminative Models for Classification Parallel to AIMA 8., 8., 8.6.3, 8.9 The Automatic Classification Problem Assign object/event or sequence of objects/events

More information

Proposers Day Workshop

Proposers Day Workshop Proposers Day Workshop Monday, January 23, 2017 @srcjump, #JUMPpdw Cognitive Computing Vertical Research Center Mandy Pant Academic Research Director Intel Corporation Center Motivation Today s deep learning

More information

Libyan Licenses Plate Recognition Using Template Matching Method

Libyan Licenses Plate Recognition Using Template Matching Method Journal of Computer and Communications, 2016, 4, 62-71 Published Online May 2016 in SciRes. http://www.scirp.org/journal/jcc http://dx.doi.org/10.4236/jcc.2016.47009 Libyan Licenses Plate Recognition Using

More information

A Robust Neural Robot Navigation Using a Combination of Deliberative and Reactive Control Architectures

A Robust Neural Robot Navigation Using a Combination of Deliberative and Reactive Control Architectures A Robust Neural Robot Navigation Using a Combination of Deliberative and Reactive Control Architectures D.M. Rojas Castro, A. Revel and M. Ménard * Laboratory of Informatics, Image and Interaction (L3I)

More information

Scrabble Board Automatic Detector for Third Party Applications

Scrabble Board Automatic Detector for Third Party Applications Scrabble Board Automatic Detector for Third Party Applications David Hirschberg Computer Science Department University of California, Irvine hirschbd@uci.edu Abstract Abstract Scrabble is a well-known

More information

Comparison of Google Image Search and ResNet Image Classification Using Image Similarity Metrics

Comparison of Google Image Search and ResNet Image Classification Using Image Similarity Metrics University of Arkansas, Fayetteville ScholarWorks@UARK Computer Science and Computer Engineering Undergraduate Honors Theses Computer Science and Computer Engineering 5-2018 Comparison of Google Image

More information

Enhancing Symmetry in GAN Generated Fashion Images

Enhancing Symmetry in GAN Generated Fashion Images Enhancing Symmetry in GAN Generated Fashion Images Vishnu Makkapati 1 and Arun Patro 2 1 Myntra Designs Pvt. Ltd., Bengaluru - 560068, India vishnu.makkapati@myntra.com 2 Department of Electrical Engineering,

More information

TOOLS AND PROCESSORS FOR COMPUTER VISION. Selected Results from the Embedded Vision Alliance s Spring 2017 Computer Vision Developer Survey

TOOLS AND PROCESSORS FOR COMPUTER VISION. Selected Results from the Embedded Vision Alliance s Spring 2017 Computer Vision Developer Survey TOOLS AND PROCESSORS FOR COMPUTER VISION Selected Results from the Embedded Vision Alliance s Spring 2017 Computer Vision Developer Survey 1 EXECUTIVE SUMMARY Since 2015, the Embedded Vision Alliance has

More information

Multimedia Forensics

Multimedia Forensics Multimedia Forensics Using Mathematics and Machine Learning to Determine an Image's Source and Authenticity Matthew C. Stamm Multimedia & Information Security Lab (MISL) Department of Electrical and Computer

More information

Creating an Agent of Doom: A Visual Reinforcement Learning Approach

Creating an Agent of Doom: A Visual Reinforcement Learning Approach Creating an Agent of Doom: A Visual Reinforcement Learning Approach Michael Lowney Department of Electrical Engineering Stanford University mlowney@stanford.edu Robert Mahieu Department of Electrical Engineering

More information

DESIGN & DEVELOPMENT OF COLOR MATCHING ALGORITHM FOR IMAGE RETRIEVAL USING HISTOGRAM AND SEGMENTATION TECHNIQUES

DESIGN & DEVELOPMENT OF COLOR MATCHING ALGORITHM FOR IMAGE RETRIEVAL USING HISTOGRAM AND SEGMENTATION TECHNIQUES International Journal of Information Technology and Knowledge Management July-December 2011, Volume 4, No. 2, pp. 585-589 DESIGN & DEVELOPMENT OF COLOR MATCHING ALGORITHM FOR IMAGE RETRIEVAL USING HISTOGRAM

More information

Norsk Regnesentral (NR) Norwegian Computing Center

Norsk Regnesentral (NR) Norwegian Computing Center Norsk Regnesentral (NR) Norwegian Computing Center Petter Abrahamsen Joining Forces 2018 www.nr.no NUSSE: - 512 9-digit numbers - 200 additions/second Our latest servers: - Four Titan X GPUs - 14 336 cores

More information

Chapter 2 Transformation Invariant Image Recognition Using Multilayer Perceptron 2.1 Introduction

Chapter 2 Transformation Invariant Image Recognition Using Multilayer Perceptron 2.1 Introduction Chapter 2 Transformation Invariant Image Recognition Using Multilayer Perceptron 2.1 Introduction A multilayer perceptron (MLP) [52, 53] comprises an input layer, any number of hidden layers and an output

More information

Research on Application of Conjoint Neural Networks in Vehicle License Plate Recognition

Research on Application of Conjoint Neural Networks in Vehicle License Plate Recognition International Journal of Engineering Research and Technology. ISSN 0974-3154 Volume 11, Number 10 (2018), pp. 1499-1510 International Research Publication House http://www.irphouse.com Research on Application

More information

ENHANCED HUMAN-AGENT INTERACTION: AUGMENTING INTERACTION MODELS WITH EMBODIED AGENTS BY SERAFIN BENTO. MASTER OF SCIENCE in INFORMATION SYSTEMS

ENHANCED HUMAN-AGENT INTERACTION: AUGMENTING INTERACTION MODELS WITH EMBODIED AGENTS BY SERAFIN BENTO. MASTER OF SCIENCE in INFORMATION SYSTEMS BY SERAFIN BENTO MASTER OF SCIENCE in INFORMATION SYSTEMS Edmonton, Alberta September, 2015 ABSTRACT The popularity of software agents demands for more comprehensive HAI design processes. The outcome of

More information

A.I in Automotive? Why and When.

A.I in Automotive? Why and When. A.I in Automotive? Why and When. AGENDA 01 02 03 04 Definitions A.I? A.I in automotive Now? Next big A.I breakthrough in Automotive 01 DEFINITIONS DEFINITIONS Artificial Intelligence Artificial Intelligence:

More information

ES 492: SCIENCE IN THE MOVIES

ES 492: SCIENCE IN THE MOVIES UNIVERSITY OF SOUTH ALABAMA ES 492: SCIENCE IN THE MOVIES LECTURE 5: ROBOTICS AND AI PRESENTER: HANNAH BECTON TODAY'S AGENDA 1. Robotics and Real-Time Systems 2. Reacting to the environment around them

More information

CS 7643: Deep Learning

CS 7643: Deep Learning CS 7643: Deep Learning Topics: Toeplitz matrices and convolutions = matrix-mult Dilated/a-trous convolutions Backprop in conv layers Transposed convolutions Dhruv Batra Georgia Tech HW1 extension 09/22

More information

CSC321 Lecture 11: Convolutional Networks

CSC321 Lecture 11: Convolutional Networks CSC321 Lecture 11: Convolutional Networks Roger Grosse Roger Grosse CSC321 Lecture 11: Convolutional Networks 1 / 35 Overview What makes vision hard? Vison needs to be robust to a lot of transformations

More information

FACE RECOGNITION USING NEURAL NETWORKS

FACE RECOGNITION USING NEURAL NETWORKS Int. J. Elec&Electr.Eng&Telecoms. 2014 Vinoda Yaragatti and Bhaskar B, 2014 Research Paper ISSN 2319 2518 www.ijeetc.com Vol. 3, No. 3, July 2014 2014 IJEETC. All Rights Reserved FACE RECOGNITION USING

More information

Classification for Motion Game Based on EEG Sensing

Classification for Motion Game Based on EEG Sensing Classification for Motion Game Based on EEG Sensing Ran WEI 1,3,4, Xing-Hua ZHANG 1,4, Xin DANG 2,3,4,a and Guo-Hui LI 3 1 School of Electronics and Information Engineering, Tianjin Polytechnic University,

More information

Reinforcement Learning Agent for Scrolling Shooter Game

Reinforcement Learning Agent for Scrolling Shooter Game Reinforcement Learning Agent for Scrolling Shooter Game Peng Yuan (pengy@stanford.edu) Yangxin Zhong (yangxin@stanford.edu) Zibo Gong (zibo@stanford.edu) 1 Introduction and Task Definition 1.1 Game Agent

More information

Diet Networks: Thin Parameters for Fat Genomics

Diet Networks: Thin Parameters for Fat Genomics Institut des algorithmes d apprentissage de Montréal Diet Networks: Thin Parameters for Fat Genomics Adriana Romero, Pierre Luc Carrier, Akram Erraqabi, Tristan Sylvain, Alex Auvolat, Etienne Dejoie, Marc-André

More information

FSI Machine Vision Training Programs

FSI Machine Vision Training Programs FSI Machine Vision Training Programs Table of Contents Introduction to Machine Vision (Course # MVC-101) Machine Vision and NeuroCheck overview (Seminar # MVC-102) Machine Vision, EyeVision and EyeSpector

More information