Automatic understanding of the visual world
- Rolf Stephens
Transcription
1 Automatic understanding of the visual world
2 Machine visual perception: the artificial capacity to see and understand the visual world, from an image or a sequence of images. Tasks: object recognition, action recognition.
3 Machine visual perception applications: face detection (autofocus in cameras, surveillance). Images courtesy of Fujifilm and Ricoh.
4 Machine visual perception applications: pedestrian detection and action recognition (car safety, surveillance). Images courtesy of Volvo and the Embedded Vision Alliance.
5 Machine visual perception applications: image retrieval (searching for places or objects with a smartphone). Image courtesy of Google.
6-9 Machine visual perception applications: complete description (story) of a video. "As the headwaiter takes them to a table they pass by the piano, and the woman looks at Sam. Sam, with a conscious effort, keeps his eyes on the keyboard as they go past. The headwaiter seats Ilsa..."
10 Today's machine visual perception. Machine learning: large-scale and deep learning, learning with noisy labels. Data (images and videos): large quantity, but quality? Manual or weakly supervised annotation, synthetic data. Machine visual perception: understanding of the visual world, design of models.
11 Image and video data: manually annotated data, weakly supervised learning, synthetic data, self-supervised learning.
12 Manually annotated collections. Increase in size: images in 2007, several million images today. Increase in complexity of annotation: image labels, bounding box labels, semantic segmentation labels, action labels.
13 PASCAL VOC: images collected from Flickr with keywords; exhaustive manual annotation with 20 class labels and corresponding bounding boxes; separate training and held-out test sets. [The PASCAL Visual Object Classes (VOC) Challenge, M. Everingham, L. Van Gool, C. Williams, J. Winn, A. Zisserman, IJCV]
14 ImageNet dataset: ImageNet has 14M images from 22k classes. ImageNet Large Scale Visual Recognition Challenge: 1000 classes and 1.4M images, image labels only. [O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. Berg and L. Fei-Fei, ImageNet Large Scale Visual Recognition Challenge, IJCV, 2015]
15 COCO dataset (Common Objects in Context): 80 object classes with segmentation masks, images. [T.-Y. Lin, M. Maire, S. Belongie, L. Bourdev, R. Girshick, J. Hays, P. Perona, D. Ramanan, L. Zitnick, P. Dollar, COCO, ECCV, 2014]
16 Atomic Visual Actions (AVA) dataset: definition of atomic actions; 80 atomic actions in 65k movie clips with 197k labels, multiple labels per person. [AVA: A Video Dataset of Spatiotemporally Localized Atomic Visual Actions; Gu, Sun, Vijayanarasimhan, Pantofaru, Ross, Toderici, Li, Ricco, Sukthankar, Schmid, Malik, arXiv 17]
17 Information difficult to annotate: optical flow, a dense correspondence between pixels, is impossible to annotate precisely by hand. FlyingThings dataset [Mayer et al., CVPR 16]
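To make "dense correspondence between pixels" concrete, here is a minimal Python sketch. It assumes the flow field is stored as a nested list `flow[y][x] = (dx, dy)` in pixels; that layout and the function name `correspondences` are illustrative choices, not something the slides specify.

```python
def correspondences(flow):
    """Map each pixel (x, y) of frame 1 to its matching position in frame 2.

    Assumed layout: flow[y][x] = (dx, dy), the displacement in pixels.
    """
    matches = {}
    for y, row in enumerate(flow):
        for x, (dx, dy) in enumerate(row):
            # Dense: every single pixel gets a correspondence.
            matches[(x, y)] = (x + dx, y + dy)
    return matches


# Toy 2x2 flow field in which every pixel moves one pixel to the right.
toy_flow = [
    [(1.0, 0.0), (1.0, 0.0)],
    [(1.0, 0.0), (1.0, 0.0)],
]
matches = correspondences(toy_flow)
```

Even this toy field needs one displacement vector per pixel, which is why precise manual annotation does not scale and synthetic data, where the flow is known by construction, is attractive.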
18 Information difficult to annotate: 3D human shape is impossible to annotate precisely by hand. [F. Bogo et al., Keep it SMPL: Automatic Estimation of 3D Human Pose and Shape from a Single Image, ECCV 2016]
19 Image and video data: manually annotated data, weakly supervised learning, synthetic data, self-supervised learning.
20 Visual models from weakly supervised data. Massive and ever-growing amount of digital image and video content: Flickr and YouTube, audiovisual archives (BBC, INA), personal collections. Comes with metadata: text, audio, user click data.
21 Visual models from weakly supervised data: large-scale and weakly supervised learning of visual models for object detection and action recognition.
22 Joint learning of actors and actions. Rick? Rick? Walks? Walks? "Rick walks up behind Ilsa" [Bojanowski et al., ICCV 2013]
23 Joint learning of actors and actions. Rick, Walks. "Rick walks up behind Ilsa" [Bojanowski et al., ICCV 2013]
25 Weakly supervised learning of actions [Weinzaepfel et al., arXiv 17]
26 Extraction of human tubes: state-of-the-art Faster R-CNN detector; large annotated dataset of humans; human-specific tracking-by-detection approach. DALY: 95% at
28 Weakly supervised learning of relations. Input: object detections + image labels [Peyre et al., ICCV 17]
29-31 Weakly supervised learning of relations. Output: learnt spatial relations [Peyre et al., ICCV 17]
32 Image and video data: manually annotated data, weakly supervised learning, synthetic data, self-supervised learning.
33 Learning optical flow: FlyingThings dataset [Mayer et al., CVPR 16]; FlowNet 2.0 [Ilg et al., CVPR 17]
34 Visual models from synthetic data [Learning from Synthetic Humans, Varol, Romero, Martin, Mahmood, Black, Laptev, Schmid, CVPR 17]
35 SURREAL dataset (Synthetic humans for REAL tasks): a body with a random 3D shape and a 3D pose from MoCap data; the 2D image is rendered with a random camera, random lighting, a random cloth texture and a random static scene image. Output: RGB image, 2D/3D pose, optical flow, depth image, segmentation map for body parts.
36 CAESAR dataset for human body shapes; LSUN dataset for static background images; CAESAR dataset and another collection of 3D scans for body textures (clothes); CMU dataset for MoCap sequences (marker data).
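The random choices listed for one SURREAL frame can be sketched as a sampling step. This is only an illustration under stated assumptions: the function name, the dictionary keys, the 10-dimensional shape vector, the yaw-only camera and the placeholder asset names are all hypothetical, and the actual rendering step is omitted.

```python
import random


def sample_render_config(rng, mocap_clips, textures, backgrounds):
    """Draw one random scene configuration (hypothetical parameterisation)."""
    return {
        # Random 3D body shape; the 10 coefficients are an assumption.
        "shape": [rng.gauss(0.0, 1.0) for _ in range(10)],
        # 3D pose taken from a MoCap sequence.
        "pose_clip": rng.choice(mocap_clips),
        # Random camera, reduced here to a single yaw angle in degrees.
        "camera_yaw_deg": rng.uniform(0.0, 360.0),
        # Random lighting, reduced here to one direction vector.
        "light_direction": [rng.uniform(-1.0, 1.0) for _ in range(3)],
        # Random cloth texture and random static background image.
        "cloth_texture": rng.choice(textures),
        "background": rng.choice(backgrounds),
    }


# Ground truth that comes for free with each rendered frame, per the slide.
OUTPUTS = ("rgb", "pose_2d", "pose_3d", "optical_flow", "depth", "part_segmentation")

config = sample_render_config(
    random.Random(0),
    mocap_clips=["walk_01", "run_02"],         # placeholder names
    textures=["shirt_a.png", "shirt_b.png"],   # placeholder names
    backgrounds=["room_1.jpg", "room_2.jpg"],  # placeholder names
)
```

Because every factor is sampled by the program, all the labels in `OUTPUTS` are known exactly for each frame without any manual annotation.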
38 Approach for body part segmentation: stacked hourglass network [Newell et al., 2016]. Two outputs: 2D pose (MSE for regressing heatmaps) and segmentation (softmax error for classifying pixels as one of the parts). Example part labels: head, left arm, right foot, torso, background.
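The two error terms named on this slide can be written out in a few lines of plain Python. This is a sketch, not the authors' training code: it assumes "softmax error" means the usual softmax cross-entropy averaged over pixels, and the function names are illustrative.

```python
import math


def mse_loss(pred, target):
    """Mean squared error between predicted and target heatmap values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def softmax_cross_entropy(logits, label):
    """Cross-entropy for one pixel; `logits` holds one score per body part."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def segmentation_loss(pixel_logits, pixel_labels):
    """Average the per-pixel cross-entropy over all pixels."""
    losses = [softmax_cross_entropy(l, y) for l, y in zip(pixel_logits, pixel_labels)]
    return sum(losses) / len(losses)


# Toy check: two heatmap values, and two pixels with two classes each.
pose_error = mse_loss([0.0, 1.0], [0.0, 0.0])
seg_error = segmentation_loss([[0.0, 0.0], [0.0, 0.0]], [0, 1])
```

With equal scores for both classes, each pixel contributes log 2 to the segmentation error, which is a quick sanity check for the implementation.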
39 Experimental results Evaluation on Freiburg Sitting People Dataset
40 Results on YouTubePose dataset
41 Image and video data: manually annotated data, weakly supervised learning, synthetic data, self-supervised learning.
42 Self-supervised learning from video: regularities in the video data are used for learning. [I. Misra et al., Shuffle and Learn: Unsupervised Learning using Temporal Order Verification, ECCV 16]
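Temporal order verification gets its labels from the video itself: a tuple of frames in the right order is a positive example, a shuffled tuple is a negative one. The sketch below shows only that labelling idea with frame indices; the three-frame tuple, the uniform sampling and the function name are simplifying assumptions, not the sampling scheme of the cited paper.

```python
import random


def make_order_verification_pair(num_frames, rng):
    """Build one positive and one negative training example from frame indices."""
    a, b, c = sorted(rng.sample(range(num_frames), 3))
    positive = ((a, b, c), 1)  # frames in temporal order
    negative = ((b, a, c), 0)  # same frames, first two swapped: order broken
    return positive, negative


positive, negative = make_order_verification_pair(30, random.Random(0))
```

No human annotation is involved: the label is 1 or 0 purely because of how the tuple was constructed, which is what makes the task self-supervised.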
43 Automatic video object segmentation [Learning Video Object Segmentation with Visual Memory, Tokmakov et al., ICCV 17]
44 Conclusion: recent large-scale data collection, key to next-generation systems; importance of moving away from fully supervised approaches; cross-modal learning from vision, language and robotics.
45 Thank you! Follow us on
Colorful Image Colorizations Supplementary Material
Colorful Image Colorizations Supplementary Material Richard Zhang, Phillip Isola, Alexei A. Efros {rich.zhang, isola, efros}@eecs.berkeley.edu University of California, Berkeley 1 Overview This document
More informationImproving a real-time object detector with compact temporal information
Improving a real-time object detector with compact temporal information Martin Ahrnbom Lund University martin.ahrnbom@math.lth.se Morten Bornø Jensen Aalborg University mboj@create.aau.dk Håkan Ardö Lund
More informationVideo Object Segmentation with Re-identification
Video Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen Change Loy, Xiaoou Tang The Chinese University of Hong Kong, SenseTime
More informationDetection and Segmentation. Fei-Fei Li & Justin Johnson & Serena Yeung. Lecture 11 -
Lecture 11: Detection and Segmentation Lecture 11-1 May 10, 2017 Administrative Midterms being graded Please don t discuss midterms until next week - some students not yet taken A2 being graded Project
More informationMultispectral Pedestrian Detection using Deep Fusion Convolutional Neural Networks
Multispectral Pedestrian Detection using Deep Fusion Convolutional Neural Networks Jo rg Wagner1,2, Volker Fischer1, Michael Herman1 and Sven Behnke2 1- Robert Bosch GmbH - 70442 Stuttgart - Germany 2-
More informationLecture 23 Deep Learning: Segmentation
Lecture 23 Deep Learning: Segmentation COS 429: Computer Vision Thanks: most of these slides shamelessly adapted from Stanford CS231n: Convolutional Neural Networks for Visual Recognition Fei-Fei Li, Andrej
More informationarxiv: v1 [cs.cv] 15 Apr 2016
High-performance Semantic Segmentation Using Very Deep Fully Convolutional Networks arxiv:1604.04339v1 [cs.cv] 15 Apr 2016 Zifeng Wu, Chunhua Shen, Anton van den Hengel The University of Adelaide, SA 5005,
More informationDYNAMIC CONVOLUTIONAL NEURAL NETWORK FOR IMAGE SUPER- RESOLUTION
Journal of Advanced College of Engineering and Management, Vol. 3, 2017 DYNAMIC CONVOLUTIONAL NEURAL NETWORK FOR IMAGE SUPER- RESOLUTION Anil Bhujel 1, Dibakar Raj Pant 2 1 Ministry of Information and
More informationtsushi Sasaki Fig. Flow diagram of panel structure recognition by specifying peripheral regions of each component in rectangles, and 3 types of detect
RECOGNITION OF NEL STRUCTURE IN COMIC IMGES USING FSTER R-CNN Hideaki Yanagisawa Hiroshi Watanabe Graduate School of Fundamental Science and Engineering, Waseda University BSTRCT For efficient e-comics
More informationarxiv: v1 [cs.cv] 27 Nov 2016
Real-Time Video Highlights for Yahoo Esports arxiv:1611.08780v1 [cs.cv] 27 Nov 2016 Yale Song Yahoo Research New York, USA yalesong@yahoo-inc.com Abstract Esports has gained global popularity in recent
More informationA Fast Method for Estimating Transient Scene Attributes
A Fast Method for Estimating Transient Scene Attributes Ryan Baltenberger, Menghua Zhai, Connor Greenwell, Scott Workman, Nathan Jacobs Department of Computer Science, University of Kentucky {rbalten,
More informationRecognition problems. Object Recognition. Readings. What is recognition?
Recognition problems Object Recognition Computer Vision CSE576, Spring 2008 Richard Szeliski What is it? Object and scene recognition Who is it? Identity recognition Where is it? Object detection What
More informationTRANSFORMING PHOTOS TO COMICS USING CONVOLUTIONAL NEURAL NETWORKS. Tsinghua University, China Cardiff University, UK
TRANSFORMING PHOTOS TO COMICS USING CONVOUTIONA NEURA NETWORKS Yang Chen Yu-Kun ai Yong-Jin iu Tsinghua University, China Cardiff University, UK ABSTRACT In this paper, inspired by Gatys s recent work,
More informationSECURITY EVENT RECOGNITION FOR VISUAL SURVEILLANCE
ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume IV-/W, 27 ISPRS Hannover Workshop: HRIGI 7 CMRT 7 ISA 7 EuroCOW 7, 6 9 June 27, Hannover, Germany SECURITY EVENT
More informationFinding people in repeated shots of the same scene
Finding people in repeated shots of the same scene Josef Sivic C. Lawrence Zitnick Richard Szeliski University of Oxford Microsoft Research Abstract The goal of this work is to find all occurrences of
More informationGESTURE RECOGNITION FOR ROBOTIC CONTROL USING DEEP LEARNING
2017 NDIA GROUND VEHICLE SYSTEMS ENGINEERING AND TECHNOLOGY SYMPOSIUM AUTONOMOUS GROUND SYSTEMS (AGS) TECHNICAL SESSION AUGUST 8-10, 2017 - NOVI, MICHIGAN GESTURE RECOGNITION FOR ROBOTIC CONTROL USING
More informationA Fuller Understanding of Fully Convolutional Networks. Evan Shelhamer* Jonathan Long* Trevor Darrell UC Berkeley in CVPR'15, PAMI'16
A Fuller Understanding of Fully Convolutional Networks Evan Shelhamer* Jonathan Long* Trevor Darrell UC Berkeley in CVPR'15, PAMI'16 1 pixels in, pixels out colorization Zhang et al.2016 monocular depth
More informationTaking Great Pictures (Automatically)
Taking Great Pictures (Automatically) Computational Photography (15-463/862) Yan Ke 11/27/2007 Anyone can take great pictures if you can recognize the good ones. Photo by Chang-er @ Flickr F8 and Be There
More informationToday. CS 395T Visual Recognition. Course content. Administration. Expectations. Paper reviews
Today CS 395T Visual Recognition Course logistics Overview Volunteers, prep for next week Thursday, January 18 Administration Class: Tues / Thurs 12:30-2 PM Instructor: Kristen Grauman grauman at cs.utexas.edu
More informationarxiv: v1 [cs.cv] 9 Nov 2015 Abstract
Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding Alex Kendall Vijay Badrinarayanan University of Cambridge agk34, vb292, rc10001 @cam.ac.uk
More informationDriving Using End-to-End Deep Learning
Driving Using End-to-End Deep Learning Farzain Majeed farza@knights.ucf.edu Kishan Athrey kishan.athrey@knights.ucf.edu Dr. Mubarak Shah shah@crcv.ucf.edu Abstract This work explores the problem of autonomously
More informationConvolutional Networks for Image Segmentation: U-Net 1, DeconvNet 2, and SegNet 3
Convolutional Networks for Image Segmentation: U-Net 1, DeconvNet 2, and SegNet 3 1 Olaf Ronneberger, Philipp Fischer, Thomas Brox (Freiburg, Germany) 2 Hyeonwoo Noh, Seunghoon Hong, Bohyung Han (POSTECH,
More informationCompositing-aware Image Search
Compositing-aware Image Search Hengshuang Zhao 1, Xiaohui Shen 2, Zhe Lin 3, Kalyan Sunkavalli 3, Brian Price 3, Jiaya Jia 1,4 1 The Chinese University of Hong Kong, 2 ByteDance AI Lab, 3 Adobe Research,
More informationBook Cover Recognition Project
Book Cover Recognition Project Carolina Galleguillos Department of Computer Science University of California San Diego La Jolla, CA 92093-0404 cgallegu@cs.ucsd.edu Abstract The purpose of this project
More informationarxiv: v1 [cs.cv] 22 Oct 2017
Deep Cropping via Attention Box Prediction and Aesthetics Assessment Wenguan Wang, and Jianbing Shen Beijing Lab of Intelligent Information Technology, School of Computer Science, Beijing Institute of
More informationLecture 7: Scene Text Detection and Recognition. Dr. Cong Yao Megvii (Face++) Researcher
Lecture 7: Scene Text Detection and Recognition Dr. Cong Yao Megvii (Face++) Researcher yaocong@megvii.com Outline Background and Introduction Conventional Methods Deep Learning Methods Datasets and Competitions
More informationCan you tell a face from a HEVC bitstream?
Can you tell a face from a HEVC bitstream? Saeed Ranjbar Alvar, Hyomin Choi and Ivan V. Bajić School of Engineering Science, Simon Fraser University, Burnaby, BC, Canada Email: {saeedr,chyomin, ibajic}@sfu.ca
More informationConvolu'onal Neural Networks. November 17, 2015
Convolu'onal Neural Networks November 17, 2015 Ar'ficial Neural Networks Feedforward neural networks Ar'ficial Neural Networks Feedforward, fully-connected neural networks Ar'ficial Neural Networks Feedforward,
More informationarxiv: v1 [cs.cv] 25 Sep 2018
Satellite Imagery Multiscale Rapid Detection with Windowed Networks Adam Van Etten In-Q-Tel CosmiQ Works avanetten@iqt.org arxiv:1809.09978v1 [cs.cv] 25 Sep 2018 Abstract Detecting small objects over large
More informationLecture 1 Introduction to Computer Vision. Lin ZHANG, PhD School of Software Engineering, Tongji University Spring 2018
Lecture 1 Introduction to Computer Vision Lin ZHANG, PhD School of Software Engineering, Tongji University Spring 2018 Course Info Contact Information Room 408L, Jishi Building Email: cslinzhang@tongji.edu.cn
More informationAsking for Help with the Right Question by Predicting Human Visual Performance
Asking for Help with the Right Question by Predicting Human Visual Performance Hong Cai and Yasamin Mostofi Dept. of Electrical and Computer Engineering, University of California Santa Barbara {hcai, ymostofi}@ece.ucsb.edu
More informationContinuous Gesture Recognition Fact Sheet
Continuous Gesture Recognition Fact Sheet August 17, 2016 1 Team details Team name: ICT NHCI Team leader name: Xiujuan Chai Team leader address, phone number and email Address: No.6 Kexueyuan South Road
More informationRecognition: Overview. Sanja Fidler CSC420: Intro to Image Understanding 1/ 83
Recognition: Overview Sanja Fidler CSC420: Intro to Image Understanding 1/ 83 Textbook This book has a lot of material: K. Grauman and B. Leibe Visual Object Recognition Synthesis Lectures On Computer
More informationLecture 1 Introduction to Computer Vision. Lin ZHANG, PhD School of Software Engineering, Tongji University Spring 2015
Lecture 1 Introduction to Computer Vision Lin ZHANG, PhD School of Software Engineering, Tongji University Spring 2015 Course Info Contact Information Room 314, Jishi Building Email: cslinzhang@tongji.edu.cn
More informationArtificial Intelligence Machine learning and Deep Learning: Trends and Tools. Dr. Shaona
Artificial Intelligence Machine learning and Deep Learning: Trends and Tools Dr. Shaona Ghosh @shaonaghosh What is Machine Learning? Computer algorithms that learn patterns in data automatically from large
More informationSemantic Localization of Indoor Places. Lukas Kuster
Semantic Localization of Indoor Places Lukas Kuster Motivation GPS for localization [7] 2 Motivation Indoor navigation [8] 3 Motivation Crowd sensing [9] 4 Motivation Targeted Advertisement [10] 5 Motivation
More informationFirst Person Action Recognition Using Deep Learned Descriptors
First Person Action Recognition Using Deep Learned Descriptors Suriya Singh 1 Chetan Arora 2 C. V. Jawahar 1 1 IIIT Hyderabad, India 2 IIIT Delhi, India Abstract We focus on the problem of wearer s action
More informationPerception. Read: AIMA Chapter 24 & Chapter HW#8 due today. Vision
11-25-2013 Perception Vision Read: AIMA Chapter 24 & Chapter 25.3 HW#8 due today visual aural haptic & tactile vestibular (balance: equilibrium, acceleration, and orientation wrt gravity) olfactory taste
More informationFace detection, face alignment, and face image parsing
Lecture overview Face detection, face alignment, and face image parsing Brandon M. Smith Guest Lecturer, CS 534 Monday, October 21, 2013 Brief introduction to local features Face detection Face alignment
More informationSpring 2018 CS543 / ECE549 Computer Vision. Course webpage URL:
Spring 2018 CS543 / ECE549 Computer Vision Course webpage URL: http://slazebni.cs.illinois.edu/spring18/ The goal of computer vision To extract meaning from pixels What we see What a computer sees Source:
More informationCROSS-LAYER FEATURES IN CONVOLUTIONAL NEURAL NETWORKS FOR GENERIC CLASSIFICATION TASKS. Kuan-Chuan Peng and Tsuhan Chen
CROSS-LAYER FEATURES IN CONVOLUTIONAL NEURAL NETWORKS FOR GENERIC CLASSIFICATION TASKS Kuan-Chuan Peng and Tsuhan Chen Cornell University School of Electrical and Computer Engineering Ithaca, NY 14850
More informationResearch Statement James Hays
James Hays 1/5 Research Statement James Hays (jhhays@cs.cmu.edu) Abstract: My research interests span computer graphics, computer vision, and the emerging field of computational photography. My current
More informationLearning Pixel-Distribution Prior with Wider Convolution for Image Denoising
Learning Pixel-Distribution Prior with Wider Convolution for Image Denoising Peng Liu University of Florida pliu1@ufl.edu Ruogu Fang University of Florida ruogu.fang@bme.ufl.edu arxiv:177.9135v1 [cs.cv]
More informationMSR Asia MSM at ActivityNet Challenge 2017: Trimmed Action Recognition, Temporal Action Proposals and Dense-Captioning Events in Videos
MSR Asia MSM at ActivityNet Challenge 2017: Trimmed Action Recognition, Temporal Action Proposals and Dense-Captioning Events in Videos Ting Yao, Yehao Li, Zhaofan Qiu, Fuchen Long, Yingwei Pan, Dong Li,
More informationMobile Cognitive Indoor Assistive Navigation for the Visually Impaired
1 Mobile Cognitive Indoor Assistive Navigation for the Visually Impaired Bing Li 1, Manjekar Budhai 2, Bowen Xiao 3, Liang Yang 1, Jizhong Xiao 1 1 Department of Electrical Engineering, The City College,
More informationDomain Adaptation & Transfer: All You Need to Use Simulation for Real
Domain Adaptation & Transfer: All You Need to Use Simulation for Real Boqing Gong Tecent AI Lab Department of Computer Science An intelligent robot Semantic segmentation of urban scenes Assign each pixel
More informationCascaded Feature Network for Semantic Segmentation of RGB-D Images
Cascaded Feature Network for Semantic Segmentation of RGB-D Images Di Lin1 Guangyong Chen2 Daniel Cohen-Or1,3 Pheng-Ann Heng2,4 Hui Huang1,4 1 Shenzhen University 2 The Chinese University of Hong Kong
More informationMS-Celeb-1M: Challenge of Recognizing One Million Celebrities in the Real World
MS-Celeb-1M: Challenge of Recognizing One Million Celebrities in the Real World Yandong Guo, Lei Zhang, Yuxiao Hu, Xiaodong He, and Jianfeng Gao Microsoft; Redmond, WA 98052 Abstract Face recognition,
More informationDeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition. ECE 289G: Paper Presentation #3 Philipp Gysel
DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition ECE 289G: Paper Presentation #3 Philipp Gysel Autonomous Car ECE 289G Paper Presentation, Philipp Gysel Slide 2 Source: maps.google.com
More informationWadehra Kartik, Kathpalia Mukul, Bahl Vasudha, International Journal of Advance Research, Ideas and Innovations in Technology
ISSN: 2454-132X Impact factor: 4.295 (Volume 4, Issue 1) Available online at www.ijariit.com Hand Detection and Gesture Recognition in Real-Time Using Haar-Classification and Convolutional Neural Networks
More informationarxiv: v1 [cs.cv] 19 Apr 2018
Survey of Face Detection on Low-quality Images arxiv:1804.07362v1 [cs.cv] 19 Apr 2018 Yuqian Zhou, Ding Liu, Thomas Huang Beckmann Institute, University of Illinois at Urbana-Champaign, USA {yuqian2, dingliu2}@illinois.edu
More informationVirtual Worlds for the Perception and Control of Self-Driving Vehicles
Virtual Worlds for the Perception and Control of Self-Driving Vehicles Dr. Antonio M. López antonio@cvc.uab.es Index Context SYNTHIA: CVPR 16 SYNTHIA: Reloaded SYNTHIA: Evolutions CARLA Conclusions Index
More informationHand Gesture Recognition by Means of Region- Based Convolutional Neural Networks
Contemporary Engineering Sciences, Vol. 10, 2017, no. 27, 1329-1342 HIKARI Ltd, www.m-hikari.com https://doi.org/10.12988/ces.2017.710154 Hand Gesture Recognition by Means of Region- Based Convolutional
More informationLecture 1 Introduction to Computer Vision. Lin ZHANG, PhD School of Software Engineering, Tongji University Spring 2014
Lecture 1 Introduction to Computer Vision Lin ZHANG, PhD School of Software Engineering, Tongji University Spring 2014 Course Info Contact Information Room 314, Jishi Building Email: cslinzhang@tongji.edu.cn
More informationSurgeon Technical Skill Assessment using Computer Vision based Analysis
Proceedings of Machine Learning for Healthcare 2017 JMLR W&C Track Volume 68 Surgeon Technical Skill Assessment using Computer Vision based Analysis Hei Law Computer Science and Engineering University
More informationComparison of Head Movement Recognition Algorithms in Immersive Virtual Reality Using Educative Mobile Application
Comparison of Head Recognition Algorithms in Immersive Virtual Reality Using Educative Mobile Application Nehemia Sugianto 1 and Elizabeth Irenne Yuwono 2 Ciputra University, Indonesia 1 nsugianto@ciputra.ac.id
More informationHyperspectral Image Denoising using Superpixels of Mean Band
Hyperspectral Image Denoising using Superpixels of Mean Band Letícia Cordeiro Stanford University lrsc@stanford.edu Abstract Denoising is an essential step in the hyperspectral image analysis process.
More informationDigital image processing vs. computer vision Higher-level anchoring
Digital image processing vs. computer vision Higher-level anchoring Václav Hlaváč Czech Technical University in Prague Faculty of Electrical Engineering, Department of Cybernetics Center for Machine Perception
More informationGoing Deeper into First-Person Activity Recognition
Going Deeper into First-Person Activity Recognition Minghuang Ma, Haoqi Fan and Kris M. Kitani Carnegie Mellon University Pittsburgh, PA 15213, USA minghuam@andrew.cmu.edu haoqif@andrew.cmu.edu kkitani@cs.cmu.edu
More informationUnderstanding Convolution for Semantic Segmentation
Understanding Convolution for Semantic Segmentation Panqu Wang 1, Pengfei Chen 1, Ye Yuan 2, Ding Liu 3, Zehua Huang 1, Xiaodi Hou 1, Garrison Cottrell 4 1 TuSimple, 2 Carnegie Mellon University, 3 University
More information1 st Keypoints Challenge. ImageNet and COCO Visual Recognition Challenges Workshop. Yin Cui, Tsung-Yi Lin, Matteo Ruggero Ronchi, Genevieve Patterson
1 st Keypoints Challenge Yin Cui, Tsung-Yi Lin, Matteo Ruggero Ronchi, Genevieve Patterson ImageNet and COCO Visual Recognition Challenges Workshop Sunday, October 9th, ECCV 2016 Dataset Dataset Statistics
More informationMulti-task Learning of Dish Detection and Calorie Estimation
Multi-task Learning of Dish Detection and Calorie Estimation Department of Informatics, The University of Electro-Communications, Tokyo 1-5-1 Chofugaoka, Chofu-shi, Tokyo 182-8585 JAPAN ABSTRACT In recent
More informationDEFOCUS BLUR PARAMETER ESTIMATION TECHNIQUE
International Journal of Electronics and Communication Engineering and Technology (IJECET) Volume 7, Issue 4, July-August 2016, pp. 85 90, Article ID: IJECET_07_04_010 Available online at http://www.iaeme.com/ijecet/issues.asp?jtype=ijecet&vtype=7&itype=4
More informationGesture Recognition with Real World Environment using Kinect: A Review
Gesture Recognition with Real World Environment using Kinect: A Review Prakash S. Sawai 1, Prof. V. K. Shandilya 2 P.G. Student, Department of Computer Science & Engineering, Sipna COET, Amravati, Maharashtra,
More informationTracking transmission of details in paintings
Tracking transmission of details in paintings Benoit Seguin benoit.seguin@epfl.ch Isabella di Lenardo isabella.dilenardo@epfl.ch Frédéric Kaplan frederic.kaplan@epfl.ch Introduction In previous articles
More informationUnderstanding Convolution for Semantic Segmentation
Understanding Convolution for Semantic Segmentation Panqu Wang 1, Pengfei Chen 1, Ye Yuan 2, Ding Liu 3, Zehua Huang 1, Xiaodi Hou 1, Garrison Cottrell 4 1 TuSimple, 2 Carnegie Mellon University, 3 University
More informationTeaching icub to recognize. objects. Giulia Pasquale. PhD student
Teaching icub to recognize RobotCub Consortium. All rights reservted. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/. objects
More informationWhat Makes a Great Picture?
What Makes a Great Picture? Based on slides from 15-463: Computational Photography Alexei Efros, CMU, Spring 2010 With many slides from Yan Ke, as annotated by Tamara Berg National Geographic Video Below
More informationMARCO PEDERSOLI. Assistant Professor at ETS Montreal profs.etsmtl.ca/mpedersoli
MARCO PEDERSOLI Assistant Professor at ETS Montreal profs.etsmtl.ca/mpedersoli RESEARCH INTERESTS Visual Recognition, Efficient Deep Learning, Learning with Reduced Supervision, Data Exploration ACADEMIC
More informationFace Detection: A Literature Review
Face Detection: A Literature Review Dr.Vipulsangram.K.Kadam 1, Deepali G. Ganakwar 2 Professor, Department of Electronics Engineering, P.E.S. College of Engineering, Nagsenvana Aurangabad, Maharashtra,
More informationarxiv: v2 [cs.cv] 2 Feb 2018
Road Damage Detection Using Deep Neural Networks with Images Captured Through a Smartphone Hiroya Maeda, Yoshihide Sekimoto, Toshikazu Seto, Takehiro Kashiyama, Hiroshi Omata University of Tokyo, 4-6-1
More informationCorrelating Filter Diversity with Convolutional Neural Network Accuracy
Correlating Filter Diversity with Convolutional Neural Network Accuracy Casey A. Graff School of Computer Science and Engineering University of California San Diego La Jolla, CA 92023 Email: cagraff@ucsd.edu
More informationAnalysis and retrieval of events/actions and workflows in video streams
Multimed Tools Appl (2010) 50:1 6 DOI 10.1007/s11042-010-0514-2 GUEST EDITORIAL Analysis and retrieval of events/actions and workflows in video streams Anastasios D. Doulamis & Luc van Gool & Mark Nixon
More informationValue-added Applications with Deep Learning. src:
SMART TOURISM Value-added Applications with Deep Learning src: https://www.wttc.org/-/media/files/reports/economic-impact-research/countries-2017/thailand2017.pdf Somnuk Phon-Amnuaisuk, Minh-Son Dao, CIE,
More informationWhat Makes a Great Picture?
What Makes a Great Picture? Robert Doisneau, 1955 With many slides from Yan Ke, as annotated by Tamara Berg 15-463: Computational Photography Alexei Efros, CMU, Fall 2008 Photography 101 Composition Framing
More informationWhat Is And How Will Machine Learning Change Our Lives. Fair Use Agreement
What Is And How Will Machine Learning Change Our Lives Raymond Ptucha, Rochester Institute of Technology 2018 Engineering Symposium April 24, 2018, 9:45am Ptucha 18 1 Fair Use Agreement This agreement
More informationNatalia Vassilieva HP Labs Russia
Content Based Image Retrieval Natalia Vassilieva nvassilieva@hp.com HP Labs Russia 2008 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice Tutorial
More informationEn ny æra for uthenting av informasjon fra satellittbilder ved hjelp av maskinlæring
En ny æra for uthenting av informasjon fra satellittbilder ved hjelp av maskinlæring Mathilde Ørstavik og Terje Midtbø Mathilde Ørstavik and Terje Midtbø, A New Era for Feature Extraction in Remotely Sensed
More informationarxiv: v2 [cs.cv] 25 Jul 2018
arxiv:1711.08496v2 [cs.cv] 25 Jul 2018 Temporal Relational Reasoning in Videos Bolei Zhou, Alex Andonian, Aude Oliva, Antonio Torralba MIT CSAIL {bzhou,aandonia,oliva,torralba}@csail.mit.edu Abstract.
More informationINTAIRACT: Joint Hand Gesture and Fingertip Classification for Touchless Interaction
INTAIRACT: Joint Hand Gesture and Fingertip Classification for Touchless Interaction Xavier Suau 1,MarcelAlcoverro 2, Adolfo Lopez-Mendez 3, Javier Ruiz-Hidalgo 2,andJosepCasas 3 1 Universitat Politécnica
More informationMachine Learning for Intelligent Transportation Systems
Machine Learning for Intelligent Transportation Systems Patrick Emami (CISE), Anand Rangarajan (CISE), Sanjay Ranka (CISE), Lily Elefteriadou (CE) MALT Lab, UFTI September 6, 2018 ITS - A Broad Perspective
More informationKrishnaCam: Using a Longitudinal, Single-Person, Egocentric Dataset for Scene Understanding Tasks
KrishnaCam: Using a Longitudinal, Single-Person, Egocentric Dataset for Scene Understanding Tasks Krishna Kumar Singh 1,3 Kayvon Fatahalian 1 Alexei A. Efros 2 1 Carnegie Mellon University 2 UC Berkeley
More informationIntroduction. BIL719 Computer Vision Pinar Duygulu Hacettepe University
Introduction BIL719 Computer Vision Pinar Duygulu Hacettepe University Basic Info Textbooks (suggested): Forsyth & Ponce, Computer Vision: A Modern Approach Richard Szeliski, Computer Vision: Algorithms
More informationarxiv: v1 [cs.cv] 28 Nov 2017 Abstract
Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks Zhaofan Qiu, Ting Yao, and Tao Mei University of Science and Technology of China, Hefei, China Microsoft Research, Beijing, China
More informationOBJECTIVE OF THE BOOK ORGANIZATION OF THE BOOK
xv Preface Advancement in technology leads to wide spread use of mounting cameras to capture video imagery. Such surveillance cameras are predominant in commercial institutions through recording the cameras
More informationDeep Learning. Dr. Johan Hagelbäck.
Deep Learning Dr. Johan Hagelbäck johan.hagelback@lnu.se http://aiguy.org Image Classification Image classification can be a difficult task Some of the challenges we have to face are: Viewpoint variation:
Improving Robustness of Semantic Segmentation Models with Style Normalization
Improving Robustness of Semantic Segmentation Models with Style Normalization Evani Radiya-Dixit Department of Computer Science Stanford University evanir@stanford.edu Andrew Tierno Department of Computer
CS 131 Lecture 1: Course introduction
CS 131 Lecture 1: Course introduction Olivier Moindrot Department of Computer Science Stanford University Stanford, CA 94305 olivierm@stanford.edu 1 What is computer vision? 1.1 Definition Two definitions
Learning Rich Features for Image Manipulation Detection
Learning Rich Features for Image Manipulation Detection Peng Zhou Xintong Han Vlad I. Morariu Larry S. Davis University of Maryland, College Park Adobe Research pengzhou@umd.edu {xintong,lsd}@umiacs.umd.edu
Liangliang Cao *, Jiebo Luo +, Thomas S. Huang *
Annotating Photo Collections by Label Propagation Liangliang Cao *, Jiebo Luo +, Thomas S. Huang * + Kodak Research Laboratories *University of Illinois at Urbana-Champaign (UIUC) ACM Multimedia 2008
Fully Convolutional Networks for Semantic Segmentation
Fully Convolutional Networks for Semantic Segmentation Jonathan Long* Evan Shelhamer* Trevor Darrell UC Berkeley Presented by: Gordon Christie 1 Overview Reinterpret standard classification convnets as
DSNet: An Efficient CNN for Road Scene Segmentation
DSNet: An Efficient CNN for Road Scene Segmentation Ping-Rong Chen 1 Hsueh-Ming Hang 1 1 National Chiao Tung University {james50120.ee05g, hmhang}@nctu.edu.tw Sheng-Wei Chan 2 Jing-Jhih Lin 2 2 Industrial
Personal Driving Diary: Constructing a Video Archive of Everyday Driving Events
Proceedings of IEEE Workshop on Applications of Computer Vision (WACV), Kona, Hawaii, January 2011 Personal Driving Diary: Constructing a Video Archive of Everyday Driving Events M. S. Ryoo, Jae-Yeong
Video Enhancement & Suspicious Object Detection In Low Quality Video Frames
IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 8, Issue 2, Ver. I (Mar.-Apr. 2018), PP 53-57 e-ISSN: 2319-4200, p-ISSN: 2319-4197 www.iosrjournals.org Video Enhancement & Suspicious
Does Haze Removal Help CNN-based Image Classification?
Does Haze Removal Help CNN-based Image Classification? Yanting Pei 1,2, Yaping Huang 1, Qi Zou 1, Yuhang Lu 2, and Song Wang 2,3 1 Beijing Key Laboratory of Traffic Data Analysis and Mining, Beijing
MoSculp: Interactive Visualization of Shape and Time
MoSculp: Interactive Visualization of Shape and Time Xiuming Zhang 1, Tali Dekel 1,2, Tianfan Xue 1,2, Andrew Owens 1,3, Qiurui He 1,2, Jiajun Wu 1, Stefanie Mueller 1, William T. Freeman 1,2 1 MIT CSAIL 2 Google
Recognition: Overview. Sanja Fidler CSC420: Intro to Image Understanding 1/ 78
Recognition: Overview Sanja Fidler CSC420: Intro to Image Understanding 1/ 78 Textbook This book has a lot of material: K. Grauman and B. Leibe Visual Object Recognition Synthesis Lectures On Computer
ON CLASSIFICATION OF DISTORTED IMAGES WITH DEEP CONVOLUTIONAL NEURAL NETWORKS. Yiren Zhou, Sibo Song, Ngai-Man Cheung
ON CLASSIFICATION OF DISTORTED IMAGES WITH DEEP CONVOLUTIONAL NEURAL NETWORKS Yiren Zhou, Sibo Song, Ngai-Man Cheung Singapore University of Technology and Design In this section, we briefly introduce
Light-Field Database Creation and Depth Estimation
Light-Field Database Creation and Depth Estimation Abhilash Sunder Raj abhisr@stanford.edu Michael Lowney mlowney@stanford.edu Raj Shah shahraj@stanford.edu Abstract Light-field imaging research has been
Training Steps Files File Type File Count Total Size L3 embedding knowledge distillation (SONYC) Google audioset (environmental)
PI: Justification for 30 TB Storage Request (1) Project space needs and file sizes This storage request is in relation to our ongoing effort in training deep learning models on large datasets in non-speech