Perception. Introduction to HRI Simmons & Nourbakhsh Spring 2015


Perception: my goals. What is the state-of-the-art boundary? Where might we be in 5-10 years?

The Perceptual Pipeline. The classical approach: a serial pipeline. Weak-link analysis: each step depends on its predecessors.

Social Perception What features do we perceive for sociality? Is social perception a serial pipeline?

1. HRI for Human Perceptual Shifting

Insect Telepresence: educational telepresence designed using formal HCI inquiry tools.

Insect Telepresence Robot. Problem: increase visitors' engagement with and appreciation of insects in a museum terrarium at CMNH. Approach: provide a scalar telepresence experience with insect-safe visual browsing; apply HCI techniques (cultural modeling, expert interviews, baseline observation) to design and evaluate the input device and system; measure engagement indirectly by time on task; partner with HCII and CMNH.

Insect Telepresence Robot Innovations: asymmetric exhibit layout, mechanical transparency, clutched gantry lever arm, FOV-relative 3-DOF joystick.

Insect Telepresence Robot

Insect Telepresence Robot Evaluation Results: average group size: 3; average age of users: 19.5 years; three age modes: 8, 10, and 35 years; average time on task across all users: 60 seconds; average time on task for a single user: 27 seconds; average time on task for user groups: 93 seconds. (Illah Nourbakhsh, CMU Robotics Institute, HRI Summer Course)

2. Vision Sensors

The CCD (Charge-Coupled Device): exotic timing circuitry required; uneven frequency response in electron wells; color separation via filters versus beam splitting; lossy data formats (NTSC and digital video). Credit: http://www.shortcourses.com/how/sensors/ccd_readout.gif

The CMOS (Complementary Metal-Oxide Semiconductor): standard chip-fabrication techniques; far lower power consumption overall (roughly 1:100); pixel/well measurement circuitry located at each pixel; real-estate problems and reduced efficiency of photon usage.

Human Vision. High-quality sensors: color depth, dynamic range, light sensitivity, etc. Massive information fusion: parallelism; context-based reasoning; active foveation and selective attention; selective sensor fusion over space, capability, and time; tuned feedback from interpretation back to early computation; elegant and gradual failure characteristics.

3. Machine Vision. Poor-performance sensors: 8/24 bits of color, little dynamic range, inaccuracy and warp, inconstant properties. Narrow, shallow, fragile serial information processing: context typically encoded as assumptions that are easily violated; little sensor fusion across sensor types; few sensor feedback loops across levels of interpretation; very little temporal filtering and interpretation.

Origins: Shakey

Origins: The Stanford Cart

Passive versus Active Tradeoff. The passive/active design question: sufficiency of natural contrast; interference between multiple robots; whether the system works in the dark; whether the system works in bright sunlight.

Visual Ranging for Social Interaction: totally safe obstacle detection; human-body spatial interaction; arm and gesture recognition; engagement with human-designed environments.

Vision-based Rangefinding. Imaging chips collapse a 3D world onto a 2D plane. Range inference from world knowledge / logical reasoning; from camera parameters; from disparity / matching.

Depth from Defocus. Thin-lens equation: 1/f = 1/d + 1/e, where f is the focal length, d the distance to the object, and e the distance from lens to image plane.

Depth from Defocus. A pinhole camera produces no blurring. Blur-circle sensitivity is inversely proportional to distance. To calculate distance we must know the focused image.
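
A minimal numeric sketch of the thin-lens relation above, assuming nothing beyond the equation itself; the function name and example values are illustrative, not from the lecture:

def object_distance(f_mm, e_mm):
    """Solve 1/f = 1/d + 1/e for the object distance d.
    f_mm: focal length (mm); e_mm: lens-to-image-plane distance (mm)."""
    if e_mm <= f_mm:
        raise ValueError("image distance must exceed focal length")
    return 1.0 / (1.0 / f_mm - 1.0 / e_mm)

# Example: a 50 mm lens with the sensor 52 mm behind it is in
# sharpest focus for objects 1300 mm away.
print(object_distance(50.0, 52.0))  # 1300.0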

Depth from Defocus

Depth from Disparity

Depth from Disparity. Distance is inversely proportional to disparity; disparity is proportional to baseline. Larger baselines improve depth resolution but shrink the shared field of view and make matching harder, a tradeoff across range.
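
Both proportionalities come from the standard stereo relation Z = f * B / d (depth = focal length times baseline over disparity); a minimal sketch with illustrative variable names and values:

def depth_from_disparity(f_px, baseline_m, disparity_px):
    """f_px: focal length in pixels; baseline_m: camera separation in meters;
    disparity_px: horizontal pixel shift of a matched feature."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for finite depth")
    return f_px * baseline_m / disparity_px

# Example: 700 px focal length, 12 cm baseline, 14 px disparity -> 6 m.
print(depth_from_disparity(700.0, 0.12, 14.0))  # 6.0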

The Feature Challenge. Features must: provide sufficient density; match across small viewpoint changes; match across partial occlusions; identify confidence. Features must not: trigger false-positive matches; prove too sparse for the robot's task; require on-line human tuning.

Example: ZLoG, zero crossings of the Laplacian of Gaussian. Laplacian: second-derivative convolution. Gaussian: smoothing convolution. Zero crossings: a sharp feature for interpolation.
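
A minimal ZLoG sketch, assuming NumPy and SciPy are available; scipy.ndimage.gaussian_laplace performs the smoothing and second-derivative convolutions in one call, and zero crossings are read off as sign changes between neighboring pixels:

import numpy as np
from scipy.ndimage import gaussian_laplace

def zlog_edges(image, sigma=2.0):
    """Boolean mask of zero crossings of the Laplacian of Gaussian."""
    log = gaussian_laplace(image.astype(float), sigma=sigma)
    zc = np.zeros(log.shape, dtype=bool)
    # A zero crossing lies wherever the LoG changes sign between
    # vertically or horizontally adjacent pixels.
    zc[:-1, :] |= np.sign(log[:-1, :]) != np.sign(log[1:, :])
    zc[:, :-1] |= np.sign(log[:, :-1]) != np.sign(log[:, 1:])
    return zc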

Stereo: Pictorial Example

Active Rangefinding

HRI Vision: the special-case approach

Example: Cueing in Kismet. Color-based human-robot interaction; cueing, orthogonal events, child-based interaction. Challenges: constancy, illumination, human expectation.

Motivational example: RALPH

Navlab on Streets

Museum Edubot: Chips. Carnegie Museum of Natural History. Autonomy: 5 years, > 500 km navigated, auto-docking. MTBF convergence at 1 week. Proactive health-state identification.

Museum Edubot - Chips

Landmarks: Visual Fiducials

Minerva: an example of focused vision

When special-case fails

SLAM

Visual SLAM Considerations: repeatable landmark recognition; feature locale; map-making; tracking robot position.

The Future of Visual Navigation: Hans Moravec's stereo-based voxel grid.

Invariant Features. SIFT features: image contents coded so they can be found again in other images of the same scene. Invariant despite many changes: rotation, translation, camera viewpoint (scale, perspective), illumination, noise, occlusion. Image matching by comparing invariant features. Notion of interest points and keypoints.

Gaussian pyramid. Sigma is the scale (smoothing) parameter. As sigma increases there is no need to retain all pixels, so the stored image can be reduced in size: increasing sigma yields the Gaussian pyramid.
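
A minimal pyramid sketch, assuming OpenCV is available; cv2.pyrDown implements exactly the step described above, Gaussian-smoothing and then halving each image dimension:

import cv2

def gaussian_pyramid(image, levels=4):
    """Return [full, 1/2, 1/4, ...] resolution images, each pre-smoothed."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(cv2.pyrDown(pyramid[-1]))
    return pyramid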

1. Scale-space extrema detection. The Gaussian pyramid is processed one octave at a time: successive blurs are subtracted to produce difference-of-Gaussian (DoG) images.

2. Keypoint localization. Detect maxima and minima of the difference-of-Gaussian in scale space; reject points lying on edges; fit a quadratic to surrounding values for sub-pixel and sub-scale interpolation.
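
A sketch of steps 1-2 under simplifying assumptions (a single octave, no edge rejection or sub-pixel fitting), using SciPy; the sigma values are illustrative:

import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter, minimum_filter

def dog_extrema(image, sigmas=(1.0, 1.6, 2.6, 4.1)):
    """Coordinates of scale-space extrema in the middle DoG layer."""
    blurs = [gaussian_filter(image.astype(float), s) for s in sigmas]
    dogs = np.stack([b2 - b1 for b1, b2 in zip(blurs, blurs[1:])])
    # A pixel is an extremum if it equals the max (or min) over its
    # 3x3x3 neighborhood in (scale, y, x).
    is_max = dogs == maximum_filter(dogs, size=3)
    is_min = dogs == minimum_filter(dogs, size=3)
    # Only interior layers have scale neighbors on both sides.
    return np.argwhere((is_max | is_min)[1:-1])  # rows: (layer, y, x)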

4. SIFT vector formation. Thresholded image gradients are sampled over a 16x16 array of locations in scale space to create an array of orientation histograms: 8 orientations x a 4x4 histogram array = 128 dimensions.
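
A stripped-down sketch of that arithmetic, assuming a 16x16 patch of gradient magnitudes and angles (in radians) has already been extracted around the keypoint; real SIFT additionally Gaussian-weights the magnitudes and interpolates contributions between neighboring bins:

import numpy as np

def sift_descriptor(mag16, ang16):
    """16x16 gradient patch -> 4x4 cells x 8 orientation bins = 128-D."""
    bins = ((ang16 % (2 * np.pi)) // (np.pi / 4)).astype(int)  # 8 x 45 deg
    desc = np.zeros((4, 4, 8))
    for y in range(16):
        for x in range(16):
            desc[y // 4, x // 4, bins[y, x]] += mag16[y, x]
    vec = desc.ravel()
    return vec / (np.linalg.norm(vec) + 1e-8)  # normalize for illumination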

Keypoints. Sampled regions located at interest points; local descriptors invariant to scale and rotation. Local: robust to occlusion/clutter, no segmentation needed. Invariant: to image transformations and illumination changes.

SIFT Features. A very powerful method developed by David Lowe (Vancouver). Image content is transformed into local feature coordinates that are invariant to translation, rotation, scale, and other imaging parameters.
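
Lowe's detector ships with recent OpenCV releases as cv2.SIFT_create(); a minimal detect-and-match sketch in which the image file names are placeholders:

import cv2

img1 = cv2.imread("scene_a.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("scene_b.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Lowe's ratio test: keep a match only if it is clearly better than
# the second-best candidate for the same keypoint.
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.75 * n.distance]
print(len(good), "confident matches")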

SIFT

Example: K9 Science Rover

Example: K9 Science Rover s SIFT

4. Social Vision: State of the Art. Face detection and recognition; speech understanding; gesture understanding.

Face Detection. How would you detect faces in images?
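
One classical answer is the Viola-Jones cascade classifier bundled with OpenCV; a minimal sketch (the input image name is a placeholder):

import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
img = cv2.imread("people.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("people_faces.jpg", img)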

Expression Detection

First Person Vision

Speech and Gesture Understanding Time for some fun: http://www.youtube.com/watch?v=1s-piibzbhw