Vision-based User-interfaces for Pervasive Computing. CHI 2003 Tutorial Notes. Trevor Darrell, Vision Interface Group, MIT AI Lab


Table of Contents
- Biographical Sketch
- Agenda
- Objectives
- Abstract
- Introduction
- Finding Faces
- Tracking Pose
- Hand Gestures
- Full Body Interaction

Biographical Sketch
Prof. Trevor Darrell leads the Vision Interface Group at the MIT Artificial Intelligence Laboratory and holds an appointment in the Electrical Engineering and Computer Science Department. Prior to joining the MIT faculty in 1999, he was a member of the research staff at Interval Research Corp. in Palo Alto, CA. He received his PhD from the MIT Media Arts and Sciences Program in 1996. At the Media Lab he developed several interactive systems using real-time vision, including the ALIVE system for interaction with virtual worlds and systems for real-time hand gesture and facial expression recognition.

Agenda
14:00 Welcome and Overview
14:15 Responding to Faces
15:15 Tracking Hands and Gestures
16:00 Interacting with Arms
16:30 Brainstorming Activity
17:00 Privacy Issues
17:20 Conclusion

Perceptive User Interfaces
- Free users from desktop and wired interfaces
- Allow natural gesture and speech commands
- Give computers awareness of users
- Work in open and noisy environments
- Vision's role: provide perceptual context

Perceptual Context
- Who is there? (presence, identity)
- Which person said that? (audiovisual grouping)
- Where are they? (location)
- What are they looking / pointing at? (pose, gaze)
- What are they doing? (activity)
- Perceptual context should be provided across platforms (PDA, desktop, environment)

CV Face and Gesture Literature
- PUI Workshop
- FG conferences
- CVPR, ICCV, ICPR

Three Metaphors
- Perceptual displays
- Gloveless hand tracking
- Smart environments

Perceptually Aware Displays
- Camera associated with the display
- Display should respond to the user: font size, attentional load, passive acknowledgement
- e.g., Magic Mirror (Interval), Compaq's Smart Kiosk, ALIVE (MIT Media Lab)

Perception-based Manipulation
- Gloveless VR hand tracking
- Track hand position(s) in 3-D
- Usually a desktop-based interface
- Control a virtual character, navigation, etc.
- e.g., Wren and Pentland, MIT Media Lab

Intelligent Environments: From PUI to Pervasive Computing
- Integrate multiple perceptual algorithms; no single point of interaction (desktop/screen)
- Offices and homes with vision-based detection, ID, and tracking of occupants
- Speech interface to recognize commands and perform keyword indexing
- Applications: meeting recording; activity-dependent indexing; active videoconferencing; presence; abstract messaging; eldercare/childcare
- Microsoft EasyLiving Project [Shafer, Brumitt, Krumm et al.]

Other smart environment projects: GaTech AwareHome; MIT AI Lab Intelligent Room; MIT Media Lab; Facilitator (SRI)

Today's Topics
- Face Detection and Recognition
- Head Pose Estimation
- Eye Gaze Tracking
- Face Expression
- Hand Tracking
- Gesture Recognition
- Privacy Issues

Face/Body detection approaches: Silhouette, Flesh Color, Pattern

Classic Background Subtraction
- Background model: the background is assumed to be mostly static
- Each pixel is modeled by a Gaussian distribution in YUV space
- The model mean is usually updated with a recursive low-pass filter
- Given a new image, generate a silhouette by marking those pixels that differ significantly from the background value (a code sketch follows after the Finding Features notes below)

Finding Features
- 2D head / hands localization
- Contour analysis: mark extremal points (highest curvature or distance from the center of the body) as hand features
- Use a skin color model once a hand or face region is found (the color model is independent of flesh tone intensity)
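As a concrete illustration of the per-pixel Gaussian model above, here is a minimal sketch in Python with OpenCV and NumPy; the update gain alpha, the threshold k, and the variance handling are illustrative assumptions rather than the exact Pfinder formulation.

```python
# Minimal per-pixel Gaussian background model in YUV space (sketch only).
import cv2
import numpy as np

class BackgroundModel:
    def __init__(self, first_frame, alpha=0.02, k=3.0):
        yuv = cv2.cvtColor(first_frame, cv2.COLOR_BGR2YUV).astype(np.float32)
        self.mean = yuv.copy()              # per-pixel mean in YUV
        self.var = np.full_like(yuv, 25.0)  # per-pixel variance (initial guess)
        self.alpha = alpha                  # recursive low-pass filter gain
        self.k = k                          # threshold in standard deviations

    def apply(self, frame):
        yuv = cv2.cvtColor(frame, cv2.COLOR_BGR2YUV).astype(np.float32)
        diff = yuv - self.mean
        # Mark pixels that differ significantly from the background model
        dist2 = (diff ** 2 / (self.var + 1e-6)).sum(axis=2)
        silhouette = dist2 > self.k ** 2
        # Update mean and variance only where the pixel still looks like background
        bg = ~silhouette
        self.mean[bg] += self.alpha * diff[bg]
        self.var[bg] = (1 - self.alpha) * self.var[bg] + self.alpha * diff[bg] ** 2
        return silhouette.astype(np.uint8) * 255
```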

Static Background Modeling Examples [MIT Media Lab Pfinder / ALIVE System] (example segmentation images)

The ALIVE System (schematic): camera, video screen, user, autonomous agents

ALIVE
- Real sensing for a virtual world
- Tightly coupled sensing-behavior-action
- Vision routines: body/head/hand tracking
- System loop (schematic): camera -> vision -> behaviors / goals -> kinematics / rendering -> projector; user interacts with agents
[Blumberg, Darrell, Maes, Pentland, Wren, 1995]

General Background Modeling
- Outdoor analysis; richer model of per-pixel background variation
- MIT AI Lab VSAM project [Grimson and Stauffer]
- UMD W4 project [Davis]
- Key assumption: static background. How to deal with crowded environments and dynamic backgrounds?

Video-Rate Stereo
- Two cameras -> stereo range estimation; disparity is inversely proportional to depth
- Depth makes tracking people easy: segmentation, shape characterization, pose tracking
- Real-time implementations becoming commercially available
- Example pipeline: left and right images -> computed disparity -> grouping by local connectivity (a block-matching sketch follows below)
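A minimal disparity-to-depth sketch using OpenCV's block matcher is shown below; the file names, focal length, baseline, and depth band are placeholder assumptions, and rectified input is taken for granted.

```python
# Sketch of stereo range estimation with OpenCV block matching.
import cv2
import numpy as np

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right).astype(np.float32) / 16.0  # fixed-point -> pixels

# Depth is inversely proportional to disparity: Z = f * B / d
f, B = 500.0, 0.12   # assumed focal length (pixels) and baseline (meters)
valid = disparity > 0
depth = np.zeros_like(disparity)
depth[valid] = f * B / disparity[valid]

# Segment a person by depth range, then group by local connectivity
mask = ((depth > 0.5) & (depth < 3.0)).astype(np.uint8)
num_labels, labels = cv2.connectedComponents(mask)
```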

RGBZ input (example color + range image pairs)

Range Feature for ID
- Body shape characteristics, e.g., a height measure
- Normalize for motion/pose: median filter over time (examples: Trevor, Mike, Gaile)
- Near future: full vision-based kinematic estimation and tracking, an active research topic in many labs (a height-feature sketch follows below)
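The height feature could be sketched roughly as follows; the pinhole-scaling helper and the window length are assumptions for illustration, not the method used in the tutorial systems.

```python
# Sketch of a height feature from a depth silhouette, median-filtered over time.
from collections import deque
import numpy as np

class HeightFeature:
    def __init__(self, window=30):
        self.history = deque(maxlen=window)   # recent per-frame height estimates

    def update(self, silhouette, depth, pixel_to_meters=0.004):
        # silhouette: boolean person mask; depth: per-pixel range in meters.
        # pixel_to_meters ~ 1/f for an assumed pinhole camera with f = 250 px.
        rows = np.where(silhouette.any(axis=1))[0]
        if len(rows) == 0:
            return None
        height_px = rows.max() - rows.min()
        z = np.median(depth[silhouette])          # person's distance from the camera
        self.history.append(height_px * z * pixel_to_meters)
        return float(np.median(self.history))     # median over time suppresses pose changes
```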

Face/Body detection approaches: Silhouette, Flesh Color, Pattern

Flesh Color Tracking
- Often the simplest, fastest face detector!
- Initialize a region of hue space [Crowley, Coutaz, Berard, INRIA]

Color Processing
- Train a two-class classifier with examples of skin and not-skin
- Typical approaches: Gaussian, neural net, nearest neighbor
- Use features invariant to intensity: log color-opponent [Fleck et al.], i.e. (log(r) - log(g), log(b) - log((r+g)/2)); or hue and saturation

Flesh Color Tracking
- Can use the Intel OpenCV library's CAMSHIFT algorithm for robust real-time tracking (open source implementation available!) [Bradski, Intel] (a tracking sketch follows below)
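A hedged sketch of hue-histogram tracking with OpenCV's CamShift (the CAMSHIFT implementation mentioned above); the initial face box and the saturation/value mask limits are assumptions.

```python
# Sketch of hue-based face tracking with OpenCV's CamShift, assuming an
# initial face box (x, y, w, h) is already known (e.g., clicked by the user).
import cv2
import numpy as np

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
track_window = (200, 150, 80, 80)   # assumed initial face region
x, y, w, h = track_window

# Build a hue histogram of the initial skin region, masking dark/unsaturated pixels
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
roi = hsv[y:y + h, x:x + w]
mask = cv2.inRange(roi, (0, 60, 32), (180, 255, 255))
hist = cv2.calcHist([roi], [0], mask, [180], [0, 180])
cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)

term = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    backproj = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)  # skin-hue likelihood
    rot_box, track_window = cv2.CamShift(backproj, track_window, term)
    cv2.polylines(frame, [np.int32(cv2.boxPoints(rot_box))], True, (0, 255, 0), 2)
    cv2.imshow("camshift", frame)
    if cv2.waitKey(1) == 27:
        break
```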

Flesh Color Tracking
- MIT Media Lab's LAFTER: simultaneous face and lip hue tracking [Oliver and Pentland]

Color Feature for ID
- For long-term tracking / identification, measure hue and saturation values of hair and skin (examples: Gaile, Mike, Trevor)
- For same-day ID, use a histogram of the entire body / clothing (a histogram-comparison sketch follows below)
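A minimal sketch of same-day ID by whole-body hue/saturation histograms, assuming a silhouette mask from the earlier background subtraction step; the bin counts and the similarity threshold are illustrative choices.

```python
# Sketch: compare hue/saturation histograms of the whole body region.
import cv2

def body_histogram(frame_bgr, silhouette_mask):
    """2-D hue/saturation histogram of the masked body region, normalized."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], silhouette_mask, [30, 32], [0, 180, 0, 256])
    cv2.normalize(hist, hist, 0, 1, cv2.NORM_MINMAX)
    return hist

def same_person(hist_a, hist_b, threshold=0.7):
    # Correlation of the two histograms; above threshold suggests the same clothing/person
    return cv2.compareHist(hist_a, hist_b, cv2.HISTCMP_CORREL) > threshold
```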

Face/Body detection approaches: Silhouette, Flesh Color, Pattern

Pattern Recognition
- Face Detection: determine the location and size of any human face in the input image, given a greyscale patch (2-class) [Sung and Poggio; Rowley and Kanade]
- Face Recognition: compare an input face image against models in a library and report the best match (n-class) [Turk and Pentland; Cootes and Taylor; and many others]

Image Basics
A greyscale image is just a grid of intensity values, e.g. a small patch:
35 39 45 68 88
36 43 62 43 55
33 43 55 52 51
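As one concrete instance of the n-class recognition problem above, here is a minimal eigenface-style sketch in the spirit of Turk and Pentland; the gallery layout and component count are assumptions for illustration.

```python
# Sketch of eigenface-style recognition: project faces onto principal
# components and match by nearest neighbor in the reduced space.
import numpy as np

def train_eigenfaces(gallery, n_components=20):
    """gallery: (n_faces, h*w) array of flattened, aligned greyscale face patches."""
    mean = gallery.mean(axis=0)
    centered = gallery - mean
    # SVD of the centered data gives the principal components (eigenfaces)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    eigenfaces = vt[:n_components]
    coeffs = centered @ eigenfaces.T        # each face as a low-dimensional vector
    return mean, eigenfaces, coeffs

def recognize(face, mean, eigenfaces, coeffs):
    """Return the index of the closest gallery face in eigenface space."""
    projection = (face - mean) @ eigenfaces.T
    distances = np.linalg.norm(coeffs - projection, axis=1)
    return int(np.argmin(distances))
```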

Template Matching
- Classic approach: given an input image, compare the template image at various offsets
- Various distance metrics: MSE, correlation
- MSE: E(u, v) = sum over (x, y) of [ I(x+u, y+v) - T(x, y) ]^2
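The squared-difference search above maps directly onto OpenCV's matchTemplate; a minimal single-scale sketch, with placeholder file names:

```python
# Sketch of single-scale template matching with the squared-difference metric.
import cv2

image = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
template = cv2.imread("face_template.png", cv2.IMREAD_GRAYSCALE)

# E(u, v) for every offset (u, v); the minimum is the best match
scores = cv2.matchTemplate(image, template, cv2.TM_SQDIFF)
min_val, _, min_loc, _ = cv2.minMaxLoc(scores)
u, v = min_loc                      # top-left corner of the best-matching window
h, w = template.shape
cv2.rectangle(image, (u, v), (u + w, v + h), 255, 2)
```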

Multi-scale Search
- Search at multiple scales (and poses)
- Multiple templates, or a single template applied at multiple scales
- Image pyramid: decimate the image by a constant factor for efficient search (a pyramid-search sketch follows below)

Template Matching: Limitations
- Works for single (or similar) individuals, canonical pose and lighting
- Common extensions: pre-normalization, multiple templates, subfeatures. How to choose?
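A minimal sketch of single-template, multi-scale search over an image pyramid; the level count, the normalized-correlation metric, and the coordinate bookkeeping are illustrative choices, not a prescribed implementation.

```python
# Sketch of multi-scale template search over an image pyramid
# (cv2.pyrDown halves the image at each level).
import cv2

def pyramid_search(image, template, levels=4):
    """Return (score, (x, y), scale) of the best normalized-correlation match."""
    best = None
    scaled, scale = image, 1.0
    for _ in range(levels):
        if scaled.shape[0] < template.shape[0] or scaled.shape[1] < template.shape[1]:
            break
        scores = cv2.matchTemplate(scaled, template, cv2.TM_CCOEFF_NORMED)
        _, max_val, _, max_loc = cv2.minMaxLoc(scores)
        if best is None or max_val > best[0]:
            # Map the location back to full-resolution coordinates
            best = (max_val, (int(max_loc[0] / scale), int(max_loc[1] / scale)), scale)
        scaled = cv2.pyrDown(scaled)   # decimate by a factor of two
        scale *= 0.5
    return best
```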