Perceptual Interfaces. Matthew Turk's (UCSB) and George G. Robertson's (Microsoft Research) slides on perceptual interfaces

Transcription:

Perceptual Interfaces. Adapted from Matthew Turk's (UCSB) and George G. Robertson's (Microsoft Research) slides on perceptual interfaces

Outline: Why Perceptual Interfaces? Multimodal interfaces. Vision Based Interfaces (VBI). Examples.

Observation: Moore's Law has driven computer technology for decades. Exponential improvement in HW: 5 years ~ 10x improvement; 10 years ~ 100x improvement; 20 years ~ 10,000x improvement. (Chart: progress vs. time.) But there has been no Moore's Law for user interfaces! The result?

The result: the curse of the delta! (Chart: progress vs. time, with HW and SW curves.) Another view: there's no Moore's Law for people! (Chart: computing capacity and human capacity vs. time.)

Curse of the delta

Evolution of user interfaces
When | Implementation | Paradigm
1950s | Switches, punched cards | None
1970s | Command-line interface | Typewriter
1980s | Graphical UI (GUI) | Desktop
2000s | ??? | ???

Current UI Limitations. Failure to use human abilities: limited vision (flat, 2D); no speech; no gestures; limited audio; one hand tied behind back; limited tactile.

The Next Big Thing in UI? Immersive environments. Wearable computers, Virtual Reality, Augmented Reality. Ubiquitous Computing: invisible, pervasive. Tangible UI: coupling of physical objects and digital data. Multimodal UI: sound, speech, gesture. Affective Computing: computers that understand and express emotion.

Evolution of user interfaces
When | Implementation | Paradigm
1950s | Switches, punched cards | None
1970s | Command-line interface | Typewriter
1980s | Graphical UI (GUI) | Desktop
2000s | Perceptual UI (PUI)??? | Natural interaction???

Perceptual interfaces: highly interactive, multimodal interfaces modeled after natural human-to-human interaction. Goal: for people to be able to interact with computers in a similar fashion to how they interact with each other and with the physical world. Not just passive. Multiple modalities, not just mouse, keyboard, monitor.

Perceptual User Interfaces. Perceptive: human-like perceptual capabilities (what is the user saying, who is the user, where is the user, what is he doing?). Multimodal: people use multiple modalities to communicate (speech, gestures, facial expressions, ...). Multimedia: text, graphics, audio and video.

Perception. In order to respond appropriately, objects/room need(s) to pay attention to people and context. Machines have to be aware of their environment: who, what, when, where and why? Interfaces must be adaptive to the overall situation and to the individual user.

How Do The Pieces Fit? (Diagram labels: multimodal input, multimodal output, perceptive UI, multimedia, perceptual UI.)

Perceptual User Interfaces (PUI). Special section on PUIs in the March 2000 issue of Communications of the ACM, edited by Matthew Turk and George Robertson. PUIs combine natural human capabilities of communication, motor, cognitive, and perceptual skills with computer I/O devices, machine perception, and reasoning. Integrate research results from different disciplines: vision, speech, graphics and visualization, user modeling, haptics, and cognitive psychology.

Natural human interaction (diagram labels): sight, touch, sound, taste (?), smell (?); sensing/perception, cognitive skills, social skills, social conventions, shared knowledge, adaptation.

Perceptual interface (diagram labels): vision, user modeling, learning, speech, haptics, graphics, taste (?), smell (?); sensing/perception, cognitive skills, social skills, social conventions, shared knowledge, adaptation.

What are Multimodal Interfaces? Attempts to use human communication skills. Provide user with multiple modalities. May be simultaneous or not. Fusion vs. temporal constraints. Multiple styles of interaction.

Early example: Put-That-There (Bolt 1980). Speech and gestures used simultaneously.
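
The idea of fusing simultaneous speech and gesture can be sketched in a few lines. This is only an illustration under stated assumptions, not Bolt's implementation: the `SpokenWord` and `PointingSample` types, the `fuse` function, the deictic word list, and the 0.5 second window are all hypothetical. Each deictic word ("that", "there") is replaced by whatever the user was pointing at closest in time.

```python
from dataclasses import dataclass


@dataclass
class SpokenWord:
    text: str
    time: float  # seconds since the start of the utterance


@dataclass
class PointingSample:
    target: str  # what the pointing gesture was resolved to (hypothetical label)
    time: float  # seconds since the start of the utterance


# Hypothetical list of words that need a gesture to be interpreted.
DEICTIC_WORDS = {"that", "there", "this", "here"}


def fuse(words, gestures, max_gap=0.5):
    """Replace each deictic word with the pointing target nearest in time.

    A word is left unchanged when no gesture lies within max_gap seconds.
    """
    resolved = []
    for word in words:
        if word.text.lower() in DEICTIC_WORDS and gestures:
            nearest = min(gestures, key=lambda g: abs(g.time - word.time))
            if abs(nearest.time - word.time) <= max_gap:
                resolved.append(f"<{nearest.target}>")
                continue
        resolved.append(word.text)
    return resolved


# Made-up example data for illustration.
words = [SpokenWord("put", 0.0), SpokenWord("that", 0.4), SpokenWord("there", 1.2)]
gestures = [PointingSample("blue square", 0.45), PointingSample("upper left corner", 1.25)]
print(fuse(words, gestures))
```

Matching by nearest timestamp is the simplest possible temporal constraint; a real system would also have to handle recognition errors and gestures that slightly precede or follow the spoken word.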

Why Multimodal Interfaces? Today's interfaces fall far short of human capabilities. Higher bandwidth is possible. Different modalities excel at different tasks. Errors and disfluencies reduced. Multimodal interfaces are more engaging. Users perceive multiple things at once. Users do multiple things at once.

Motivation: Why PUIs? Many reasons, including: The "glorified typewriter" GUI model is too weak, too constraining, for the ways we will use computers in the future. One size doesn't fit all: diverse HCI requirements, from small mobile devices to larger powerful embedded devices. Transfer of natural, social skills: easy to learn. Simplicity: simple = natural, adaptive. Technology is coming: no longer deaf, dumb, and blind. To enable both control and awareness.

How could we do this? Develop and integrate various relevant technologies, such as: speech recognition; speech synthesis; natural language processing; vision (recognition and tracking); graphics, animation, visualization; haptic I/O; affective computing; tangible interfaces; sound recognition; sound generation; user modeling; conversational interfaces.

Detecting gesture

Being aware of the user

Natural navigation

There are many issues! What are the appropriate and most useful input/output modalities (vision, speech, haptic, taste, smell)? Is the event-based model appropriate? What is a perceptual event? Is there a useful, reliable subset? Non-deterministic events. Future progress (expanding the event set). Allocation of resources. Multiple goal management. Training, calibration. Quality and control of sensors. Environment restrictions. Privacy.

Issues (cont.): "On the Internet, nobody knows you're a dog." New Yorker, 5-Jul-1993, p. 61

Some PUI objections. Arguments against intelligent, adaptive, agent-based, and anthropomorphic interfaces. HCI should be characterized by: direct manipulation; predictable interactions; giving responsibility and a sense of accomplishment to users. Won't work; AI hard. Is 50% of HAL good enough?

Two major obstacles. Technology (the easy one): lots of researchers worldwide; increasing interest; consistent progress. The marketplace (the hard one). But there's growing convergence: hw/sw advances, commercial interest in biometrics, accessibility, recognition technologies, virtual reality, entertainment.

but still... not quite there yet... versus

Vision Based Interfaces (VBI). Visual cues are important in communication! Useful visual cues: presence; location; identity (and age, sex, nationality, etc.); facial expression; body language; attention (gaze direction); gestures for control and communication; lip movement; activity. VBI: using computer vision to perceive these cues.

Elements of VBI: hand tracking; hand gestures; arm gestures; head tracking; gaze tracking; lip reading; face recognition; facial expression; body tracking; activity analysis.

Some VBI application areas: accessibility, hands-free computing; game input; social interfaces; teleconferencing; improved speech recognition (speechreading); user-aware applications; intelligent environments; biometrics; movement analysis (medicine, sports).

MIT Media Lab 1990s

Perceptual Window. Hand and mouse form the dominant stream. Head is used as nondominant stream. Better than eye tracking. Fixation and saccades.

KidsRoom (Bobick et al 2000)

The technology. Tracking faces: tracking the whole face, lips, gaze, or focus of attention. Tracking bodies: person tracking. Combining audio info with lip tracking info.

Tracking of Human Faces. A face provides different functions: identification; perception of emotional expressions. Tracking of faces: lip-reading; eye/gaze tracking; facial action analysis / synthesis.

Color Based Face Tracking. Human skin-colors: cluster in a small area of a color space; skin-colors of different people mainly differ in intensity; variance can be reduced by color normalization; distribution can be characterized by a Gaussian model. Chromatic colors: $r = \frac{R}{R + G + B}$, $g = \frac{G}{R + G + B}$
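
A minimal sketch of the normalization and the Gaussian skin-color model described above, in plain Python. The function names, the sample pixel values, and the distance threshold of 9.0 are assumptions made for illustration; they are not taken from the slides. A pixel is classified as skin when the squared Mahalanobis distance of its chromaticity (r, g) from the fitted Gaussian is small.

```python
def chromaticity(pixel):
    """Map an (R, G, B) pixel to normalized chromatic colors (r, g)."""
    red, green, blue = pixel
    total = red + green + blue
    if total == 0:
        return (1.0 / 3.0, 1.0 / 3.0)  # black pixel: treat as neutral gray
    return (red / total, green / total)


def fit_gaussian(samples):
    """Fit a 2D Gaussian (mean and covariance) to (r, g) samples."""
    n = len(samples)
    mean_r = sum(s[0] for s in samples) / n
    mean_g = sum(s[1] for s in samples) / n
    var_rr = sum((s[0] - mean_r) ** 2 for s in samples) / n
    var_gg = sum((s[1] - mean_g) ** 2 for s in samples) / n
    cov_rg = sum((s[0] - mean_r) * (s[1] - mean_g) for s in samples) / n
    return (mean_r, mean_g), (var_rr, cov_rg, var_gg)


def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance, using the closed-form 2x2 inverse."""
    var_rr, cov_rg, var_gg = cov
    det = var_rr * var_gg - cov_rg ** 2
    dr = point[0] - mean[0]
    dg = point[1] - mean[1]
    return (var_gg * dr * dr - 2.0 * cov_rg * dr * dg + var_rr * dg * dg) / det


def is_skin(pixel, mean, cov, threshold=9.0):
    """Classify a pixel as skin if it lies close to the Gaussian model."""
    return mahalanobis_sq(chromaticity(pixel), mean, cov) <= threshold


# Hypothetical training pixels (made-up RGB values, not measured data).
skin_pixels = [
    (220, 170, 150), (200, 150, 130), (180, 130, 110),
    (150, 100, 85), (120, 80, 70), (240, 190, 170),
]
mean, cov = fit_gaussian([chromaticity(p) for p in skin_pixels])

print(is_skin((110, 85, 75), mean, cov))   # darker version of a training pixel
print(is_skin((30, 60, 200), mean, cov))   # saturated blue
```

Because (r, g) divides out the overall intensity, a pixel and a darker copy of it map to the same chromaticity, which is why the model does not need to be refitted per person. Refitting `mean` and `cov` on new samples is also cheap, which matches the claim that model parameters can be adapted quickly.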

Color Model. Advantages: very fast; orientation invariant; stable object representation; not person-dependent; model parameters can be quickly adapted. Disadvantages: environment dependent (light sources heavily affect color distribution).

Tracking Gaze and Focus of Attention. In meetings: to determine the addressee of a speech act; to track the participants' attention; to analyze who was in the center of focus; for meeting indexing / retrieval. Interactive rooms: to guide the environment's focus to the right application; to suppress unwanted responses. Virtual collaborative workspaces (CSCW). Human-Robot Cooperation. Cars (driver monitoring).

Head Pose Estimation. Model-based approaches: locate and track a number of facial features; compute head pose from 2D to 3D correspondences (Gee & Cipolla '94, Stiefelhagen et al. '96, Jebara & Pentland '97, Toyama '98). Example-based approaches: estimate new pose with function approximator; use face database to encode images (Pentland et al. '94).

Model-based Head Pose Estimation. Find correspondences between points in a 3D model and points in the image. Iteratively solve linear equation system to find pose parameters $(r_x, r_y, r_z, t_x, t_y, t_z)$. (Figure labels: feature tracking, pose estimation; image, 3D model, real world; axes X, Y, Z.)
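
One way to read "iteratively solve linear equation system" is a Gauss-Newton loop: linearize the projection around the current pose, solve the resulting linear system for a pose update, and repeat. The sketch below makes several assumptions that are not in the slides: a pinhole camera with a known focal length, a finite-difference Jacobian, made-up 3D feature coordinates for the head model, and hypothetical function names throughout. It is not the method of any of the cited papers.

```python
import math


def rotate(point, rx, ry, rz):
    """Rotate a 3D point about the x, then y, then z axis (radians)."""
    x, y, z = point
    y, z = y * math.cos(rx) - z * math.sin(rx), y * math.sin(rx) + z * math.cos(rx)
    x, z = x * math.cos(ry) + z * math.sin(ry), -x * math.sin(ry) + z * math.cos(ry)
    x, y = x * math.cos(rz) - y * math.sin(rz), x * math.sin(rz) + y * math.cos(rz)
    return x, y, z


def project(pose, model_points, focal):
    """Pinhole projection of posed model points; returns [u0, v0, u1, v1, ...]."""
    rx, ry, rz, tx, ty, tz = pose
    coords = []
    for point in model_points:
        x, y, z = rotate(point, rx, ry, rz)
        coords += [focal * (x + tx) / (z + tz), focal * (y + ty) / (z + tz)]
    return coords


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def estimate_pose(model_points, image_points, focal, start, iterations=25, eps=1e-6):
    """Gauss-Newton estimate of (rx, ry, rz, tx, ty, tz) from 2D-3D matches."""
    pose = list(start)
    for _ in range(iterations):
        predicted = project(pose, model_points, focal)
        residual = [o - p for o, p in zip(image_points, predicted)]
        # Jacobian columns by forward finite differences, one per parameter.
        columns = []
        for k in range(6):
            shifted = pose[:]
            shifted[k] += eps
            moved = project(shifted, model_points, focal)
            columns.append([(a - b) / eps for a, b in zip(moved, predicted)])
        # Normal equations (J^T J) delta = J^T residual: the linear system.
        jtj = [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]
        jtr = [sum(a * r for a, r in zip(ci, residual)) for ci in columns]
        delta = solve_linear(jtj, jtr)
        pose = [p + d for p, d in zip(pose, delta)]
        if max(abs(d) for d in delta) < 1e-9:
            break
    return pose


# Hypothetical head model (millimetres): eyes, nose tip, mouth corners, chin, ears.
model_points = [(-30, 30, 0), (30, 30, 0), (0, 0, 40), (-20, -30, 5),
                (20, -30, 5), (0, -65, -5), (-70, 10, -80), (70, 10, -80)]
true_pose = [0.1, -0.2, 0.05, 5.0, -3.0, 400.0]
observed = project(true_pose, model_points, focal=500.0)  # synthetic "tracked" features
estimated = estimate_pose(model_points, observed, 500.0, start=[0, 0, 0, 0, 0, 500.0])
print([round(value, 4) for value in estimated])
```

The example recovers a pose from synthetic, noise-free feature positions starting from a frontal guess. With real tracked features the residual would not go to zero, and some form of damping or outlier handling would likely be needed.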

Head tracking demo

Person Tracking. Vision-based localization of people/objects. Single perspective; multiple perspective.

More examples. Some applications from the UCSB Four Eyes lab. 4 I's: Imaging, Interaction, and Innovative Interfaces. Research in computer vision and human-computer interaction: vision based and multimodal interfaces; augmented reality and virtual environments; multimodal biometrics; wearable and mobile computing; 3D graphics.

1. Coarse face direction. Problem: coarsely track multiple, possibly low-resolution face images in a scene. Goal: capture group behavior (attention); real-time. Estimate the Focus of Intention (attention + semantics). Action understanding; meeting annotation; audience feedback; videoconferencing; etc.

Coarse face direction (cont.). Strategy: fast color-based skin tracking; simple feature location; non-skin areas; simple statistics. Look for correlation with head direction (relative to camera): f(statistical measures) = direction.

Example results

2. Facial expression analysis. Facial expression representation and visualization. Use non-linear manifolds to represent dynamic facial expressions. Intuition: the images of all facial expressions by a person make a smooth manifold in (high-dimensional) image space, with the neutral face as the central reference point.

3. Hand detection, tracking, and recognition. Robust single-view detection. View-dependent posture recognition.

Hand tracking demo

4. Recognizing body gestures and activity. Current: real-time tracking for interactive digital art applications and for autonomous aircraft on a carrier flight deck. Restricted EM algorithm for skin classification. Head and hand/arm tracking.