Telling What-Is-What in Video. Gerard Medioni

Telling What-Is-What in Video. Gerard Medioni, medioni@usc.edu

Tracking. An essential problem: establishing correspondences between elements in successive frames. The basic problem is easy.
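The easy version of the correspondence problem can be sketched as nearest-neighbour matching of point positions between two frames. This is only an illustration, not the method used in the talk; the function name `match_nearest` and the gating distance `max_dist` are assumptions introduced here.

```python
import math


def match_nearest(prev_points, curr_points, max_dist=20.0):
    """Greedy one-to-one matching of points between two successive frames.

    prev_points, curr_points: lists of (x, y) positions.
    max_dist: hypothetical gating distance; pairs farther apart are ignored.
    Returns a dict mapping index in prev_points -> index in curr_points.
    """
    candidates = []
    for i, (px, py) in enumerate(prev_points):
        for j, (cx, cy) in enumerate(curr_points):
            d = math.hypot(cx - px, cy - py)
            if d <= max_dist:
                candidates.append((d, i, j))

    # Accept the closest pairs first; each point is used at most once.
    candidates.sort()
    matches = {}
    used_curr = set()
    for _, i, j in candidates:
        if i not in matches and j not in used_curr:
            matches[i] = j
            used_curr.add(j)
    return matches
```

With few, well-separated objects this is usually enough. The issues listed on the following slides (many similar objects, non-rigid motion, clutter) are exactly the cases where such a simple rule breaks down.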

Many issues. One target (pursuit) vs. a few objects vs. lots of objects.

More issues: motion type. Rigid; articulated; non-rigid (facial expression).

Tag & Track: the problem. Select any object and follow it in real time. Object tracking problem. Current work.

Challenges. Unknown type of object; changes in viewpoint; changes in lighting; cluttered background; running time vs.

Context Tracker: motivation. Context information is overlooked because of the online processing requirement and the speed trade-off. Existing trackers focus on building an appearance model and do not take advantage of background information, which requires a very complicated model when similar objects appear. They also treat every region of the background in the same way. Proposal: explore distracters and pay more attention to them.

Context Tracker: motivation. What else to explore? Supporters!

Context Tracker (diagram). Blocks: new input image, short-term tracking, detection, detector, tracking loop, online model evaluation, distance...

Context Tracker: distracters. Detection: a distracter passes the classifier (it shares the same classifier as the target) with high confidence (it looks similar to our object). Tracking: same as tracking our target, but a distracter is killed when it is lost or looks different from our target. Heuristic data association: the higher the confidence, the higher the priority in the association queue.
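The confidence-ordered association heuristic might look roughly like the sketch below. The slide does not specify the data structures, so the track dictionaries, the nearest-detection rule, and the gating distance `max_dist` are all assumptions made for illustration.

```python
import math


def associate_by_confidence(tracks, detections, max_dist=30.0):
    """Assign detections to tracks, serving high-confidence tracks first.

    tracks: list of dicts with hypothetical keys "id", "center", "confidence".
    detections: list of (x, y) detection centers in the current frame.
    max_dist: hypothetical gate; detections at or beyond it are not assigned.
    Returns a dict mapping track id -> detection index.
    """
    # The association queue: highest confidence is served first.
    queue = sorted(tracks, key=lambda t: t["confidence"], reverse=True)
    free = set(range(len(detections)))
    assignment = {}
    for track in queue:
        tx, ty = track["center"]
        best, best_d = None, max_dist
        for j in sorted(free):
            dx, dy = detections[j]
            d = math.hypot(dx - tx, dy - ty)
            if d < best_d:
                best, best_d = j, d
        if best is not None:
            assignment[track["id"]] = best
            free.discard(best)
    return assignment
```

A track that receives no detection stays unassigned here; in the scheme described above, a distracter in that situation would be a candidate for being killed.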

Context Tracker: experiment settings. 8 ferns and 4 6bitBP features. Minimum search region: 20x20. Maximum number of distracters: 15; maximum number of supporters: 40. System: 3.0 GHz (one core), 8 GB memory. Runs at 10-25 fps, depending on the number of distracters and supporters.

Active Surveillance. Combine a real-time tracker and camera control to keep the object of interest in the camera's field of view and to zoom in (on the face).

Challenges. Tracking: unknown type of object; changes in viewpoint; changes in lighting; cluttered background; running time. Control: limited support from commercial cameras, which offer only discrete speed control because they use stepping motors; delay from communication over a TCP/IP network; abrupt motion and motion blur.

Practical issues: challenges. Pedestrians are far away, so the face covers only a few pixels (shown at 100% crop). At long focal lengths, people may leave the field of view (FOV) with only a little movement.
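The effect of focal length on the field of view can be made concrete with the pinhole camera relation FOV = 2 atan(w / (2 f)), where w is the sensor width and f the focal length. The sketch below is only a worked illustration; the sensor width and distances in the usage note are hypothetical values, not the parameters of the camera used in the talk.

```python
import math


def horizontal_fov_deg(sensor_width_mm, focal_length_mm):
    """Horizontal field of view of an ideal pinhole camera, in degrees."""
    return math.degrees(2.0 * math.atan(sensor_width_mm / (2.0 * focal_length_mm)))


def footprint_width_m(sensor_width_mm, focal_length_mm, distance_m):
    """Width of the scene covered by the image at a given distance, in metres.

    Follows from similar triangles: width / distance = sensor_width / focal_length.
    """
    return distance_m * sensor_width_mm / focal_length_mm
```

For example, with an assumed 4.8 mm wide sensor and a subject 50 m away, `footprint_width_m(4.8, 4.0, 50.0)` covers about 60 m of the scene, while `footprint_width_m(4.8, 100.0, 50.0)` covers about 2.4 m, so a step or two sideways is enough to leave the frame.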

Overview: tracking control loop (diagram). Blocks: pedestrian detector, camera control, face detector, camera control, tracker, decision "Face tracked?" (yes/no). Output: tagged high-resolution face sequences.
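One possible reading of this loop is a small state machine that moves from pedestrian detection to face detection to face tracking, and falls back when a stage fails. The stage names and, in particular, the assumption that every failure returns to pedestrian detection are guesses made for this sketch; the diagram itself does not spell out where the "No" branch leads.

```python
def next_stage(stage, found):
    """Return the next stage given whether the current stage succeeded.

    Stage names are hypothetical labels for the blocks in the diagram.
    Assumption: any failure sends the loop back to pedestrian detection.
    """
    transitions = {
        # stage: (stage on success, stage on failure)
        "pedestrian_detection": ("face_detection", "pedestrian_detection"),
        "face_detection": ("face_tracking", "pedestrian_detection"),
        "face_tracking": ("face_tracking", "pedestrian_detection"),
    }
    if stage not in transitions:
        raise ValueError(f"unknown stage: {stage}")
    on_success, on_failure = transitions[stage]
    return on_success if found else on_failure


def run(observations, stage="pedestrian_detection"):
    """Replay a sequence of per-frame success flags and record the stages visited."""
    history = [stage]
    for found in observations:
        stage = next_stage(stage, found)
        history.append(stage)
    return history
```

In a real system each transition would also issue a camera command (pan and tilt toward the pedestrian, then zoom toward the face); those steps are left out here because the slides do not describe the control interface.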

Experimental setup: settings. Sony PTZ network camera SNC-RZ30N with wireless card. 14 levels of speed control for panning and 18 levels for tilting. 25x optical zoom, 300x digital zoom. Pan angle: -170 to +170 degrees. Tilt angle: -90 to +25 degrees.
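Because the camera accepts only a handful of speed levels, a controller has to round the speed it would like to command to the nearest available level. A minimal sketch follows, assuming the levels are evenly spaced between zero and a maximum speed; both that spacing and the maximum speed used in the usage note are assumptions, not documented properties of the camera.

```python
def quantize_speed(desired_deg_per_s, max_deg_per_s, levels):
    """Map a desired angular speed to a (direction, level) command.

    direction is -1, 0 or +1; level is an integer in 0..levels.
    Assumes evenly spaced levels, which may not match the real camera.
    """
    if levels < 1 or max_deg_per_s <= 0:
        raise ValueError("levels must be >= 1 and max_deg_per_s must be > 0")
    # Clamp to the achievable range, then round to the nearest level.
    magnitude = min(abs(desired_deg_per_s), max_deg_per_s)
    level = round(magnitude / max_deg_per_s * levels)
    if level == 0:
        return 0, 0
    direction = 1 if desired_deg_per_s > 0 else -1
    return direction, level
```

With 14 pan levels and a hypothetical maximum of 100 degrees per second, `quantize_speed(-50.0, 100.0, 14)` yields direction -1 at level 7. The gap between the desired and the commanded speed is one source of the control difficulty noted in the challenges above.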

Results

Tracking from a security PTZ camera at USC. The face cannot be seen in the 100% cropped image. Stages shown: pedestrian detector, zooming (11x), tracking, face track, frontal face detector.

Tracking many objects. Useful for persistent surveillance: WAAS (Wide Area Aerial Surveillance). Very large images (60 MPix to 1 GPix). 2 frames per second.

Video Stabilization

Video Stabilization: results, close-up

Tracking: motivation. Moving objects tell us a lot about life in the geographic area. Important for activity recognition. Challenges: small number of pixels on target; large number of targets.

Approach. Goal: infer tracklets, each representing one object, over a sliding window of frames. Window: 4-8 seconds (depends on frame rate). Input: object detections (from background subtraction or otherwise).
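The sliding-window bookkeeping can be sketched as follows. It only shows how a window length in seconds turns into a number of frames and how successive windows overlap; the tracklet inference itself is not shown, and the function names are introduced here for illustration.

```python
def frames_per_window(window_seconds, fps):
    """Number of frames in a window of the given duration (at least one)."""
    return max(1, round(window_seconds * fps))


def sliding_windows(detections_per_frame, window_frames):
    """Yield overlapping windows of per-frame detections, advancing one frame at a time."""
    frames = list(detections_per_frame)
    for start in range(0, len(frames) - window_frames + 1):
        yield frames[start:start + window_frames]
```

At the 2 frames per second quoted for wide-area aerial imagery, a 4-8 second window corresponds to 8-16 frames.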

Results (CLIF 2006)

Tracking Results (CLIF 2006)

Object Detection Rate: 0.72
False Alarm Rate: 0.04
Normalized Track Fragmentation: 1.01
ID Consistency: 0.84

Manually generated ground truth: 168 tracks, 80 frames. Low track fragmentation. Low false alarm rate. Efficient: > 40 objects tracked at 2 fps.

Comparison with MCMC tracker (Yu 2009): did not converge to a reasonable solution; requires good initialization; does not scale to our domain.

Tracking VERY MANY Objects. With the development of surveillance systems, more and more attention will be paid to analyzing people in crowded scenes (sports, political gatherings, etc.).

Crowded Scenes: challenges. Hundreds of similar objects; cluttered background; small object size; occlusions. The detect-then-track method fails: both the appearance-based detector and the background-modeling-based motion blob detector fail.

Tracking Using Motion Patterns for Very Crowded Scenes. We solve the problem of tracking in structured crowded scenes using the Motion Structure Tracker (MST). MST is a combination of visual tracking, motion pattern learning, and multi-target tracking. In MST, tracking and detection are performed jointly, and motion pattern information is integrated in both steps to enforce the scene structure constraint. MST is initially used to track a single target, and is further extended to solve a simplified version of the multi-target tracking problem.

An Overview of Motion Structure Tracker (diagram). Labels: online unsupervised learning, motion pattern inference, tag, single target tracking, detect similar, multi-target tracking, online tracking (detection & tracking), first frame (detection & tracking), input.

Tag & Track: Motion Structure Tracker for Single Target Tracking. Results for temporally stationary scenes (motion patterns do not change with time).

Sequence     Method        ATR      ACLE
Marathon-1   IVT Tracker   35.21%   62.8
Marathon-1   P-N Tracker   56.16%   35.1
Marathon-1   Ours          81.40%   6.7
Marathon-2   IVT Tracker   33.47%   86.5
Marathon-2   P-N Tracker   68.60%   56.4
Marathon-2   Ours          73.12%   28.5
Marathon-3   IVT Tracker   40.03%   64.1
Marathon-3   P-N Tracker   67.16%   33.9
Marathon-3   Ours          92.08%   4.8

ATR: Average Track Ratio. ACLE: Average Center Location Error.

Motion Structure Tracker for Single Target Tracking. Results for temporally non-stationary scenes (motion patterns change with time).

Sequence     Method        ATR      ACLE
Hongkong     IVT Tracker   27.63%   58.9
Hongkong     P-N Tracker   39.58%   42.3
Hongkong     Ours          62.31%   28.5
Motorbike    IVT Tracker   31.56%   69.7
Motorbike    P-N Tracker   47.22%   55.4
Motorbike    Ours          90.75%   5.6

ATR: Average Track Ratio. ACLE: Average Center Location Error.
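The slides do not define ACLE beyond its name. A natural reading, assumed here, is the mean Euclidean distance between the tracked center and the ground-truth center over all frames, presumably in pixels. The sketch below implements that reading only; ATR is left out because its definition is not given.

```python
import math


def average_center_location_error(predicted, ground_truth):
    """Mean Euclidean distance between predicted and ground-truth centers.

    predicted, ground_truth: equal-length lists of (x, y) centers, one per frame.
    This is an assumed definition of ACLE, not one stated in the source.
    """
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(
        math.hypot(px - gx, py - gy)
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
    )
    return total / len(predicted)
```

Under this reading, lower values are better, which matches the tables above: the method with the highest ATR in each sequence also has the lowest ACLE.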

Motion Structure Tracker for Multi-Target Tracking. Once a user labels a target in the first frame, find similar objects and track all of them. Figure: examples of tracking results comparison (ours, P-N Tracker, ground truth). First row: temporally stationary scenes (frames 1, 71, 141, 211). Second row: temporally non-stationary scenes (frames 1, 31, 61, 91).

Expression Analysis. Understanding facial gestures by analyzing facial motions. Facial motion induces detectable appearance changes. Two classes of facial motions: (1) global, rigid head motion, from head pose variation, indicating the subject's attention; (2) local, non-rigid facial deformations, from facial muscle activation, indicating the subject's expression.

Overview (diagram). Blocks: face sequences, facial deformations, head pose, training database, recognition and interpretation, expressions and facial gestures.

Results (rigid tracking, real-time). Rotation, translation, and scale; fast motion; live webcam.

Expression Analysis

Summary. Tracking is a multi-faceted problem with many axes of complexity: resolution, number of objects, type of motion. Significant progress is being achieved.