Jane Li. Assistant Professor Mechanical Engineering Department, Robotic Engineering Program Worcester Polytechnic Institute

(2 pts) How to avoid obstacles when reproducing a trajectory using a learned DMP?
(2 pts) How to synchronize the motions reproduced using DMP across multiple DOFs?
(3 pts) Describe the phase matching problem in human-robot interaction learning
(3 pts) Compared to DTW and GMM/GMR, what are the advantages of ProMP?

RBE 595 Synergy of Human and Robotic Systems. Instructor: Jane Li, Mechanical Engineering Department & Robotic Engineering Program, WPI. 11/28/2017

Observes human partner's motion
  Sparse
Predict end user's motion
  Prediction may vary by fitting sparse data to variants of a model that differ by temporal scaling
Generate motions that match
  Wrong prediction leads to mismatch between human and robot motions
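
The fitting step described above can be illustrated with a small sketch. It assumes a hypothetical learned mean trajectory (`reference`, a sine bump indexed by phase) and a hand-picked list of candidate durations; neither comes from the lecture. The sketch picks the temporal scaling whose model best explains a few sparse observations.

```python
import math

def reference(phase):
    """Hypothetical learned mean trajectory, indexed by phase in [0, 1]."""
    clamped = min(max(phase, 0.0), 1.0)
    return math.sin(math.pi * clamped)

def fit_duration(observations, candidate_durations):
    """Pick the duration whose time-scaled model best fits sparse (t, y) samples."""
    def squared_error(duration):
        return sum((y - reference(t / duration)) ** 2 for t, y in observations)
    return min(candidate_durations, key=squared_error)

# Sparse observations of a partner who moves with a true duration of 2.0 s
true_duration = 2.0
obs = [(t, reference(t / true_duration)) for t in (0.2, 0.5, 0.8)]
best = fit_duration(obs, [1.0, 1.5, 2.0, 2.5, 3.0])
print(best)
```

With very few samples, more than one scaling can fit comparably well, which is the source of the mismatch risk noted on the slide.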

Dynamic time warping (DTW)
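
As a reminder of how DTW aligns a fast and a slow trajectory, here is a minimal sketch of the textbook dynamic-programming formulation for 1-D sequences. The function name `dtw`, the absolute-difference cost, and the sample trajectories are illustrative assumptions, not material from the lecture.

```python
def dtw(a, b):
    """Return (total cost, warping path) aligning 1-D sequences a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end to recover which samples were matched
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path

fast = [0.0, 1.0, 2.0, 1.0, 0.0]
slow = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 1.0, 0.0, 0.0]
total, path = dtw(fast, slow)
print(total, path)
```

Note that the alignment is pairwise: one sequence has to serve as the reference for the other.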

Represent averaged behavior? No. Pairwise matching
How to represent variability? No. Pairwise matching
How to align trajectory? Yes. Align fast and slow trajectory (deterministic, need to choose a unique reference)
How to match phase in the learning of human-robot interaction? Yes.

Represent averaged behavior? Yes
How to represent variability? Yes
How to align trajectory? No. Cannot align fast or slow motions
How to match phase in the learning of human-robot interaction? No.

Represent averaged behavior? Yes
How to represent variability? Yes
How to align trajectory? Yes. Align motions by phase matching
How to match phase in the learning of human-robot interaction? Yes.

Heramb Nemlekar (selected for presentation)
Gunner Hover
Aishwary Jagetia
Sanjuksha Nirgude (selected for presentation)

Heramb Nemlekar
Tess Meier (selected for presentation)
Sihui Li
Aishwary Jagetia (selected for presentation)

Student presentation
  Spotlight talk
  Each one gives a 5-min talk
  Interactive session for Q&A after all the presentations
  Publish good examples of presentation and review on Canvas
Grade boosting for presentation
  You can choose a low-grade assignment/quiz to replace with full score
  Let our TA know which you prefer to replace, by Wednesday this week

How actions derived from low-level learning can be used to learn higher level tasks

What can be learned at high level?
  State-action mapping function, i.e., policy
  Task plan, objectives, features
  Reference frame, affordance
How to learn?
  Supervised/unsupervised learning
  Reinforcement learning

Action Primitives = MP or sequence of MP
  E.g., Reach-To, Pick-Up, Move-Forward, etc.
  Can be hand-coded, developed using planner, learned from demos
  Often parameterized

Explicit task goals
  Pre- and post-condition of action primitives
  A particular configuration of an object
Implicit task goals = Reward function
  Sparse/dense reward
  Learn reward function? IOC, IRL

Input state → output action
Demonstration = state-action pairs
Learning policy
Objectives underlying the policy? Don't care

Training process
  Learn state transition function as hierarchical classifier over features
  [Diagram: Current State and Features → State Transition Function → Next State]

Goal is out of view

Lazy learning algorithms
  Memorize demonstrated action in a robot state
  In a new state, search for similar old states and apply the corresponding action
  Used for learning navigation from demonstrations
Method for measuring similarity?
  KNN [4]
  Case-based reasoning [5,6]
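
A minimal sketch of the lazy-learning idea with a k-nearest-neighbour lookup follows. The 2-D state features, the action labels, and the Euclidean similarity measure are illustrative assumptions and are not taken from the cited papers.

```python
import math
from collections import Counter

def knn_action(demos, state, k=3):
    """Return the majority action among the k demonstrated states closest to `state`."""
    nearest = sorted(demos, key=lambda d: math.dist(d[0], state))[:k]
    votes = Counter(action for _, action in nearest)
    return votes.most_common(1)[0][0]

# Memorized demonstrations: (state features, demonstrated action), all hypothetical
demos = [
    ((0.0, 1.0), "forward"),
    ((0.1, 0.9), "forward"),
    ((1.0, 0.2), "turn_left"),
    ((0.9, 0.1), "turn_left"),
    ((0.5, 0.5), "forward"),
]
print(knn_action(demos, (0.05, 0.95)))
```

Nothing is trained ahead of time: all the work happens at query time, when the new state is compared with the memorized ones.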

Observe and memorize a sequence of states (sub-goals)
Pick the action that maximizes the chance of taking the agent from the current state to the memorized next state
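
The selection rule above can be sketched as an argmax over a transition model. The table `transition_prob`, the state names, and the actions are hypothetical; in practice the probabilities would have to be estimated, for example from interaction counts.

```python
def choose_action(transition_prob, state, next_subgoal, actions):
    """Pick the action most likely to move `state` to `next_subgoal`."""
    return max(actions, key=lambda a: transition_prob.get((state, a, next_subgoal), 0.0))

def follow_subgoals(transition_prob, subgoals, actions):
    """Return the action chosen at each step along the memorized sub-goal sequence."""
    return [
        choose_action(transition_prob, s, s_next, actions)
        for s, s_next in zip(subgoals, subgoals[1:])
    ]

# Hypothetical model: P(next_state | state, action), keyed by (state, action, next_state)
transition_prob = {
    ("start", "forward", "door"): 0.8,
    ("start", "turn_left", "door"): 0.1,
    ("door", "forward", "goal"): 0.3,
    ("door", "turn_left", "goal"): 0.6,
}
subgoals = ["start", "door", "goal"]
chosen = follow_subgoals(transition_prob, subgoals, ["forward", "turn_left"])
print(chosen)
```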

Essentially, a classification/regression problem
Estimate classification confidence
  Integrate a measure of confidence in classification/regression
  Address the uncertainty in action
Methods for estimating classification confidence
  Bayesian methods [8]
  Confidence-Based Autonomy algorithm [9]
  Locally Weighted Projection Regression [10]
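
To make the role of confidence concrete, here is a simplified sketch: a nearest-neighbour classifier reports the fraction of agreeing neighbours as its confidence, and the robot asks for a demonstration when that value falls below a threshold. This illustrates thresholding in general; it is not the Confidence-Based Autonomy algorithm of [9], and the vote-fraction confidence, the threshold of 0.8, and the data are all assumptions.

```python
import math
from collections import Counter

def classify_with_confidence(demos, state, k=3):
    """Return (action, confidence), where confidence is the share of agreeing neighbours."""
    if not demos:
        raise ValueError("at least one demonstration is required")
    nearest = sorted(demos, key=lambda d: math.dist(d[0], state))[:k]
    votes = Counter(action for _, action in nearest)
    action, count = votes.most_common(1)[0]
    return action, count / len(nearest)

def act_or_ask(demos, state, threshold=0.8, k=3):
    """Act autonomously when confident; otherwise request a demonstration."""
    action, confidence = classify_with_confidence(demos, state, k)
    if confidence < threshold:
        return "request_demonstration", confidence
    return action, confidence

# Hypothetical demonstrations: (state features, demonstrated action)
demos = [
    ((0.0, 1.0), "forward"),
    ((0.1, 0.9), "forward"),
    ((1.0, 0.2), "turn_left"),
    ((0.9, 0.1), "turn_left"),
    ((0.5, 0.5), "forward"),
]
print(act_or_ask(demos, (0.05, 0.95)))
print(act_or_ask(demos, (0.95, 0.15)))
```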

Represent desired robot behavior as a plan
Generalized state-action mapping
  State: Pre-condition, post-condition
  Action: Sequence of actions between initial and goal states
Underlying objectives = goal state
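
One way to picture a plan over primitives with pre- and post-conditions is a small breadth-first search. The sketch below reuses the primitive names from the earlier slide (Reach-To, Pick-Up, Move-Forward), but the facts attached to them are invented for illustration, and the search is a generic planner rather than a method presented in the lecture.

```python
from collections import deque

# Hypothetical primitives: name -> (preconditions, facts added, facts removed)
PRIMITIVES = {
    "Reach-To": (frozenset({"hand_empty"}), frozenset({"at_object"}), frozenset()),
    "Pick-Up": (
        frozenset({"at_object", "hand_empty"}),
        frozenset({"holding"}),
        frozenset({"hand_empty"}),
    ),
    "Move-Forward": (frozenset({"holding"}), frozenset({"at_goal"}), frozenset({"at_object"})),
}

def plan(initial, goal):
    """Breadth-first search for a primitive sequence from initial facts to goal facts."""
    start, goal = frozenset(initial), frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, actions = queue.popleft()
        if goal <= state:  # every goal fact holds in this state
            return actions
        for name, (pre, add, delete) in PRIMITIVES.items():
            if pre <= state:  # primitive is applicable
                nxt = (state - delete) | add
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, actions + [name]))
    return None  # no sequence of primitives reaches the goal

print(plan({"hand_empty"}, {"holding", "at_goal"}))
```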

Bayesian models
Finite-state Automaton

More sparse states
  Initial states
  Goal states
What if something goes wrong in-between?
  Provide additional demonstration as correction
  Iterative and incremental learning

Re-do parts of the demo
Re-segment old data
Add new corrections and rebuild FSA
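
The rebuild step can be sketched as pooling the original demonstrations with the corrections and regenerating the automaton's transitions. The sketch assumes segmentation has already produced label sequences; the labels (`reach`, `grasp`, `regrasp`, and so on) and the `build_fsa` helper are hypothetical.

```python
from collections import defaultdict

def build_fsa(demonstrations):
    """Build successor lists from sequences of segment labels."""
    successors = defaultdict(set)
    for demo in demonstrations:
        for current, nxt in zip(demo, demo[1:]):
            successors[current].add(nxt)
    return {state: sorted(targets) for state, targets in successors.items()}

# Original demonstration, already segmented into labelled states
demos = [["reach", "grasp", "lift", "place"]]
fsa = build_fsa(demos)

# Correction: the teacher re-does the part after a failed grasp
corrections = [["grasp", "regrasp", "lift"]]
fsa = build_fsa(demos + corrections)
print(fsa)
```

Rebuilding from the pooled data keeps the old transitions and adds the corrective branch, which matches the iterative and incremental flavour of the approach.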

Read Section 5.5 Learning task features
Prepare 7-10 presentation slides
  Digest over multiple papers
To reflect your understanding
  Add notes to your presentation slides, or
  Submit 2-page review

[1] Sullivan, Keith, and Sean Luke. "Hierarchical multi-robot learning from demonstration." Proceedings of the Robotics: Science and Systems Conference. 2011.
[2] Sullivan, Keith, Sean Luke, and Vittoria Amos Ziparo. "Hierarchical learning from demonstration on humanoid robots." Proceedings of Humanoid Robots Learning from Human Interaction Workshop. Vol. 38. 2010.

[4] Saunders, J., Nehaniv, C. L., & Dautenhahn, K. (2006, March). Teaching robots by moulding behavior and scaffolding the environment. In Proceedings of the 1st ACM SIGCHI/SIGART conference on Human-robot interaction (pp. 118-125).
[5] Likhachev, M., & Arkin, R. C. (2001). Spatio-temporal case-based reasoning for behavioral selection. In Robotics and Automation, 2001. Proceedings 2001 ICRA. IEEE International Conference on (Vol. 2, pp. 1627-1634).
[6] Ros, R., De Màntaras, R. L., Arcos, J. L., & Veloso, M. (2007, August). Team playing behavior in robot soccer: A case-based reasoning approach. In ICCBR (Vol. 2007, pp. 46-60).

[7] Rao, Rajesh PN, Aaron P. Shon, and Andrew N. Meltzoff. "11 A Bayesian model of imitation in infants and robots." (2007).
[8] Lockerd, A., & Breazeal, C. (2004, September). Tutelage and socially guided robot learning. In Intelligent Robots and Systems, 2004 (IROS 2004). Proceedings. 2004 IEEE/RSJ International Conference on (Vol. 4, pp. 3475-3480).
[9] Chernova, Sonia, and Manuela Veloso. "Interactive policy learning through confidence-based autonomy." Journal of Artificial Intelligence Research 34.1 (2009): 1
[10] Grollman, Daniel H., and Odest Chadwicke Jenkins. "Dogged learning for robots." Robotics and Automation, 2007 IEEE International Conference on. IEEE, 2007.