Jane Li
Assistant Professor
Mechanical Engineering Department, Robotic Engineering Program
Worcester Polytechnic Institute

(2 pts) How to avoid obstacles when reproducing a trajectory using a learned DMP?
(2 pts) How to synchronize the motions reproduced using DMP across multiple DOFs?
(3 pts) Describe the phase matching problem in human-robot interaction learning.
(3 pts) Compared to DTW and GMM/GMR, what are the advantages of ProMP?
RBE 595 Synergy of Human and Robotic Systems Instructor: Jane Li, Mechanical Engineering Department & Robotic Engineering Program - WPI 11/28/2017

Observe human partner's motion
  Sparse
Predict end user's motion
  Prediction may vary by fitting sparse data to variants of a model that differ by temporal scaling
Generate motions that match
  Wrong prediction leads to mismatch between human and robot motions
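
The fitting step above can be sketched as a search over temporal scalings. This is a minimal illustration, not the method from any specific paper: the reference motion `reference`, the candidate scales, and the function name `estimate_time_scale` are all assumptions made for the example.

```python
import math

def reference(phase):
    """Hypothetical reference motion, defined on phase in [0, 1]."""
    return math.sin(math.pi * phase)

def estimate_time_scale(observations, candidate_scales):
    """Pick the scale alpha whose model variant reference(alpha * t) best fits
    the sparse (t, y) observations in the squared-error sense."""
    def error(alpha):
        return sum((y - reference(min(alpha * t, 1.0))) ** 2
                   for t, y in observations)
    return min(candidate_scales, key=error)

# A partner moving at half speed (phase = 0.5 * t), observed at three instants.
true_scale = 0.5
obs = [(t, reference(true_scale * t)) for t in (0.2, 0.4, 0.6)]
scales = [0.25, 0.5, 1.0, 2.0]  # assumed candidate temporal scalings
best = estimate_time_scale(obs, scales)
```

With fewer or noisier samples, several scales can fit similarly well, which is why a wrong phase estimate can cause the robot's motion to mismatch the human's.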

Dynamic time warping (DTW)
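
A minimal sketch of the classic DTW dynamic program for two 1-D sequences, assuming an absolute-difference local cost. The function name `dtw` and the example sequences are illustrative only.

```python
def dtw(a, b):
    """Return (total cost, warping path) aligning 1-D sequences a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    # Backtrack from the end to recover the alignment.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path

fast = [0, 1, 2, 1, 0]
slow = [0, 0, 1, 1, 2, 2, 1, 1, 0, 0]  # same motion at half speed
total, path = dtw(fast, slow)
```

The alignment is pairwise and deterministic: one sequence has to be chosen as the reference, and the result says nothing about the average or the variability of a set of demonstrations.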

Represent averaged behavior? No. Pairwise matching
How to represent variability? No. Pairwise matching
How to align trajectories? Yes. Align fast and slow trajectories (deterministic; need to choose a unique reference)
How to match phase in the learning of human-robot interaction? Yes.

Represent averaged behavior? Yes.
How to represent variability? Yes.
How to align trajectories? No. Cannot align fast or slow motions
How to match phase in the learning of human-robot interaction? No.

Represent averaged behavior? Yes.
How to represent variability? Yes.
How to align trajectories? Yes. Align motions by phase matching
How to match phase in the learning of human-robot interaction? Yes.

Heramb Nemlekar (selected for presentation)
Gunner Hover
Aishwary Jagetia
Sanjuksha Nirgude (selected for presentation)

Heramb Nemlekar
Tess Meier (selected for presentation)
Sihui Li
Aishwary Jagetia (selected for presentation)

Student presentation
Spotlight talk
  Each student gives a 5-min talk
  Interactive Q&A session after all the presentations
Publish good examples of presentations and reviews on Canvas
Grade boosting for presentation
  You can choose a low-grade assignment/quiz to replace with a full score
  Let our TA know which one you prefer to replace by Wednesday this week

How actions derived from low-level learning can be used to learn higher level tasks

What can be learned at high level?
  State-action mapping function, i.e., policy
  Task plan, objectives, features
  Reference frame, affordance
How to learn?
  Supervised/unsupervised learning
  Reinforcement learning

Action primitives = movement primitive (MP) or sequence of MPs
  E.g., Reach-To, Pick-Up, Move-Forward, etc.
  Can be hand-coded, developed using a planner, or learned from demos
  Often parameterized

Explicit task goals
  Pre- and post-conditions of action primitives
  A particular configuration of an object
Implicit task goals = reward function
  Sparse/dense reward
  Learn the reward function? Inverse optimal control (IOC), inverse reinforcement learning (IRL)
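
To make the sparse/dense distinction concrete, here is a small sketch for a 1-D reaching task. The task, the tolerance value, and both function names are assumptions chosen for illustration.

```python
def sparse_reward(position, goal, tolerance=0.05):
    """Reward only on success: 1 inside the goal tolerance, 0 elsewhere."""
    return 1.0 if abs(position - goal) <= tolerance else 0.0

def dense_reward(position, goal):
    """Reward everywhere: the closer to the goal, the higher (less negative)."""
    return -abs(position - goal)

far, near, goal = 0.5, 0.9, 1.0
# The sparse reward cannot tell 'far' from 'near'; the dense reward can.
same_sparse = sparse_reward(far, goal) == sparse_reward(near, goal)
dense_prefers_near = dense_reward(near, goal) > dense_reward(far, goal)
```

A sparse reward states the goal without shaping the behavior, while a dense reward gives feedback at every step but encodes more assumptions about how the task should be done.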

Input: state; output: action
Demonstration = state-action pairs
Learning policy
Objectives underlying the policy? Don't care

Training process
  Learn the state transition function as a hierarchical classifier over features
  [Diagram: Current State, Features -> State Transition Function -> Next State]

Goal is out of view

Lazy learning algorithms
  Memorize the demonstrated action in each robot state
  In a new state, search for similar old states and apply the corresponding action
  Used for learning navigation from demonstrations
Method for measuring similarity?
  KNN [4]
  Case-based reasoning [5,6]
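
A minimal sketch of the lazy-learning idea using k-nearest neighbors with Euclidean distance as the similarity measure. The demonstration data, the action names, and the function `knn_policy` are hypothetical and not taken from [4].

```python
import math
from collections import Counter

def knn_policy(demos, state, k=3):
    """Return the majority action among the k memorized states closest to `state`.

    demos: list of (state_tuple, action) pairs recorded from demonstrations.
    """
    nearest = sorted(demos, key=lambda d: math.dist(d[0], state))[:k]
    votes = Counter(action for _, action in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical navigation demos: state = (obstacle ahead, free space ahead).
demos = [
    ((0.0, 1.0), "forward"), ((0.1, 0.9), "forward"), ((0.2, 1.1), "forward"),
    ((1.0, 0.0), "turn_left"), ((0.9, 0.1), "turn_left"), ((1.1, 0.2), "turn_left"),
]
action_open = knn_policy(demos, (0.05, 1.0))
action_blocked = knn_policy(demos, (1.0, 0.1))
```

Nothing is trained ahead of time: all the work happens at query time, so the cost of each decision grows with the number of memorized demonstrations.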

Observe and memorize a sequence of states (sub-goals)
Pick the action that maximizes the chance of taking the agent from the current state to the memorized next state
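
The action choice above can be sketched with a discrete transition model. The model `transition_prob`, its states and actions, and the function names are assumptions for illustration; in practice the model would itself have to be learned or given.

```python
def choose_action(transition_prob, state, target_next_state):
    """Pick the action with the highest probability of reaching the target.

    transition_prob maps (state, action) -> {next_state: probability}.
    """
    candidates = {a: dist.get(target_next_state, 0.0)
                  for (s, a), dist in transition_prob.items() if s == state}
    return max(candidates, key=candidates.get)

def follow_subgoals(transition_prob, subgoals):
    """Choose one action per consecutive pair of memorized sub-goal states."""
    return [choose_action(transition_prob, s, s_next)
            for s, s_next in zip(subgoals, subgoals[1:])]

# Hypothetical transition model for a pick-up task.
transition_prob = {
    ("start", "push"): {"near": 0.8, "start": 0.2},
    ("start", "wait"): {"start": 1.0},
    ("near", "grasp"): {"holding": 0.9, "near": 0.1},
    ("near", "push"): {"near": 0.7, "start": 0.3},
}
actions = follow_subgoals(transition_prob, ["start", "near", "holding"])
```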

Essentially, a classification/regression problem
Estimate classification confidence
  Integrate a measure of confidence in classification/regression
  Address the uncertainty in action
Methods for estimating classification confidence
  Bayesian methods [8]
  Confidence-Based Autonomy algorithm [9]
  Locally Weighted Projection Regression [10]
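
One simple way to attach a confidence to a classified action is the vote fraction among nearest neighbors, with a threshold below which the robot asks for a demonstration instead of acting. This is only a simplified illustration in the spirit of confidence-based approaches, not the published Confidence-Based Autonomy algorithm [9]; the threshold, data, and function names are assumptions.

```python
import math
from collections import Counter

def classify_with_confidence(demos, state, k=3):
    """Return (action, confidence), where confidence is the share of the
    k nearest demonstrated states that voted for the winning action."""
    nearest = sorted(demos, key=lambda d: math.dist(d[0], state))[:k]
    votes = Counter(action for _, action in nearest)
    action, count = votes.most_common(1)[0]
    return action, count / len(nearest)

def act_or_ask(demos, state, threshold=0.8, k=3):
    """Execute the action if confident enough, otherwise request a demo."""
    action, confidence = classify_with_confidence(demos, state, k)
    if confidence >= threshold:
        return ("execute", action)
    return ("request_demonstration", None)

# Hypothetical demonstrations, as (state, action) pairs.
demos = [
    ((0.0, 1.0), "forward"), ((0.1, 0.9), "forward"), ((0.2, 1.1), "forward"),
    ((1.0, 0.0), "turn_left"), ((0.9, 0.1), "turn_left"), ((1.1, 0.2), "turn_left"),
]
familiar = act_or_ask(demos, (0.05, 1.0))   # all neighbors agree
ambiguous = act_or_ask(demos, (0.5, 0.45))  # neighbors disagree
```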

Represent desired robot behavior as a plan
Generalized state-action mapping
  State: pre-condition, post-condition
  Action: sequence of actions between initial and goal states
Underlying objectives = goal state
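
A plan over action primitives with pre- and post-conditions can be sketched as a breadth-first search over sets of facts. The primitives, the fact names, and the `plan` function below are hypothetical and only illustrate the representation.

```python
from collections import deque

# Hypothetical primitives: name -> (preconditions, facts added, facts removed).
PRIMITIVES = {
    "reach_to": ({"hand_empty"}, {"at_object"}, set()),
    "pick_up": ({"hand_empty", "at_object"}, {"holding"}, {"hand_empty"}),
    "move_forward": ({"holding"}, {"at_goal"}, {"at_object"}),
}

def plan(initial, goal):
    """Return the shortest action sequence from `initial` to a state that
    contains every fact in `goal`, or None if no such sequence exists."""
    start = frozenset(initial)
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, actions = queue.popleft()
        if goal <= state:                      # goal state reached
            return actions
        for name, (pre, add, remove) in PRIMITIVES.items():
            if pre <= state:                   # pre-condition holds
                nxt = frozenset((state - remove) | add)  # apply post-condition
                if nxt not in visited:
                    visited.add(nxt)
                    queue.append((nxt, actions + [name]))
    return None

steps = plan({"hand_empty"}, {"holding", "at_goal"})
```

Here the objective is carried entirely by the goal state, while the action sequence is derived rather than demonstrated step by step.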

Bayesian models
Finite-state automaton

More sparse states
  Initial states
  Goal states
What if something goes wrong in-between?
  Provide additional demonstrations as corrections
  Iterative and incremental learning

Re-do parts of the demo
Re-segment old data
Add new corrections and rebuild the FSA
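
The rebuild step can be sketched as regenerating the automaton from the old demonstrations plus the new corrections. The encoding of a demonstration as (state, action) steps, the task states, and the function `build_fsa` are assumptions; segmentation of raw data into states is taken as already done.

```python
from collections import defaultdict

def build_fsa(demonstrations):
    """Build a transition table {state: {action: next_state}}.

    Each demonstration is a list of (state, action) steps that ends
    with (final_state, None).
    """
    fsa = defaultdict(dict)
    for demo in demonstrations:
        for (state, action), (next_state, _) in zip(demo, demo[1:]):
            fsa[state][action] = next_state
    return dict(fsa)

# Original demonstration of a pick-and-place task (hypothetical states).
demos = [[("at_table", "grasp"), ("holding", "place"), ("done", None)]]
fsa_before = build_fsa(demos)

# The object was dropped during execution: demonstrate a correction from
# the failure state, then rebuild the automaton from all the data.
correction = [("dropped", "regrasp"), ("holding", "place"), ("done", None)]
fsa_after = build_fsa(demos + [correction])
```

Rebuilding from the full data set keeps the original behavior and adds a recovery path from the new state, which matches the iterative and incremental style of learning described above.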

Read Section 5.5 Learning task features
Prepare 7-10 presentation slides
  Digest over multiple papers
To reflect your understanding
  Add notes to your presentation slides, or
  Submit a 2-page review

[1] Sullivan, Keith, and Sean Luke. "Hierarchical multi-robot learning from demonstration." Proceedings of the Robotics: Science and Systems Conference. 2011.
[2] Sullivan, Keith, Sean Luke, and Vittoria Amos Ziparo. "Hierarchical learning from demonstration on humanoid robots." Proceedings of Humanoid Robots Learning from Human Interaction Workshop. Vol. 38. 2010.

[4] Saunders, J., Nehaniv, C. L., & Dautenhahn, K. (2006, March). Teaching robots by moulding behavior and scaffolding the environment. In Proceedings of the 1st ACM SIGCHI/SIGART conference on Human-robot interaction (pp. 118-125).
[5] Likhachev, M., & Arkin, R. C. (2001). Spatio-temporal case-based reasoning for behavioral selection. In Robotics and Automation, 2001. Proceedings 2001 ICRA. IEEE International Conference on (Vol. 2, pp. 1627-1634).
[6] Ros, R., De Màntaras, R. L., Arcos, J. L., & Veloso, M. (2007, August). Team playing behavior in robot soccer: A case-based reasoning approach. In ICCBR (Vol. 2007, pp. 46-60).

[7] Rao, Rajesh PN, Aaron P. Shon, and Andrew N. Meltzoff. "11 A Bayesian model of imitation in infants and robots." (2007).
[8] Lockerd, A., & Breazeal, C. (2004, September). Tutelage and socially guided robot learning. In Intelligent Robots and Systems, 2004 (IROS 2004). Proceedings. 2004 IEEE/RSJ International Conference on (Vol. 4, pp. 3475-3480).
[9] Chernova, Sonia, and Manuela Veloso. "Interactive policy learning through confidence-based autonomy." Journal of Artificial Intelligence Research 34.1 (2009): 1
[10] Grollman, Daniel H., and Odest Chadwicke Jenkins. "Dogged learning for robots." Robotics and Automation, 2007 IEEE International Conference on. IEEE, 2007.