A neuronal structure for learning by imitation

Sorin Moga and Philippe Gaussier
ETIS / CNRS 2235, Groupe Neurocybernetique, ENSEA, 6, avenue du Ponceau, F-95014, Cergy-Pontoise cedex, France
{moga, gaussier}@ensea.fr
http://www-etis.ensea.fr

Abstract. In this paper¹, we present a neural architecture that lets a mobile robot learn to imitate a sequence of actions. We show that a continuous, dynamic representation of the information is necessary, and that neural fields can be a good solution for controlling the dynamics of several degrees of freedom with a single internal representation.

1 Introduction

Until now, our work has mainly focused on the design of a neural network architecture (named PerAc: Perception-Action) for the control of a visually guided autonomous robot. However, the PerAc architecture does not help to solve problems that have an intrinsically high dimension. Imitation of already learned behaviors, or of subparts of a behavior that has not been completely discovered, is therefore certainly one way to allow a population of animals or robots to learn and to find solutions by themselves. Learning by imitation is already used in a few Artificial Intelligence projects (see [2, 3, 5]). In our previous work [6], we proposed a neural architecture for imitation based on visual information and showed how to use it to teach the robot a particular sequence of movements (a zigzag trajectory, a square...). In this paper we try to bring together two ideas: how a PerAc architecture can be used for learning by imitation, and how the properties of neural fields can be used to improve motor control.

2 Neural network for sequence imitation

For the imitation behavior, we start with the assumption that proto-imitation (unintentional imitation) is triggered by a perception error (see [6] for details). Fig. 1 presents an overview of a general PerAc architecture using this principle.
The reflex path of PerAc works as a movement-tracking mechanism that consists in going towards any perceived movement. The second level of the architecture learns the temporal interval between successive robot orientations (i.e. a sequence of movements) and associates it with a particular motivation.

¹ In D. Floreano, J.-D. Nicoud, and F. Mondada, editors, Lecture Notes in Artificial Intelligence - European Conference on Artificial Life ECAL99, pages 314-318, Lausanne, September 1999.

Fig. 1. A general diagram of the PerAc architecture used for learning the temporal aspects of a trajectory. CCD - CCD camera, M - Motivations, MI - Movement Input, MO - Motor Output, TD - Time Derivator, TB - time battery, PO - Prediction Output. The diagram distinguishes the movement perception and event prediction levels, the head rotation and body rotation outputs, and one-to-one links from one-to-all links (Hebbian learning).

A frame-grabber is used to take a sequence of images. In one of our simplest implementations, a "movement image" is the difference between two differently time-integrated images of this sequence. The perceived movement orientation is computed from the "movement image". The result is "projected" one-to-one onto a map of analog formal neurons, the Movement Input (MI) group in Fig. 1. To avoid perception errors in the tracking mechanism, we allow the robot camera (the robot head) to rotate. In this way, the head tries to pursue the teacher at all times by centering it in its visual field. The robot body turns only if the teacher movement is observed under the same angle for a given time interval. The independent rotation of the robot head and body can be viewed as a simple system with two degrees of freedom.

The functioning of the motor group (MO) is quite simple. At each step, a winner-take-all (WTA) mechanism chooses the most activated neuron, performs the rotation corresponding to this neuron and finishes with a fixed translation. The MO group uses the same information representation as the MI group. It receives information from both the reflex level and the event prediction level. To learn a sequence, the student robot detects and learns the transitions in its own body orientation so that it can reproduce them.
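The reflex path described above can be sketched in a few lines. This is only an illustration under stated assumptions: images are reduced to a single row of pixels, the time integration is a leaky average, and the mapping from pixel column to orientation neuron (and from neuron to angle) is a hypothetical linear one. None of the function names or constants come from the paper.

```python
def integrate(frames, decay):
    """Leaky temporal integration of a list of equally sized 1-D frames."""
    acc = [0.0] * len(frames[0])
    for frame in frames:
        acc = [decay * a + (1.0 - decay) * p for a, p in zip(acc, frame)]
    return acc


def movement_image(frames, fast=0.2, slow=0.8):
    """Absolute difference between two differently time-integrated images."""
    short_term = integrate(frames, fast)   # forgets quickly
    long_term = integrate(frames, slow)    # forgets slowly
    return [abs(s - l) for s, l in zip(short_term, long_term)]


def project_to_mi(movement, n_neurons):
    """Pool pixel columns onto n_neurons analog units (assumed linear mapping)."""
    mi = [0.0] * n_neurons
    for col, value in enumerate(movement):
        mi[col * n_neurons // len(movement)] += value
    return mi


def wta(activities):
    """Winner-take-all: index of the most activated neuron."""
    return max(range(len(activities)), key=lambda i: activities[i])


# A bright spot that jumps from column 2 to column 9 in a 12-pixel row.
frames = [[0.0] * 12 for _ in range(6)]
for t, frame in enumerate(frames):
    frame[2 if t < 4 else 9] = 1.0

mi = project_to_mi(movement_image(frames), n_neurons=6)
winner = wta(mi)
rotation_deg = winner * 360.0 / len(mi)   # assumed neuron-to-angle mapping
```

In this toy run the most recent position of the spot dominates the movement image, so the WTA selects the neuron covering that column. A real implementation would follow the chosen rotation with the fixed translation mentioned in the text.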
The movement rotations, characterized by OFF-ON transitions of MO neurons (Time Derivative, TD, group), are used as input to a bank of spectral neurons (TB in Fig. 1). Time filter batteries (TB) act as delay neurons endowed with different time constants. As such, they perform a spectral decomposition of the signal that allows the neurons in the Prediction Output (PO) group to store the transition patterns between two events in the sequence. Finally, the PO group is linked to the MO group via one-to-all modifiable links.
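A rough sketch of how a bank of delay units can encode the interval between two transitions is given below. It rests on several assumptions that are not in the paper: each time-battery unit is given a Gaussian response peaking at its own delay, the PO weights are stored in one shot as the normalised TB pattern present at the next transition, and the PO activity is a normalised dot product. All names and constants are hypothetical.

```python
import math


def time_battery(elapsed, centers, width_ratio=0.3):
    """Responses of a bank of delay units, `elapsed` seconds after an OFF-ON event.

    Each unit peaks at its own delay (assumed Gaussian profile whose width
    grows with the delay).
    """
    return [math.exp(-((elapsed - c) ** 2) / (2.0 * (width_ratio * c) ** 2))
            for c in centers]


def learn_interval(interval, centers):
    """One-shot Hebbian-style storage: the weights copy the normalised TB
    pattern present when the next transition occurs."""
    pattern = time_battery(interval, centers)
    norm = math.sqrt(sum(p * p for p in pattern))
    return [p / norm for p in pattern]


def prediction(elapsed, weights, centers):
    """PO activity: normalised match between the current and stored TB patterns."""
    pattern = time_battery(elapsed, centers)
    norm = math.sqrt(sum(p * p for p in pattern))
    return sum(w * p for w, p in zip(weights, pattern)) / norm


centers = [0.5 * k for k in range(1, 13)]   # delays from 0.5 s to 6 s
weights = learn_interval(2.0, centers)      # next turn observed 2 s after the last one

times = [k / 10.0 for k in range(1, 61)]
activity = [prediction(t, weights, centers) for t in times]
predicted = times[max(range(len(times)), key=lambda k: activity[k])]
```

The PO activity is highest when the elapsed time reproduces the stored pattern, which is how the learned interval can later trigger the next movement through the PO to MO links.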

3 Neural dynamics of the motor system

The first limitation of our architecture is the poor stability of the tracking behavior. Even if the temporal integration provides a memory effect, any new input stimulus can generate an immediate change of the head orientation (a classical WTA decision). A second major limitation is input discrimination. Two or more movement zones can be interpreted as different targets or as the same target because of perception errors. In the present system, no interpretation of the perceived movement is performed, in order to avoid a misinterpretation. The motor group has to be a topological map of neurons using a dynamical integration of the input information to avoid forgetting the previously tracked target. A dynamical competition also has to be used to avoid intermittent switching from one target to another. We use the simplified formulation of the neural field proposed and studied by Amari [1]:

$$\tau \frac{d f(x,t)}{dt} = -f(x,t) + I(x,t) + h + \int_{z \in V_x} w(z)\, g\big(f(x-z,t)\big)\, dz \qquad (1)$$

Without inputs, the homogeneous pattern of the neural field, $f(x,t) = h$, is stable. The inputs of the system, $I(x,t)$, represent the stimuli that excite the different regions of the neural field, and $\tau$ is the relaxation rate of the system. $w(z)$ is the interaction kernel in the neural field activation. These lateral interactions ("excitatory" and "inhibitory") are modeled by a DOG (difference of Gaussians) function. $V_x$ is the lateral interaction interval. $g(f(x,t))$ is the activity of neuron $x$ according to its potential $f(x,t)$. We use a classic ramp function.

G. Schöner [7, 4] has proposed using the properties of the neural field for motor control problems. The "read-out" mechanism uses the derivative of the neural field activation to compute the motor command. The orientation of the robot head, $\phi_{rob}$, relative to a fixed reference is used in the system as a behavioral variable. The state of the system is expressed as a value of this variable.
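Eq. (1) can be explored numerically with a simple explicit Euler scheme. The sketch below is a minimal illustration, not the authors' implementation: the ring of 72 units, the kernel amplitudes and widths, the value of h, the time step and the Gaussian stimulus are all assumed values chosen only to make the behavior visible.

```python
import math

N = 72                 # units on a ring, 5 degrees each (assumed resolution)
TAU, DT = 10.0, 1.0    # relaxation rate and Euler step (assumed values)
H = -0.5               # resting level h (assumed value)
V = 12                 # half-width of the lateral interaction interval V_x


def dog(z):
    """Difference-of-Gaussians interaction kernel w(z) (assumed parameters)."""
    return 0.30 * math.exp(-z * z / 8.0) - 0.10 * math.exp(-z * z / 72.0)


KERNEL = [(z, dog(z)) for z in range(-V, V + 1)]


def ramp(u):
    """Ramp activation g: 0 below 0, linear up to 1, saturated above."""
    return min(1.0, max(0.0, u))


def gaussian_input(center, amplitude=1.0, sigma=3.0):
    """Stimulus I(x) centred on a unit index, using wrap-around distance."""
    out = []
    for x in range(N):
        d = min(abs(x - center), N - abs(x - center))
        out.append(amplitude * math.exp(-d * d / (2.0 * sigma * sigma)))
    return out


def step(f, stimulus):
    """One explicit Euler step of Eq. (1) on the ring."""
    g = [ramp(u) for u in f]
    new_f = []
    for x in range(N):
        lateral = sum(w * g[(x - z) % N] for z, w in KERNEL)
        new_f.append(f[x] + DT / TAU * (-f[x] + stimulus[x] + H + lateral))
    return new_f


def relax(stimulus, steps=200):
    """Start from the homogeneous state f = h and integrate for `steps` steps."""
    f = [H] * N
    for _ in range(steps):
        f = step(f, stimulus)
    return f


rest = relax([0.0] * N)                    # no input: the field stays at h
field = relax(gaussian_input(center=20))   # one stimulus: one peak
peak = max(range(N), key=lambda x: field[x])
```

With no input the field remains at the homogeneous value h, and a single localized stimulus produces a single peak centred on it, which is the attractor used by the read-out.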
The local maxima of the neural field are named attractors. If the target orientation is $\phi_{target}$ (see Fig. 2, a), it erects an attractor in the neural field (see Fig. 2, b) and the robot rotation speed will be $\omega = \dot{\phi} = F(\phi_{rob})$. $\dot{\phi}$ is a function of the current robot orientation, $\phi_{rob}$. It sets the dynamics of our robot. Taken separately, each input erects an attractor in the neural field. Amari's equation allows cooperation between coherent inputs and competition between inputs associated with different goals (spatially separated targets). For closely spaced inputs, the dynamics has a single attractor corresponding to the average of the inputs. At a critical distance between inputs, a bifurcation point appears: the previous attractor becomes a repellor and two new attractors emerge. Depending on the initial state, the robot switches to one of the two new fixed points. This mechanism of input competition / cooperation has a hysteresis property that avoids oscillations between the two possible behaviors.

Fig. 2. a) The robot and the target coordinates are represented in the same reference frame. The reference orientation is used to compute $\phi_{rob}$ and $\phi_{target}$. b) The target position erects an attractor at $\phi_{target}$. The "read-out" mechanism computes the rotation speed $\omega$ using the derivative of the neural field activation.

Another feature of the neural field is memory. If the parameter h in Eq. (1) has a sufficiently negative value, the neural field operates with a memory effect in which the peak of an attractor is maintained for a short time interval. A large positive value of h drives the neural field activation above threshold. We use the inputs of the current system, without any modification, to drive the motor command through a neural field. Replacing the MO group by a neural field is the sole modification in the architecture (see Fig. 1). All the above properties of the neural field carry over to the general architecture, eliminating the input segmentation and stability problems of the initial architecture.

4 Experimental results and discussion

We first implemented the tracking reflex using only one degree of freedom, i.e. the robot moves only its head. To demonstrate the capability of the neural field to control several degrees of freedom, we take a simple example. The robot follows a "teacher" and learns a sequence of movements ABC. The sequence starts with the activation of the neuron corresponding to state (orientation) A. The input to the neural field generates an attractor at the A orientation (see Fig. 3). At the moment predicted by the PO group, the B neuron is activated. This activation shifts the attractor to B in the neural field. Using the "read-out" mechanism, we obtain two rates of orientation change (due to differences in inertia): one for the head orientation and another for the robot body orientation. The top of Fig. 3 shows the variation of the head and body orientations as a function of time. According to the neural field dynamics, the change of orientation is continuous. For an external observer, the head orientation anticipates the body orientation (i.e. the inertia of the robot is learned too).
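The way a single attractor can drive two degrees of freedom at different rates can be illustrated with a toy read-out. The sketch makes strong simplifying assumptions: the field activation is replaced by a fixed Gaussian bump centred on the target orientation instead of a simulated neural field, and the head and body differ only by a hypothetical gain standing in for their inertia. The orientations chosen for A and B are arbitrary.

```python
import math

SIGMA = 40.0   # assumed width of the activation bump, in degrees


def field_slope(phi, phi_target):
    """Spatial derivative, at phi, of a Gaussian bump centred on phi_target."""
    e = phi_target - phi
    return e / SIGMA ** 2 * math.exp(-e * e / (2.0 * SIGMA ** 2))


def follow(phi_start, phi_target, gain, steps=200):
    """Integrate d(phi)/dt = gain * slope with unit Euler steps."""
    phi, trace = phi_start, [phi_start]
    for _ in range(steps):
        phi += gain * field_slope(phi, phi_target)
        trace.append(phi)
    return trace


A, B = 90.0, 150.0                 # hypothetical orientations of states A and B
head = follow(A, B, gain=400.0)    # low inertia: larger gain
body = follow(A, B, gain=100.0)    # high inertia: smaller gain
```

Both traces move continuously from A to B once the attractor has shifted, and the head trace stays ahead of the body trace throughout, which matches the qualitative description of the head anticipating the body.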

Fig. 3. Top: the temporal variation of the head and of the body orientation. Bottom: the neural field activation for an ABC sequence. The bar represents the predicted movement.

This work is at its beginning. Its interest lies in the use of the neural field concept in a PerAc architecture. We show that a temporal sequence of movements can be learned by imitation using a PerAc architecture. The tracking mechanism in the reflex path of PerAc permits the temporal "segmentation" of the "teacher" movements without having to learn to recognize what the teacher is doing. The use of the neural field improves the stability of the proto-imitation process and permits the discrimination of moving objects in the visual perception field.

References

1. S. Amari. Dynamics of pattern formation in lateral-inhibition type neural fields. Biological Cybernetics, 27:77-87, 1977.
2. P. Bakker and Y. Kuniyoshi. Robot see, robot do: An overview of robot imitation. In AISB Workshop on Learning in Robots and Animals, Brighton, UK, 1996.
3. L. Berthouze and Y. Kuniyoshi. Emergence and categorization of coordinated visual behavior through embodied interaction. Machine Learning, 31(1/2/3):187-200, 1998.
4. E. Bicho and G. Schöner. The dynamic approach to autonomous robotics demonstrated on a low-level vehicle platform. Robotics and Autonomous Systems, 21:23-35, 1997.
5. John Demiris and Gillian Hayes. Imitative learning mechanisms in robots and humans. In Proceedings of the 5th European Workshop on Learning Robots, Bari, Italy, July 1996.
6. P. Gaussier, S. Moga, M. Quoy, and J.P. Banquet. From perception-action loops to imitation processes: a bottom-up approach of learning by imitation. Applied Artificial Intelligence, 12(7-8):701-727, Oct-Dec 1998.
7. G. Schöner, M. Dose, and C. Engels. Dynamics of behavior: theory and applications for autonomous robot architectures. Robotics and Autonomous Systems, 16(2-4):213-245, December 1995.