Swarm Intelligence W7: Application of Machine-Learning Techniques to Automatic Control Design and Optimization

Outline
- Learning to avoid obstacles
  - Problem encoding using GA and ANN
  - Floreano and Mondada experiment
- Dealing with noise and comparison with PSO
  - Noise-resistant GA
  - Noise-resistant PSO
  - Pugh et al. systematic study
- Moving beyond obstacle avoidance
  - Learning of more complex behaviors
  - HW & SW co-design
  - Specific learning issues in collective systems

Learning to Avoid Obstacles by Shaping a Neural Network Controller using Genetic Algorithms

Evolving a Neural Controller
Neuron N_i with sigmoid transfer function f(x); w_ij = synaptic weight, I_j = input, O_i = output:
O_i = f(x_i),  x_i = Σ_j w_ij I_j + I_0 (j = 1..m),  f(x) = 2 / (1 + e^(-x)) - 1
[Figure: eight proximity sensors S1-S8 feeding two motor neurons M1 and M2, with inhibitory and excitatory connections]
Note: In our case we evolve the synaptic weights, but Hebbian rules for dynamic change of the weights, transfer function parameters, etc. can also be evolved (see Floreano course).
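A minimal sketch (Python; not from the slides) of one control loop of such a controller, simplified to a purely feed-forward mapping from the eight proximity sensors to the two motor neurons, without the lateral and recurrent connections mentioned later in the lecture; all names and array shapes are illustrative assumptions.

```python
import numpy as np

def transfer(x):
    # Sigmoid transfer function f(x) = 2 / (1 + e^(-x)) - 1, output in (-1, 1)
    return 2.0 / (1.0 + np.exp(-x)) - 1.0

def controller_step(sensors, weights, bias):
    """One control loop: map 8 proximity readings to 2 motor outputs.

    sensors: shape (8,), normalized proximity readings I_j
    weights: shape (2, 8), one row of synaptic weights w_ij per motor neuron
    bias:    shape (2,), the constant input I_0 of each motor neuron
    """
    x = weights @ sensors + bias   # x_i = sum_j w_ij * I_j + I_0
    return transfer(x)             # O_i = f(x_i), used as wheel-speed commands

# A candidate solution is simply the flattened weight/bias vector (the genome).
genome = np.random.uniform(-1.0, 1.0, size=2 * 8 + 2)
motors = controller_step(np.random.rand(8), genome[:16].reshape(2, 8), genome[16:])
```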

Evolving Obstacle Avoidance (Floreano and Mondada 1996)
Defining performance (fitness function):
Φ = V (1 - √Δv)(1 - i)
V = mean speed of the wheels, 0 ≤ V ≤ 1
Δv = absolute algebraic difference between the wheel speeds, 0 ≤ Δv ≤ 1
i = activation value of the sensor with the highest activity, 0 ≤ i ≤ 1
Note: Fitness is accumulated during the evaluation span and normalized over the number of control loops (actions).
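A sketch (Python; hypothetical helper, assuming wheel speeds and proximity activations are already normalized to [0, 1]) of how this fitness could be accumulated over the evaluation span.

```python
def obstacle_avoidance_fitness(left_speeds, right_speeds, max_proximities):
    """Accumulate Phi = V * (1 - sqrt(dv)) * (1 - i) over all control loops.

    left_speeds, right_speeds: per-loop normalized wheel speeds in [0, 1]
    max_proximities: per-loop activation of the most active proximity sensor, in [0, 1]
    """
    total = 0.0
    for sl, sr, i in zip(left_speeds, right_speeds, max_proximities):
        V = (sl + sr) / 2.0   # mean wheel speed: rewards motion
        dv = abs(sl - sr)     # wheel-speed difference: penalizes spinning in place
        total += V * (1.0 - dv ** 0.5) * (1.0 - i)   # i: penalizes driving close to obstacles
    return total / len(left_speeds)  # normalize over the number of control loops
```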

Evolving Robot Controllers
Note: the controller architecture can be of any type, but it is worth using GA/PSO only when the number of parameters to be tuned is large.

Evolving Obstacle Avoidance
[Figures: evolved path; fitness evolution]

Evolved Obstacle Avoidance Behavior
Generation 100; on-line, off-board (PC-hosted) evolution.
Note: the direction of motion is NOT encoded in the fitness function: the GA automatically discovers the asymmetry in the sensory system configuration (6 proximity sensors in the front and 2 in the back).

Noise-Resistant GA and PSO for Design and Optimization of Obstacle Avoidance

Noisy Optimization
- Multiple evaluations at the same point in the search space yield different results
- Depending on the optimization problem, the evaluation of a candidate solution can be more or less expensive in terms of time
- Noise causes decreased convergence speed and residual error
- Noisy optimization has been explored little in evolutionary algorithms, and very little in PSO

Key Ideas
- Better information about a candidate solution can be obtained by combining multiple noisy evaluations
- We could systematically evaluate each candidate solution a fixed number of times, but this is not smart from a computational point of view, in particular for long evaluation spans
- We want to dedicate more computational power/time to evaluating promising solutions and to eliminate the "lucky" ones as quickly as possible; as a consequence, candidate solutions might have been evaluated a different number of times when they are compared
- In GA, good and robust candidate solutions survive over generations; in PSO, they survive in the individual memory
- Use aggregation functions to combine the multiple evaluations, e.g., minimum or average (a sketch follows below)
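A sketch (Python; function and class names are illustrative, not from the original material) of this re-evaluation idea: candidates that survive in the population (GA) or in the personal memory (PSO) accumulate additional noisy evaluations, and their stored fitness is an aggregate of all samples seen so far.

```python
import random

def noisy_eval(candidate):
    # Stand-in for one expensive, noisy robot evaluation of a candidate solution.
    true_quality = -sum(x * x for x in candidate)
    return true_quality + random.gauss(0.0, 0.5)

class EvaluationRecord:
    """Keeps every noisy evaluation of a candidate and exposes an aggregated fitness."""

    def __init__(self, candidate):
        self.candidate = candidate
        self.samples = [noisy_eval(candidate)]

    def reevaluate(self):
        # Surviving (promising) solutions gather more samples over generations/iterations,
        # so one-off "lucky" results are filtered out quickly.
        self.samples.append(noisy_eval(self.candidate))

    @property
    def fitness(self):
        # Average aggregation; min(self.samples) would be a more conservative choice.
        return sum(self.samples) / len(self.samples)

# Usage: at each generation/iteration, re-evaluate the retained candidates
# before comparing them with newly generated ones.
record = EvaluationRecord([0.1, -0.2, 0.3])
record.reevaluate()
print(record.fitness)
```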

[Figure: GA and PSO algorithm structures]

Example: Gaussian Additive Noise on the Generalized Rosenbrock Function
Fair test: same number of evaluations of candidate solutions for all algorithms (i.e., n generations/iterations of the standard versions compared with n/2 of the noise-resistant ones).
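For reference, a sketch of such a benchmark (assumptions: the standard generalized Rosenbrock function with zero-mean Gaussian noise added to each evaluation; the noise standard deviation used here is an arbitrary illustrative value).

```python
import random

def rosenbrock(x):
    # Generalized Rosenbrock: sum over k of 100*(x[k+1] - x[k]^2)^2 + (1 - x[k])^2
    return sum(100.0 * (x[k + 1] - x[k] ** 2) ** 2 + (1.0 - x[k]) ** 2
               for k in range(len(x) - 1))

def noisy_rosenbrock(x, sigma=1.0):
    # Gaussian additive noise: two evaluations of the same point return different values.
    return rosenbrock(x) + random.gauss(0.0, sigma)
```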

A Systematic Study on Obstacle Avoidance: 3 Different Scenarios
- Scenario 1: one robot learning obstacle avoidance
- Scenario 2: one robot learning obstacle avoidance, one robot running a pre-evolved obstacle avoidance controller
- Scenario 3: two robots co-learning obstacle avoidance
[Figure: PSO, 50 iterations, scenario 3]
Idea: more robots means more noise (as perceived by an individual robot); there is no standard communication between the robots, but in scenario 3 information is shared through the population manager!

Results: Best Controllers
Fair test: same number of evaluations of candidate solutions for all algorithms (i.e., n generations/iterations of the standard versions compared with n/2 of the noise-resistant ones).

Results: Average of the Final Population
Fair test: same as on the previous slide.

Results: Scenario 1, Population Fitness Evolution
Fair test: same as on the previous slide.

Not only Obstacle Avoidance: Evolving More Complex Behaviors

Evolving Homing Behavior (Floreano and Mondada 1996)
[Figures: experimental set-up; the robot's sensors]

Evolving Homing Behavior
Fitness function: Φ = V (1 - i)
V = mean speed of the wheels, 0 ≤ V ≤ 1
i = activation value of the sensor with the highest activity, 0 ≤ i ≤ 1
Fitness is accumulated during the life span and normalized over the maximal number (150) of control loops (actions). There is no explicit expression of the battery level/duration in the fitness function (it is implicit). Chromosome length: 102 parameters (real-to-real encoding). Generations: 240; 10 days of embedded evolution on the Khepera.
[Figure: controller architecture]

Evolving Homing Behavior
[Figures: fitness evolution; evolution of the number of control loops per evaluation span; battery recharging vs. motion patterns (battery energy, left wheel activation, right wheel activation)]
Emergent strategy: reach the nest -> recharge the battery -> turn on the spot -> leave the nest.

Evolved Homing Behavior

Not only Control Shaping: Off-line Automatic Hardware-Software Co-Design and Optimization

Moving Beyond Controller-Only Evolution
- Evidence: Nature evolves hardware and software at the same time
- Faithful, realistic simulators make it possible to explore design solutions encompassing the off-line co-evolution (co-design) of control and morphological characteristics (body shape, number of sensors, placement of sensors, etc.)
- GA (and perhaps PSO) are powerful enough for this job, and the methodology remains the same; only the encoding changes (see the sketch below)
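A hypothetical sketch (Python) of what a combined hardware-software genome might look like; the fields, ranges, and decoding are illustrative assumptions and not the encoding used in any of the cited works.

```python
import random
from dataclasses import dataclass

@dataclass
class RobotGenome:
    """Candidate solution encoding both morphology and control parameters."""
    n_sensors: int            # morphological choice: how many proximity sensors
    sensor_angles: list       # morphological choice: mounting angles (radians)
    controller_weights: list  # control parameters: ANN synaptic weights

def random_genome(max_sensors=8, weights_per_sensor=2):
    n = random.randint(2, max_sensors)
    return RobotGenome(
        n_sensors=n,
        sensor_angles=[random.uniform(-3.1416, 3.1416) for _ in range(n)],
        controller_weights=[random.uniform(-1.0, 1.0)
                            for _ in range(n * weights_per_sensor + 2)],
    )

# The GA/PSO machinery (selection, variation, noisy evaluation) is unchanged;
# only the decoding of the genome into a simulated robot differs.
```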

Evolving Control and Robot Morphology (Lipson and Pollack, 2000)
http://www.mae.cornell.edu/ccsl/research/golem/index.html
- Arbitrary recurrent ANN
- Passive and active (linear actuators) links
- Fitness function: net distance traveled by the centre of mass in a fixed duration
[Figure: example of an evolutionary sequence]

Examples of Evolved Machines
Problem: the simulator was not realistic enough (performance was higher in simulation because friction was not simulated well enough; e.g., for the arrow configuration, 59.6 cm in simulation vs. 22.5 cm in reality).

From Single to Multi-Unit Systems: Co-Learning in a Shared World

Evolution in Collective Scenarios
Collective setting: fitness becomes noisy due to partial perception and independent parallel actions.

Credit Assignment Problem
With limited communication, no communication at all, or partial perception, it is difficult to assign credit for the group's performance to the actions of the individual robots.

Co-Learning Collaborative Behavior
Three orthogonal axes to consider (extreme or balanced solutions along each axis are possible):
- Individual vs. group fitness
- Private (no parameter sharing) vs. public (parameter sharing) policies
- Homogeneous vs. heterogeneous systems
[Figure: example with binary encoding of candidate solutions; see also the sketch below]
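A sketch (Python; names and data layout are illustrative assumptions) of how the first and third axes could be expressed in a population manager: fitness can be credited individually or as a shared group value, and the team can be deployed with one common controller or with one candidate per robot.

```python
def assign_fitness(individual_scores, use_group_fitness):
    """Individual vs. group fitness: each robot's own score, or the team average for all."""
    group = sum(individual_scores) / len(individual_scores)
    return [group] * len(individual_scores) if use_group_fitness else list(individual_scores)

def deploy_parameters(population, homogeneous):
    """Homogeneous: every robot runs the same (here, the current best) candidate controller;
    heterogeneous: each robot runs its own candidate from the population."""
    if homogeneous:
        best = max(population, key=lambda c: c["fitness"])
        return [best["params"] for _ in population]
    return [c["params"] for c in population]

# Example: a population of two candidates, deployed heterogeneously.
population = [{"params": [0.1, -0.4], "fitness": 0.7},
              {"params": [0.3, 0.2], "fitness": 0.5}]
print(deploy_parameters(population, homogeneous=False))
```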

Co-Learning Competitive Behavior
[Figure: fitness f_1 vs. fitness f_2]

Co-Learning in a Competitive Framework

Co-Evolution of Competitive Behavior
No credit assignment problem: individual fitness!
Example: co-evolution of the prey's and the predator's controllers
Φ_prey = T, Φ_predator = 1 - T
T = normalized time of survival of the prey (before it is caught)
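A minimal sketch (Python; assuming normalization by a maximum trial duration) of the two competitive fitness values.

```python
def competitive_fitness(time_survived, max_time):
    """Prey is rewarded for surviving long, predator for catching the prey early."""
    T = min(time_survived, max_time) / max_time   # normalized survival time, 0 <= T <= 1
    return {"prey": T, "predator": 1.0 - T}

print(competitive_fitness(time_survived=12.5, max_time=60.0))
```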

Prey-Predator Experiment (Nolfi and Floreano, 1998)

Prey-Predator Experiment
[Figures: fitness co-evolution in simulation; fitness co-evolution on real robots]

Co-Learning in a Collaborative Framework

Learning to Aggregate (see Lab + Hwk 5)
- ANN architecture: 2 neurons; the parameter space is larger because range and bearing information is added, but only the average over all detected robots within a given range is fed to the ANN (see the sketch below)
- Number of parameters:
  - Obstacle avoidance (proximity-to-motors + lateral + recurrent + bias): 16 + 2 + 2 + 2 = 22 weights
  - Aggregation (obstacle avoidance + range&bearing-to-motors): 22 + 4 = 26 weights
- Individual fitness: number of robots around robot i; group fitness: average over all the measurements taken by the individual robots
- Comparison: individual & public & heterogeneous vs. group & public & homogeneous
- Preliminary results:
  - Heterogeneous PSO/GA is faster (it exploits the multi-robot parallel platform)
  - Homogeneous PSO/GA can potentially reach higher fitness at the end but is slower
  - No major benefit of enforced homogeneity, since in this case individual and group fitness are very much aligned and only a limited number of iterations/generations was considered
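A sketch (Python; helper names, the unit-vector averaging of bearings, and the way the fitness definitions are written out here are illustrative of the description above, not taken from the lab code) of the extra ANN inputs and of the individual/group fitness.

```python
import math

def average_range_bearing(neighbors, max_range=1.0):
    """Average the range & bearing readings of all robots detected within max_range.

    neighbors: list of (range, bearing) tuples.
    Returns the two extra ANN inputs (mean range, mean bearing), or zeros if nothing is detected.
    """
    visible = [(r, b) for r, b in neighbors if r <= max_range]
    if not visible:
        return 0.0, 0.0
    mean_range = sum(r for r, _ in visible) / len(visible)
    # Average bearings as unit vectors to avoid wrap-around problems at +/- pi.
    sx = sum(math.cos(b) for _, b in visible) / len(visible)
    sy = sum(math.sin(b) for _, b in visible) / len(visible)
    return mean_range, math.atan2(sy, sx)

def individual_fitness(neighbor_counts_over_run):
    # Individual fitness: average number of robots detected around robot i during the run.
    return sum(neighbor_counts_over_run) / len(neighbor_counts_over_run)

def group_fitness(counts_per_robot):
    # Group fitness: average of the individual measurements over all robots.
    return sum(individual_fitness(c) for c in counts_per_robot) / len(counts_per_robot)
```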

Learning to Pull Sticks
- Homogeneous and heterogeneous learning
- Diversity & specialization
- A simple in-line adaptive learning algorithm
- All applied to the stick-pulling case study
See next week, after the lecture on multi-level modeling!

Additional Literature: Week 7
Books
- Nolfi S. and Floreano D., Evolutionary Robotics: The Biology, Intelligence, and Technology of Self-Organizing Machines. MIT Press, 2004.
- Sutton R. S. and Barto A. G., Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA, 1998.
Papers
- Lipson H. and Pollack J. B., "Automatic Design and Manufacture of Robotic Lifeforms", Nature, 406: 974-978, 2000.
- Murciano A. and Millán J. del R., "Specialization in Multi-Agent Systems Through Learning", Biological Cybernetics, 76: 375-382, 1997.
- Dorigo M., Trianni V., Sahin E., Groß R., Labella T., Nolfi S., Baldassarre G., Deneubourg J.-L., Mondada F., Floreano D., and Gambardella L., "Evolving Self-Organising Behaviours for a Swarm-Bot", Autonomous Robots, 17: 223-245, 2004.
- Mataric M. J., "Learning in behavior-based multi-robot systems: Policies, models, and other agents", Cognitive Systems Research (Special Issue on Multi-disciplinary Studies of Multi-agent Learning, R. Sun, ed.), 2(1): 81-93, 2001.
- Nolfi S. and Floreano D., "Co-evolving predator and prey robots: Do 'arms races' arise in artificial evolution?", Artificial Life, 4(4): 311-335, 1998.
- Antonsson E. K., Zhang Y., and Martinoli A., "Evolving Engineering Design Trade-Offs", Proc. of the ASME Fifteenth Int. Conf. on Design Theory and Methodology, Chicago, IL, USA, September 2003, paper no. DETC2003/DTM-48676.