Behavior-based robotics, and Evolutionary robotics

Lecture 7, 2008-02-12

Contents
Part I: Behavior-based robotics: generating robot behaviors. MW pp. 39-52.
Part II: Evolutionary robotics: evolving basic behaviors. MW pp. 53-74, plus scientific papers.

Behavior-based robotics: Generating robot behaviors

Machine intelligence
A scientific field founded in the 1950s.
The goal of machine intelligence: "Generate machines capable of displaying human-level intelligence."
Such machines should reason, make plans, and carry out actions.

Milestone I: The Turing test
1950, the imitation game: by asking a series of questions, an observer has to determine which respondent is the machine and which is the human. [Turing, "Computing machinery and intelligence"]

Milestone I: The Turing test
Goal of the machine: fool the observer into believing that it is the person.
Turing: if a machine acts as intelligently as a human, then it is as intelligent as a human.

The Loebner Prize in Artificial Intelligence
Pass the Turing test, and win US $100,000!
The most human-like computer is awarded US $3,000!

Milestone II: Dartmouth
Proposal for the Dartmouth Summer Research Project on Artificial Intelligence:
"We propose that a 2 month, 10 man study of artificial intelligence be carried out during the summer of 1956 at Dartmouth College in Hanover, New Hampshire. The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it. [...] We think that a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer."

The three goals of AI
Strong AI: build machines whose overall intellectual capability is impossible to differentiate from that of human beings. (Weak AI: computers can only appear to think.)
Applied AI: produce commercially viable expert systems.
Cognitive simulation: employ computers to test theories about how the human mind works.

The AI approach: the sense-plan-act (SPA) paradigm
Sense (perception): build a world model (usually very complex).
Plan: reason about actions and decide which action to take.
Act: execute the action in the real world.
Requires computational power and lots of memory!
Good for game-playing programs, natural language interpreters, and expert systems!

We're still waiting...
Only machines that display a limited amount of intelligent behavior have been built so far.
Examples: HRP-2 (Kawada Industries, Japan) carrying a table and assembling a panel.

Why...?
Intelligence is hard to define.
Human-level intelligence is extremely complex => human-level intelligence is hardly the best starting point.
The preoccupation with human-level intelligence is probably the largest obstacle to progress.
BBR takes a broader view of intelligence: [Intelligent behavior] is the ability to survive, and to strive to reach other goals, in an unstructured environment.

Behavior-based robotics (BBR)
Pioneered by Rodney Brooks in the 1980s (the subsumption architecture).
No central world model.
A network of simple components (behaviors).
Parallel, asynchronous information processing.
No global memory: direct communication between modules.
Built incrementally.
Behaviors are activated by stimuli.
Strongly influenced by biology and ethology.
Intelligence is an emergent phenomenon!

Classical AI vs. BBR A comparison of the information flow in AI and in BBR

An example from biology: bats (predator) and moths (prey)
Even though moths have the simplest auditory system among insects (two or four neurons), they can escape bats!
=> This cannot be the SPA paradigm!

Behaviors and actions
A behavior is a sequence of actions performed in order to achieve some goal.
Example: the behavior of obstacle avoidance may consist of the actions of stopping, turning, and starting to move again (in a different direction).
Note: these terms may be used differently by other authors!

Intelligent behavior and reasoning
Intelligent behavior does not require reasoning in the BBR approach.
Most biological organisms are capable of highly intelligent behavior in their natural environment, but they may fail badly in novel environments.
Unstructured environments change rapidly => pre-defined maps are of little use there!

Features of BBR
BBR is concerned with autonomous robots.
Behavior-based robots are first provided with basic behaviors: obstacle avoidance, battery charging.
More complex behaviors are then added incrementally.

Features of BBR
The brain of a BB robot consists of a set of basic behaviors, the behavioral repertoire.
The behavior selection system is just as important as the individual behaviors!
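A minimal sketch of how a behavioral repertoire and a behavior selection system could fit together, assuming a simple fixed-priority arbitration scheme. The behavior names, sensor keys, thresholds, and motor values are all hypothetical and are not taken from the lecture; real architectures (such as subsumption) are richer than this.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Sensors = Dict[str, float]
MotorCommand = Tuple[float, float]  # (left, right) wheel speeds


@dataclass
class Behavior:
    name: str
    is_active: Callable[[Sensors], bool]    # stimulus that triggers the behavior
    act: Callable[[Sensors], MotorCommand]  # motor command while selected


def select_behavior(repertoire: List[Behavior], sensors: Sensors) -> Behavior:
    """Return the first (highest-priority) behavior whose stimulus is present."""
    for behavior in repertoire:
        if behavior.is_active(sensors):
            return behavior
    raise RuntimeError("the repertoire needs a default behavior that is always active")


# Hypothetical repertoire, ordered from highest to lowest priority.
repertoire = [
    Behavior("avoid_obstacle", lambda s: s["proximity"] > 0.8, lambda s: (0.5, -0.5)),
    Behavior("charge_battery", lambda s: s["battery"] < 0.2, lambda s: (0.2, 0.2)),
    Behavior("wander", lambda s: True, lambda s: (0.5, 0.5)),
]

chosen = select_behavior(repertoire, {"proximity": 0.1, "battery": 0.1})
print(chosen.name, chosen.act({"proximity": 0.1, "battery": 0.1}))
```

Each behavior stays simple and has no world model; only the ordering of the list decides which one controls the motors at a given moment, which is why the selection step matters as much as the behaviors themselves.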

Features of BBR
Behavior-based robots generally operate in the real world, i.e. they are situated.
The behaviors that a robot develops depend on its interactions with the environment and on the properties of the robot itself.
In fact, Turing anticipated the situated approach!

Generating behaviors
A robot's most fundamental behaviors are those that deal with its survival: collision avoidance, battery charging, etc.
A robot must also avoid harming people! Asimov's three laws serve as an inspiration.

Braitenberg vehicles
Direct sensor-actuator mapping can make robots display basic intelligent behavior. [Braitenberg, "Vehicles: Experiments in Synthetic Psychology"]
Example: the pursuer
State 1: ML = 0.5, MR = 0.5
State 2: ML = 0.5, MR = 0.0
Transition conditions (from the state diagram): SR < C1, SL > C1
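The pursuer above can be sketched as a two-state machine. The motor values come from the slide; everything else is an assumption made for illustration: ML/MR are read as left/right motor signals, SL/SR as left/right sensor readings, the transition 1 -> 2 is assumed to fire on SL > C1 and 2 -> 1 on SR < C1 (the extracted text does not say which condition belongs to which arrow), and the default value of C1 is hypothetical.

```python
from typing import Dict, Tuple

# state -> (ML, MR); values as given on the slide
MOTORS: Dict[int, Tuple[float, float]] = {1: (0.5, 0.5), 2: (0.5, 0.0)}


def next_state(state: int, s_left: float, s_right: float, c1: float) -> int:
    # ASSUMPTION: which condition triggers which transition is a guess.
    if state == 1 and s_left > c1:
        return 2
    if state == 2 and s_right < c1:
        return 1
    return state


def step(state: int, s_left: float, s_right: float,
         c1: float = 0.5) -> Tuple[int, Tuple[float, float]]:
    """Advance the controller one step; return the new state and motor signals."""
    state = next_state(state, s_left, s_right, c1)
    return state, MOTORS[state]


state = 1
for s_left, s_right in [(0.1, 0.1), (0.9, 0.9), (0.9, 0.1)]:
    state, motors = step(state, s_left, s_right)
    print(state, motors)
```

There is no planning and no world model here: the sensor readings select a state, and the state fixes the motor signals directly.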

Behavioral architectures
If-then-else rules and Boolean state variables: finite state machines (FSMs).
Hand-coded behaviors: see the wandering example, pp. 47-51 in ch. 3.
Artificial neural networks: difficult to generate by hand.
Biological organisms often serve as an inspiration.
But anything that works is correct!

Evolutionary robotics

Evolutionary robotics (ER) ER is a subfield of robotics, in which evolutionary algorithms (EAs) are used for generating robotic brains, or bodies, or both.

Approaches to ER: Evaluate in simulator... or directly in robot

Issues in ER
Representations: ANNs, FSMs, hand-coded rules, etc.
Fitness measures: EAs are good at finding loopholes! Usually, a lot of testing is required!
Simulation vs. evolution in real robots:
Evolution in hardware: time-consuming.
Evolution in simulations: reality gap!
Embodied evolution: a population of robots.

Fitness measures
Explicit: considers detailed aspects of the behavior.
Implicit: considers the overall behavior.
Local: updates the fitness at every timestep.
Global: looks at the final state.
Internal: based only on information available to the robot.
External: uses global information.

Application examples in ER: Evolution of garbage collection, or cleaning behavior, in simulation [Application 1] Online optimization of gaits in real, physical robots [Applications 2 and 3]. Optimization of the structure and the parameters of gait control programs based on CPGs [Application 4].

Application 1 Garbage collection
Objective: use an EA to generate a brain that makes the robot clear the arena of cylindrical objects.
Evolve in simulation, then transfer the best robotic brain to a real, physical robot.

Application 1 Garbage collection
Cleaning behavior: initial and final states (figure).
Fitness: the mean square distance of all objects from the center of the arena.
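The exact fitness formula is not reproduced in the text, so the following is only one plausible reading: the mean, over all objects, of the squared distance from the arena center, evaluated on the final state. The function name, the coordinate convention, and the sample positions are assumptions for illustration.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def mean_square_distance(objects: Sequence[Point],
                         center: Point = (0.0, 0.0)) -> float:
    """Mean squared distance of the objects from the arena center."""
    if not objects:
        raise ValueError("at least one object is required")
    cx, cy = center
    squared = [(x - cx) ** 2 + (y - cy) ** 2 for x, y in objects]
    return sum(squared) / len(squared)


# Hypothetical positions (in meters) before and after a cleaning run.
initial = [(0.0, 0.0), (0.0, 0.0)]
final = [(1.0, 0.0), (0.0, 2.0)]
print(mean_square_distance(initial), mean_square_distance(final))
```

Under this reading the measure is global and external: it looks only at where the objects ended up, using positions the robot itself cannot observe, and it rewards pushing objects away from the center.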

Application 1 Garbage collection
Representation: M states and conditional jumps.
Rules, e.g.: IF s > s0: jump to state j
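A minimal sketch of such a representation, written as plain data so that an EA could mutate it. Only the rule form "IF s > s0: jump to state j" comes from the slide; the per-state motor outputs, the rule ordering, and the example numbers are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


@dataclass
class Rule:
    sensor: int       # index of the sensor reading s
    threshold: float  # s0
    target: int       # state j to jump to


@dataclass
class State:
    motors: Tuple[float, float]                     # assumed per-state motor output
    rules: List[Rule] = field(default_factory=list)


def next_state(brain: Sequence[State], current: int,
               sensors: Sequence[float]) -> int:
    """Apply the first rule whose condition s > s0 holds; otherwise stay put."""
    for rule in brain[current].rules:
        if sensors[rule.sensor] > rule.threshold:
            return rule.target
    return current


# Hypothetical brain with M = 2 states.
brain = [
    State(motors=(0.5, 0.5), rules=[Rule(sensor=0, threshold=0.7, target=1)]),
    State(motors=(0.5, -0.5), rules=[Rule(sensor=1, threshold=0.3, target=0)]),
]

state = 0
for reading in [(0.2, 0.0), (0.9, 0.0), (0.9, 0.5)]:
    state = next_state(brain, state, reading)
    print(state, brain[state].motors)
```

Because the brain is just a table of numbers, mutation can change thresholds, jump targets, or motor values without touching the interpreter.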

Application 1 Garbage collection
The Khepera robot

Application 1 Garbage collection Results:

Bipedal walking
Static walking: stable at all times (w.r.t. the CoM)!
Dynamic walking: not always in static equilibrium!

Zero-moment point (ZMP)
ZMP: the contact point between the ground and the foot sole of the supporting leg, where the torques around the horizontal axes, generated by all forces acting on the robot, are equal to zero.
During a dynamically balanced gait, the ZMP can only move within the supporting area.

Zero-moment point (ZMP) Moment balance around the ZMP: ZMP equations:
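The equations themselves are not reproduced in the text. As a hedged reference, a commonly used form is given below, under assumptions that may differ from the lecture's: the robot is modeled as point masses $m_i$ at positions $(x_i, y_i, z_i)$, the ground is the plane $z = 0$, $g$ is the gravitational acceleration, and the rotational inertia of the links is neglected. Requiring the horizontal components of the total moment about the ZMP $\mathbf{p} = (x_{\mathrm{ZMP}}, y_{\mathrm{ZMP}}, 0)$ to vanish and solving for the two coordinates gives:

```latex
% Moment balance about the ZMP p (horizontal components only)
\left[ \sum_i (\mathbf{r}_i - \mathbf{p}) \times m_i
       \left( \ddot{\mathbf{r}}_i + g\,\mathbf{e}_z \right) \right]_{x,y} = 0

% Resulting ZMP coordinates
x_{\mathrm{ZMP}} =
  \frac{\sum_i m_i (\ddot{z}_i + g)\, x_i - \sum_i m_i \ddot{x}_i z_i}
       {\sum_i m_i (\ddot{z}_i + g)},
\qquad
y_{\mathrm{ZMP}} =
  \frac{\sum_i m_i (\ddot{z}_i + g)\, y_i - \sum_i m_i \ddot{y}_i z_i}
       {\sum_i m_i (\ddot{z}_i + g)}
```

When all accelerations are zero, these expressions reduce to the ground projection of the center of mass, which matches the static walking case.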

Control methods
Biped locomotion control: tracking control, off-line trajectory generation, passive dynamic control, real-time motion control, bio-inspired control.
Bio-inspired computational methods: EAs, ANNs.
Bio-inspired motor system design: CPGs.
Bio-inspired methods do not require accurate models or reference trajectories for execution!

Application 2 Online optimization of gaits in a real, physical robot I

Application 2 Evolution of efficient gait with humanoids using visual feedback K. Wolff, and P. Nordin. Humanoids 2001 Complex Adaptive Systems Group, Chalmers University of Technology, Göteborg, Sweden

Application 2 The robot
The humanoid robot Elvina: 28 cm tall, fully autonomous, vision and proximity sensors, 14 DOF.

Application 2 Experiment set-up
Objective: optimize the robot's gait, i.e. make it walk faster, straighter, and in a more robust way than it previously did.

Application 2 Representation
A chromosome, specifying a gait cycle:
2, 5, 2, 3, 3, 5, 2, 4, 3, 80, 100, 4, 136, 127, 107, 249, 106, 182, 99, 128, 150, 42, 81, 84, 5, 136, 29, 106, 242, 127, 180, 100, 128, 152, 300, 80, 84, 4, 136, 16, 12, 94, 252, 169, 100, 128, 150, 292, 74, 89, 5, 135, 14, 78, 171, 253, 174, 100, 128, 151, 108, 79, 165, 4, 157, 127, 137, 251, 149, 172, 104, 128, 150, 55, 85, 149, 3, 154, 214, 129, 252, 161, 177, 97, 128, 150, 300, 92, 12, 157, 248, 215, 132, 250, 164, 179, 101, 128, 150, 214, 89, 13, 81, 192, 215, 133, 252, 165, 183, 99, 128, 151, 42, 90, 103, 5, 137, 131, 107, 244, 106, 185, 101, 128, 151, 157,

Application 2 Gait
Elvina's walking cycle:

Application 2 Implementation
Standard GA with tournament selection, creep mutation, and mean-value crossover.

Application 2 Evolutionary algorithm: implementation
Population: 30 individuals.
Individuals are created randomly, with genes uniformly distributed over a given, empirical search range.
Steady-state tournament selection.
Crossover:
Mutation:
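The operators named above can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in the experiment: the tournament size, mutation rate, creep width, gene range, and chromosome length are hypothetical, and `toy_fitness` stands in for what was a physical walking trial.

```python
import random
from typing import Callable, List

Genome = List[float]


def mean_value_crossover(a: Genome, b: Genome) -> Genome:
    """Each child gene is the arithmetic mean of the two parent genes."""
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def creep_mutation(genome: Genome, rate: float, creep: float,
                   lo: float, hi: float, rng: random.Random) -> Genome:
    """With probability `rate`, shift a gene by a small uniform step; then clamp."""
    child = []
    for gene in genome:
        if rng.random() < rate:
            gene += rng.uniform(-creep, creep)
        child.append(min(hi, max(lo, gene)))
    return child


def steady_state_step(pop: List[Genome], fit: List[float],
                      evaluate: Callable[[Genome], float], rng: random.Random,
                      tour_size: int = 4, rate: float = 0.1, creep: float = 5.0,
                      lo: float = 0.0, hi: float = 300.0) -> None:
    """One tournament: the child of the two best replaces the worst contender."""
    contenders = sorted(rng.sample(range(len(pop)), tour_size),
                        key=lambda i: fit[i], reverse=True)
    best, second, worst = contenders[0], contenders[1], contenders[-1]
    child = creep_mutation(mean_value_crossover(pop[best], pop[second]),
                           rate, creep, lo, hi, rng)
    pop[worst] = child
    fit[worst] = evaluate(child)


def toy_fitness(genome: Genome) -> float:
    # Stand-in for a walking trial: best when every gene equals 100.
    return -sum((g - 100.0) ** 2 for g in genome)


rng = random.Random(0)
population = [[rng.uniform(0.0, 300.0) for _ in range(8)] for _ in range(30)]
fitness = [toy_fitness(g) for g in population]
initial_best = max(fitness)
for _ in range(200):
    steady_state_step(population, fitness, toy_fitness, rng)
print(initial_best, max(fitness))
```

In a steady-state scheme only one individual is evaluated per step, which suits hardware experiments where every evaluation is a slow, supervised trial. Because the replaced individual is the worst of its tournament, the best individual in the population is never overwritten.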

Application 2 Fitness
The camera is used to determine how straight the robot moved during the trial.
The angular deviation, Θ, is the difference between the desired (straight) path of locomotion and the performed path.

Application 2 Fitness Fitness is a product of walking velocity and how straight the robot walked:

Application 2 Results
Fitness of the best evolved individual: 0.17
Fitness of the best hand-coded gait: 0.11
That is, a 55% improvement (mostly due to a straighter path of locomotion)!

Application 2 Conclusions from Application 2
Lessons learned:
Evolving efficient gaits with real physical hardware is a challenging task.
It is time-consuming: feedback is slow, and the experiment requires manual supervision all the time.
It is extremely demanding for the hardware!
On-line evolution in hardware constrains the number of generations.

Application 3 Online optimization of gaits in a real, physical robot II

Application 3 Evolutionary Optimization of a Bipedal Gait in a Physical Robot K. Wolff, D. Sandberg, M. Wahde. CEC 2008 (accepted) Adaptive Systems Research Group, Chalmers University of Technology, Göteborg, Sweden

Application 3 EA in a real robot
The Kondo robot: 17 DOFs, no sensors, FAST!

Application 3 Experiment Online optimization of hand-coded gait pattern Similar to previous experiment, but new states were added.

Application 3 Fitness
TSG = the time taken by the individual executing the standard gait.

Application 3 Standard gait and best gait

Application 3 Gait

Application 3 Best evolved gait Movie:

Application 3 Conclusions from Applications 2 and 3
Application 2: a more stable gait was obtained.
Application 3: the walking speed increased by 65%, with structural modifications of the gait program.
It is possible to obtain significant improvements of bipedal gaits with an EA in a real, physical bipedal robot.
Typical experiment duration: 24 man-hours (Application 3, 900 evaluated individuals).

Application 4 Structural evolution of central pattern generators for bipedal walking in 3D simulation K. Wolff, J. Pettersson, A. Heralic, M. Wahde. Adaptive Systems Research Group, Chalmers University of Technology, Göteborg, Sweden

Application 4 Project Objective Bipedal gait synthesis for a simulated robot by structural evolution of CPG networks: CPG network parameters and feedback network interconnection paths are determined using an EA.

Application 4 Motor Systems Hierarchy
Two modes of muscular control of flexor-extensor pairs:
Phasic: activated transiently to make discrete movements (walking, swimming, etc.).
Tonic: steady contractions (posture, gripping something).

Application 4 Motor Systems Hierarchy
Key elements: central pattern generators (CPGs), higher motor centers, feedback circuits.
Hierarchical organization: allows the lower levels to control reflexes, while the higher levels give commands without having to specify the details.
[Figure: the motor systems hierarchy. Higher centers (brain): higher control. Lower centers (spinal cord): CPGs. Muscles: effector organs. Signal paths: central feedback (efference copy), sensory input, reflex feedback, and motor output to the environment.]

Application 4 The robot

Application 4 Central Pattern Generators
CPGs are neural circuits capable of producing oscillatory output given tonic (non-oscillating) input.
CPGs have been extensively studied in animals, both simple (lamprey, salamander) and complex (cats).
Observations support the notion of CPGs in humans: treadmill training of patients with spinal cord lesions.

Application 4 The Matsuoka oscillator
ui = inner state
vi = degree of self-inhibition
τu and τv = time constants
u0 = bias (tonic input)
wij = connection weights
yi = output

Application 4 The Matsuoka oscillator Frequency variation occurs if the time constants τu and τv are varied.

Application 4 The Matsuoka oscillator Amplitude variation occurs if the bias u0 is varied
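The oscillator equations are not reproduced in the text, so the sketch below uses a commonly cited form of the Matsuoka model for two mutually inhibiting neurons, integrated with the forward Euler method. The self-inhibition gain `beta`, all parameter values, the initial state, and the step size are assumptions chosen for illustration, not values from the lecture.

```python
from typing import List, Tuple

# Assumed model, for i in {1, 2} and j the other neuron:
#   tau_u * du_i/dt = -u_i - beta * v_i + w * y_j + u0
#   tau_v * dv_i/dt = -v_i + y_i
#   y_i = max(0, u_i)


def simulate_matsuoka(u0: float = 1.0, tau_u: float = 0.1, tau_v: float = 0.2,
                      beta: float = 2.5, w: float = -2.5, dt: float = 0.001,
                      t_end: float = 10.0) -> Tuple[List[float], List[float]]:
    """Return the output histories (y1, y2) of a two-neuron oscillator."""
    u = [0.1, 0.0]  # slightly asymmetric start, so the symmetry is broken
    v = [0.0, 0.0]
    y1_hist: List[float] = []
    y2_hist: List[float] = []
    for _ in range(round(t_end / dt)):
        y = [max(0.0, u[0]), max(0.0, u[1])]
        du = [(-u[i] - beta * v[i] + w * y[1 - i] + u0) / tau_u for i in range(2)]
        dv = [(-v[i] + y[i]) / tau_v for i in range(2)]
        for i in range(2):
            u[i] += dt * du[i]
            v[i] += dt * dv[i]
        y1_hist.append(y[0])
        y2_hist.append(y[1])
    return y1_hist, y2_hist


def count_alternations(y1: List[float], y2: List[float]) -> int:
    """Count how often the neuron with the larger output changes."""
    lead = [a > b for a, b in zip(y1, y2)]
    return sum(1 for s, t in zip(lead, lead[1:]) if s != t)


base = simulate_matsuoka()
faster = simulate_matsuoka(tau_u=0.05, tau_v=0.1)  # both time constants halved
stronger = simulate_matsuoka(u0=2.0)               # bias doubled
print(count_alternations(*base), count_alternations(*faster))
print(max(base[0]), max(stronger[0]))
```

The constant bias `u0` is the tonic input, and the alternating outputs are the oscillation a CPG needs. In this sketch, shrinking the time constants speeds up the alternation and raising the bias raises the output amplitude, in line with the two slides above.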

Application 4 CPG network An arrow indicates the possibility of connections

Application 4 Feedback network Waist, thigh, and leg angles, and foot contact

Application 4 GA optimization Difficult to tune parameters and structure of CPG networks => optimal performance cannot be guaranteed! EAs are good at open-ended optimization.

Application 4 Support structure A massless support structure was used in the early stages of the EA runs, in order to generate natural, upright gaits. Helps the robot to balance.

Application 4 Evolutionary algorithm
Objective function: f(i) = x - y, where x is the distance walked forward and y is the sideways deviation.

Application 4 Evolutionary algorithm
A standard GA.
Population of 180 individuals.
Mutation, no crossover.
Tournament selection: size 8, psel = 0.75.
Fitness function: f = x - y, where x is the distance walked forward and y is the sideways deviation.
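A minimal sketch of tournament selection with a selection probability. It assumes one common reading of psel: the best contender wins with probability psel, otherwise it is set aside and the test is repeated for the next best. The function names are hypothetical, y is assumed to be a nonnegative magnitude, and the fitness values below are random stand-ins for simulated walking trials.

```python
import random
from typing import Sequence


def walking_fitness(forward: float, sideways: float) -> float:
    """f = x - y; `sideways` is assumed to be a nonnegative deviation."""
    return forward - sideways


def tournament_select(fitness: Sequence[float], size: int, p_sel: float,
                      rng: random.Random) -> int:
    """Return the index of the selected individual."""
    contenders = sorted(rng.sample(range(len(fitness)), size),
                        key=lambda i: fitness[i], reverse=True)
    for index in contenders[:-1]:
        if rng.random() < p_sel:
            return index
    return contenders[-1]  # everyone else was passed over


rng = random.Random(0)
# 180 individuals with made-up (x, y) results.
scores = [walking_fitness(rng.uniform(0.0, 2.0), rng.uniform(0.0, 0.5))
          for _ in range(180)]
parent = tournament_select(scores, size=8, p_sel=0.75, rng=rng)
print(parent, scores[parent])
```

A psel below 1 keeps some selection pressure off the current best, which can help when the fitness landscape has sparse, narrow peaks.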

Application 4 Evolutionary algorithm
Genome, fixed length.
CPG network chromosomes:
length 32, binary values: connection[i] = 0 or 1
length 32, real values: weights (sign and strength)
Feedback network chromosome:
length 20, real values: weights (sign and strength)
In total: three chromosomes with 84 genes.
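The genome layout above can be sketched as a small data structure. The chromosome lengths come from the slide; the class and function names, the weight range, the mutation operators, and the idea that a weight only takes effect when its connection gene is 1 are assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import List

N_CPG_LINKS = 32  # possible CPG connections (from the slide)
N_FEEDBACK = 20   # feedback-network weights (from the slide)


@dataclass
class Genome:
    connection: List[int]         # 32 binary genes: 1 = connection present
    cpg_weight: List[float]       # 32 real genes: sign and strength
    feedback_weight: List[float]  # 20 real genes: sign and strength

    def num_genes(self) -> int:
        return (len(self.connection) + len(self.cpg_weight)
                + len(self.feedback_weight))

    def effective_cpg_weights(self) -> List[float]:
        """Assumed decoding: a weight counts only if its connection gene is 1."""
        return [c * w for c, w in zip(self.connection, self.cpg_weight)]


def random_genome(rng: random.Random, w_max: float = 1.0) -> Genome:
    return Genome(
        connection=[rng.randint(0, 1) for _ in range(N_CPG_LINKS)],
        cpg_weight=[rng.uniform(-w_max, w_max) for _ in range(N_CPG_LINKS)],
        feedback_weight=[rng.uniform(-w_max, w_max) for _ in range(N_FEEDBACK)],
    )


def mutate(genome: Genome, rate: float, sigma: float,
           rng: random.Random) -> Genome:
    """Flip binary genes and perturb real genes, each with probability `rate`."""
    def perturb(values: List[float]) -> List[float]:
        return [w + rng.gauss(0.0, sigma) if rng.random() < rate else w
                for w in values]
    connection = [1 - c if rng.random() < rate else c for c in genome.connection]
    return Genome(connection, perturb(genome.cpg_weight),
                  perturb(genome.feedback_weight))


genome = random_genome(random.Random(1))
child = mutate(genome, rate=0.05, sigma=0.1, rng=random.Random(2))
print(genome.num_genes(), sum(child.connection))
```

Keeping structure (binary genes) and parameters (real genes) in separate chromosomes lets mutation add or remove a connection without losing its weight, which is one way to read the term structural evolution here.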

Application 4 Results Fitness progress: Fitness landscape with sparse, narrow peaks (low average fitness after many generations).

Application 4 Results Best individual (movie) Stop and go Change gaits

Application 4 Conclusions from Application 4
A stable bipedal gait was generated.
Support structure:
Four-point support did not help much (=> cheating).
Two-point support was useful.
Without support, the EA often got stuck in local optima.
More feedback could lead to improved control and robustness.
Only straight-line locomotion has been investigated in this study!
Future work: transfer the results to a real robot.

Evolving behaviors with ERSim Use ERSim to experiment a little on your own!

Thank you for your attention!