Control System Architectures for Autonomous Agents

Lennart Pettersson
Mechatronics Division, Department of Machine Design, Royal Institute of Technology, Stockholm, Sweden
E-mail: lennartp@damek.kth.se

A Survey Study, 1997-04-01

This paper is a survey study of the research that has been done in designing control system architectures for autonomous mobile robots. The architectures are discussed with respect to properties such as robustness, learning ability, processing speed and organization. A number of deliberative, blackboard, hierarchical, reactive and hybrid architectures are taken as examples and given a short presentation.

1.0 Introduction

An autonomous mobile system is a robot that is capable of interacting and performing complicated tasks in an unknown and changing environment. The design of a control system capable of controlling such a system to work successfully in the real world is a challenging task. One would require the control system to present some form of intelligent behavior, to solve different problems and to use its own experience in a seemingly intelligent way. At the same time one would require it to cope with unforeseen changes in the world and to react instantaneously to outer stimuli. One would also require its functionality to degrade smoothly in the face of internal faults or outer obstructions.

Numerous approaches to the problem of designing a control system for a robot have been pursued in different areas of research. For many years the AI community has built control systems that present intelligent behavior, though usually not in the real world but in some simulated or specially designed environment. Traditional real-time control research has designed robust control systems acting in the real world, but usually without truly autonomous or intelligent behavior. A lot of research has also centered on trying to emulate, at different levels, the function of the control systems of the most perfectly autonomous systems we know: the nervous systems of humans and animals.

This paper gives an overview of different approaches that have been used to build control systems for autonomous agents. It is divided into three parts after the introductory part (chapters 1 and 2). The first part (chapter 3) describes different techniques that have been used to build control systems for autonomous agents and discusses how different architectures can be classified. The next part (chapter 4) gives very short and simplified descriptions of a number of architectures; for more complete descriptions, references are given. The last two chapters (5 and 6) compare the architectures and techniques described in the earlier chapters and discuss them with respect to a number of relevant properties.

There is a lot of research going on worldwide in this area, so a complete survey of all relevant research would be unfeasible. To get a survey of manageable size, a supposedly representative selection has been made. All references to particular research institutes and projects should therefore be looked upon as examples only.

2.0 Definitions

In books and articles about robotics and intelligent systems, most authors use terms such as agent and autonomy without stating any explicit definition. This can be rather confusing, since there is no common definition of these terms and since the implicit definitions used often differ a lot. This chapter gives an overview of definitions of some of the most important terms and also states which definition will be used in this paper. The definitions used are not in any way the only possible ones, or even in some sense the best; the reason to state a definition is only to get a consistent use of words throughout this paper.

Definition of agent: One definition of agent that focuses on the survivability of the system is from Steels (1995): an agent is a system that is capable of maintaining itself. Another common definition, which focuses on perception and action, has been stated by Russell & Norvig (1995): an agent is something that perceives and acts. Other definitions of agent focus on the ability to learn, i.e. the ability of the system to change its behavior as a result of the outcome of previous actions. None of these definitions say anything about what environment the agent exists in; it might be a robot in a physical environment, a living organism in a natural environment or a computer program in a virtual world. For the purpose of this paper, however, the term agent will be restricted to mean a situated and physically embodied mobile system acting in the real world. By situatedness we mean that the agent interacts with its unconstrained environment in real time. By embodiment we mean that the agent has a body and interacts with the world by using its body for sensation and action. Or, as stated by Brooks (1991c):

Situatedness: The robots are situated in the world - they do not deal with abstract descriptions, but with the here and now of the environment that directly influences the behavior of the system.

Embodiment: The robots have bodies and experience the world directly - their actions are part of a dynamic with the world, and the actions have immediate feedback on the robots' own sensations.

Definition of autonomy: One definition stated by Russell & Norvig (1995) is: a system is autonomous to the extent that its behavior is determined by its own experience. For the purpose of this paper, autonomy will be defined slightly differently, as the ability of a system to perform complex tasks without human guidance while coping with an unknown and changing environment. Being autonomous puts a number of requirements on the control system of an agent; these will be discussed in this paper. One aspect of autonomy that will not be discussed here, however, is energy autonomy. For a robot to be truly autonomous it should carry its own energy source, and perhaps even be able to renew that energy source, in order to fulfil its task without human intervention.

Definition of robot: The term robot will in this paper be used interchangeably with artificial autonomous agent.

Definition of behavior: The behavior of a system is the interactions between the system and its environment.
With this definition a behavior is something externally visible. Some authors prefer to talk about underlying internal mechanisms as behaviors as well; in this paper such mechanisms will instead be looked upon as schemas, which will be introduced in section 4.7.

Definition of control system: The control system of an agent is what produces its behavior repertoire.

Definition of architecture: A traditional definition of architecture is stated by Shaw & Garlan (1996):

The architecture of a software system defines that system in terms of computational components and interactions among those components.

For the purpose of this paper the definition by Russell & Norvig (1995) will be used:

The architecture of a robot defines how the job of generating actions from percepts is organized.

This definition focuses on behaviors and how behaviors are generated and is therefore more appropriate here. The architecture is a drawing, or an abstraction, of the control system, while the control system is a realization of the architecture. This paper will discuss software aspects of the architectures only.

3.0 Classification

Several ways of classifying architectures are possible, based for example on distribution, knowledge representation or hierarchical organization. The most important classification is into deliberative, reactive and hybrid architectures. This and some other important classifications and central properties of an architecture will be discussed in this chapter.

3.1 Deliberative/Reactive

3.1.1 Deliberative

One strategy for building control system architectures is based on traditional artificial intelligence techniques with global symbolic representations of goals, plans, knowledge etc. The term production system, following Nilsson (1982), is used for the three major elements of an AI system:

- A global database that incorporates all knowledge of the system
- A set of production rules (operations) that operate on the global database
- A control system with some global control strategy

A control system architecture built around these three elements will be called a deliberative one. This strategy for architectural design is often called SMPA (Sense-Model-Plan-Act) according to the major functional decomposition.

FIGURE 1. The SMPA cycle: the agent senses the world, models it, plans, and acts.

Of the architectures outlined in this paper, SOAR (section 4.1) is a good example of a deliberative architecture using a typical production system approach. The Blackboard architecture (section 4.2) is also built around many of the same ideas. One architectural principle that is often incorporated in a deliberative architecture is a hierarchical structure; NASREM/RCS (section 4.5) is a good example of a hierarchical SMPA architecture.
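
To make the production-system structure of section 3.1.1 concrete, the following is a minimal sketch (invented for illustration, not taken from any of the surveyed systems; the facts, rules and rule ordering are assumptions): a global database, a set of production rules that operate on it, and a control strategy that fires the first applicable rule until quiescence.

```python
# A minimal production system: global database, production rules,
# and a simple control strategy (fire the first applicable rule).
# The facts and rules are toy examples, not from any surveyed system.

database = {"at": "start", "battery": "ok"}

def rule_go_to_goal(db):
    """Applicable when the robot is at the start with a healthy battery."""
    if db["at"] == "start" and db["battery"] == "ok":
        return {"at": "goal"}          # operation: update the global database
    return None

def rule_recharge(db):
    """Applicable when the battery is low."""
    if db["battery"] == "low":
        return {"battery": "ok"}
    return None

rules = [rule_recharge, rule_go_to_goal]   # priority order = control strategy

def control_loop(db, rules, max_steps=10):
    for _ in range(max_steps):
        for rule in rules:
            update = rule(db)
            if update is not None:
                db.update(update)
                break
        else:
            return db                  # no rule applicable: quiescence
    return db

print(control_loop(database, rules))   # {'at': 'goal', 'battery': 'ok'}
```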

3.1.2 Reactive

The idea behind the reactive architectures is that the modules the system is built from are behavior-producing rather than functional, as in deliberative architectures. Brooks (1991b) describes the duality between reactive and deliberative architectures: the modules of a reactive architecture are behaviors (layers) like avoid obstacles, identify object or explore the environment, while the functions of the system (planning, learning, perception etc.) emerge from the interaction of the modules and the environment. The modules of a deliberative architecture, in comparison, are functional blocks like planning, learning or perception blocks, while the behaviors (avoid obstacles, identify object, explore the environment) emerge from the interaction of the modules.

Reactive architectures are also called behavior-based architectures when they include somewhat more complex behaviors than immediate reactions to outer stimuli. These ideas, as a model for intelligence and as a design principle for autonomous agents, are often referred to as the New AI paradigm, following Pfeifer (1995). The key idea of New AI is that a better understanding of intelligence will emerge only by building complete autonomous agents.

One of the earliest prototypical works in reactive robotics was done by Walter as early as the 1950s. It was not until the 1980s, however, that researchers started using reactive strategies as an alternative to deliberative strategies for building autonomous agents, the most well known result being the Subsumption architecture described in section 4.6.

Purely reactive architectures have been criticized for being too rigid and too simple to produce intelligent behavior like planning or learning. Brooks (1991a) argues that no type of central, manipulative or symbolic representation is necessary to produce successful behavior, not even to produce intelligent behavior or learning. The claim that reactive models of intelligence would scale to human-like problems is, however, not undisputed; see e.g. Tsotsos (1995).

3.1.3 Hybrid

Hybrid architectures integrate a deliberative component for planning and problem solving with a reactive component for real-time control. Hybrid architectures may thus be viewed as something between reactive and deliberative ones, combining aspects of both. Perhaps more accurately, hybrid architectures may be viewed as opposed to both: according to the hybrid approach no single strategy will be adequate for all tasks relevant to a robotic system, whereas the deliberative as well as the reactive approach relies on one single technique.

The first architecture presented that encompassed aspects of both deliberative and reactive control was AuRA, described in section 4.7. Ollero et al. (1995) use a slightly different way of building a hybrid control system in the Navigation and Operation System (NOS). NOS is a hierarchical architecture where each subsystem is built as a behavior-based control system, and a Supervisor Sequential Controller coordinates the overall behavior in an event-driven fashion.

3.2 Hierarchical organization

There are several ways of organizing hierarchies:

- Temporal hierarchy
- Data abstraction hierarchy
- Representational abstraction hierarchy
- Subsystem hierarchy

A temporal hierarchy is based on the principle that tasks (real-time processes or schemas) in lower levels run at a higher frequency than tasks in higher levels; correspondingly, tasks in higher levels usually have a longer temporal extent. A small sketch of this principle is given below. A data abstraction hierarchy provides the higher levels with an abstraction of data in an object-oriented fashion, while representational abstractions are used to suppress or ignore information that is irrelevant at higher levels. Subsystem hierarchies are based on structural grouping of subsystems in a hierarchical fashion. Many of the architectures discussed in this paper combine different hierarchical organizational principles.

The layered structure of architectures like the Subsumption architecture could also be described as a hierarchical structure. However, since this structure is fundamentally different from all the hierarchies described above, the term layered structure is preferred.
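
As a rough illustration of a temporal hierarchy (a sketch only; the task names and the rate ratio are invented): a low-level servo task runs every tick, while a higher-level planning task runs an order of magnitude less often and thus has a longer temporal extent.

```python
# Sketch of a temporal hierarchy: the low-level task runs every tick,
# the high-level task only every tenth tick, so lower levels react
# faster while higher levels have a longer temporal extent.
# Task names and rates are illustrative only.

def servo_control(tick):
    print(f"tick {tick:3d}: servo loop (high frequency)")

def path_planning(tick):
    print(f"tick {tick:3d}: planning step (low frequency)")

TICKS_PER_PLAN = 10  # assumed ratio between the two levels

for tick in range(30):
    servo_control(tick)
    if tick % TICKS_PER_PLAN == 0:
        path_planning(tick)
```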
3.3 Representation

Some form of knowledge can be used to represent the environment of the agent. The representation used for storing this knowledge can take different forms:

- central or distributed
- homogeneous or heterogeneous
- uniform or non-uniform
- explicit or implicit
- symbolic or descriptive

Distributing the knowledge will usually increase the flexibility and fault tolerance of the system, at the expense of more communication between modules.

If knowledge is uniformly represented, all types of knowledge are stored in the same format. An advantage of this is that when modules are added or changed, the interfaces of other modules do not have to be modified.

A disadvantage is that the representation might be inefficient and will limit the kinds of knowledge that can be stored.

Knowledge can be homogeneous, so that all modules use the same information. An advantage of this is a lower risk of inappropriate behaviors caused by different modules using contradictory information; the disadvantage is the cost of coordinating different knowledge sources and keeping the knowledge consistent. An advantage of using uniform and homogeneous knowledge is that different modules may share their data and abilities for more intelligent combined behavior.

In classical AI, global symbolic representations have played an important role. The main principles of the deliberative strategy are outlined in section 4.1, and especially the Physical Symbol System Hypothesis points out the importance of symbolic representation. Both autonomous agent researchers and cognitive science researchers have disputed the importance of symbolic representations:

The world is its own best model. [Brooks (1991b)]

3.4 Learning

In order for an agent to be able to adapt its behaviors to a changing environment, or just to improve its performance over time, its control system needs to address the learning problem in some way. The most frequently used method is Reinforcement Learning (RL). Another learning method, more focused on using knowledge to make explanations and generalizations, is explanation-based learning (EBL).

3.4.1 Reinforcement Learning

The basic idea of RL is to let a system use the observed state of the world, together with some internal state of the system, to choose an action to execute. The result of the action is then evaluated and the internal state is updated, so that a satisfactory result will strengthen the tendency to choose this particular action. RL has its origins in studies in animal psychology, and the use of RL techniques in computer science was introduced by Minsky (1961). One problem with RL is that the reinforcement of an action can be arbitrarily delayed; one method that addresses this problem is the Q-learning algorithm described by Watkins (1989), sketched below. Mataric (1991) notes that another problem with RL is that the algorithms become complex and are slow to converge. An algorithm that uses genetic algorithms to achieve reinforcement learning is the Bucket Brigade algorithm introduced by Holland (1985). Another publication that discusses RL is Kaelbling (1990).

3.4.2 Genetic algorithms

A traditional control system architecture that is fixed by a human designer will, to some extent, rely upon the designer's predictions of what behaviors will be needed. Beer et al. (1990), Cliff et al. (1993) and other authors have argued that such architectures will never be efficient in an unknown environment that is impossible to predict. They thus advocate a design procedure that does not rely on fixing an architecture but on using evolutionary mechanisms to let the architecture evolve and adapt to the environment in which it lives [Kodjabachian & Meyer (1995)]. Genetic algorithms (GAs) were initially developed by Holland (1975). GAs have been used in the development of autonomous robots as described by Nolfi et al. (1994) and are often used in combination with ANNs [Parisi & Nolfi (1993)]. The use of GAs as an unsupervised learning method for reactive control architectures is described by Ram et al. (1994), where a GA is used in the AuRA architecture (section 4.7) to decide the values of the robot's reactive control schema parameters.
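
The RL scheme of section 3.4.1 can be made concrete with a tabular Q-learning sketch in the generic textbook form (not code from any surveyed system; the corridor world, learning rate, discount factor and exploration rate are assumptions):

```python
# Minimal tabular Q-learning on an invented 1-D corridor world:
# states 0..4, actions left/right, reward 1.0 for reaching state 4.
import random
from collections import defaultdict

ACTIONS = [-1, +1]                       # move left / move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.3    # assumed hyperparameters
Q = defaultdict(float)                   # Q[(state, action)] -> value estimate

def step(state, action):
    """Toy environment: a corridor of states 0..4, reward on reaching 4."""
    next_state = max(0, min(4, state + action))
    reward = 1.0 if next_state == 4 else 0.0
    return next_state, reward

for episode in range(100):
    state = 0
    for _ in range(1000):                # step cap so an episode cannot hang
        if random.random() < EPSILON:    # explore
            action = random.choice(ACTIONS)
        else:                            # exploit current estimates
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        next_state, reward = step(state, action)
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        # Q-learning update: move the estimate towards the reward plus the
        # discounted value of the best next action (Watkins 1989).
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
        state = next_state
        if state == 4:
            break

# Greedy action per state after learning (typically +1, i.e. move right).
print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(4)})
```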
3.4.3 Fuzzy Logic

The use of fuzzy control is a rather well established technique for control purposes, as overviewed in Yager & Zadeh (1992). Fuzzy techniques have, however, also been used for purposes more specifically related to autonomous agents, like learning or obstacle avoidance [Reignier (1994)]. The DAMN architecture (section 4.9) has been implemented in a fuzzy logic framework.

3.5 Execution

Two complementary principles for execution in real-time systems are event-driven and time-driven execution. Control systems of autonomous agents that need both fast responses to outer stimuli and continuous control of actuators will need to include both principles in their design.
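
The following sketch combines the two execution principles (purely illustrative; the control period and the bumper event are invented): a time-driven loop runs actuator control at a fixed rate, while queued events are handled as they arrive.

```python
# Sketch of mixed time-driven and event-driven execution.
# The control period and the bumper event are illustrative assumptions.
import queue
import time

events = queue.Queue()          # filled asynchronously, e.g. by sensor drivers
CONTROL_PERIOD = 0.05           # time-driven actuator control at 20 Hz

def control_actuators():
    pass                        # periodic, continuous control would go here

def handle(event):
    print("reacting to event:", event)

events.put("bumper_pressed")    # simulate an asynchronous outer stimulus

for _ in range(10):             # a real controller would loop forever
    start = time.monotonic()
    while not events.empty():   # event-driven part: react to pending stimuli
        handle(events.get())
    control_actuators()         # time-driven part: runs every period regardless
    # sleep away the remainder of the period to keep the rate fixed
    time.sleep(max(0.0, CONTROL_PERIOD - (time.monotonic() - start)))
```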

Another important aspect of the execution of a control system is the amount of parallelism it uses and how the parallelism is achieved. A distributed architecture can be built up of independent parallel processes or by some form of coordination of semi-parallel modules.

3.5.1 Discrete Event Systems

Discrete Event Systems (DES) approaches have been used extensively in robotics. One principle that has proved useful for modeling control systems of autonomous agents, and that has been used by many researchers, is the Petri net approach; see e.g. Causse & Christensen (1995). Ramadge & Wonham (1989) have developed a framework for modeling DES. This has been used by e.g. Kosecka & Bajcsy (1994) for modeling reactive behaviors for autonomous robots: the agent was modeled as a finite state automaton with actions treated as events (transitions between states). A DES can also be modeled as a cellular automaton, as described by Gutowitz (1991). One of the central properties of cellular automata is self-organization, and Nehmzow (1996) argues that self-organization offers a means of data processing highly suitable for robot control.

4.0 Architectures

This chapter gives simplified descriptions of some control systems that have been implemented for autonomous agent control. The aim is to show some of the more well known architectures and to give examples of some different approaches that have been used. The criteria for choosing architectures have been to include those that are most frequently referred to, together with some that are less well known, in order to cover a broad range of approaches. It should be noted that the scope and the aim of these architectures differ a lot: some are very general while others are more restricted, so the architectures cannot really be compared as different solutions to the same problem.

4.1 SOAR

4.1.1 Ideas

The SOAR (State, Operator And Result) architecture, described by Rosenbloom et al. (1993), has been implemented at Carnegie Mellon University (CMU) as a testbed for the theories of intelligence introduced by Newell (1990), who defines intelligence as a system's ability to use its knowledge to achieve its goals. This focus on knowledge has been the basis for the architectural design. In Laird et al. (1987) eleven hypotheses were proposed for the architecture of a general intelligence:

- Physical Symbol System Hypothesis: A general intelligence must be realized with a symbol system.
- Goal Structure Hypothesis: Control in a general intelligence is maintained by a symbolic goal system.
- Uniform Elementary-Representation Hypothesis: There is a single, elementary, uniform representation for declarative knowledge.
- Problem Space Hypothesis: Problem spaces are the fundamental organizational unit of all goal-directed behavior.
- Production System Hypothesis: Production systems are the appropriate organization for encoding all long-term knowledge.
- Universal-Subgoaling Hypothesis: Any decision can be an object of goal-oriented attention.
- Automatic-Subgoaling Hypothesis: All goals arise dynamically in response to impasses and are generated automatically by the architecture.
- Control-Knowledge Hypothesis: Any decision can be controlled by indefinite amounts of knowledge, both domain dependent and independent.
- Weak-Method Hypothesis: The weak methods form the basic methods of intelligence.
- Weak-Method Emergence Hypothesis: The weak methods arise directly from the system's response, based on its knowledge of the task.
- Uniform-Learning Hypothesis: Goal-based chunking is the general learning mechanism.

These hypotheses, and especially the first two, are the foundations of all traditional AI systems, and SOAR combined them in its design.

4.1.2 Implementation

SOAR is built upon symbolic representation stored in a single, non-modular production system (PS) (see section 3.1.1) and uses a single structure, a single access method and a single learning method for all types of knowledge. In addition to the PS, SOAR includes a working memory (WM) that holds the complete processing state and preferences. Perception and action influence the WM, and a decision procedure uses the information in the WM. New productions are added to the PS by a chunking mechanism, through which all learning in the system occurs. Information in the WM is handled by a WM manager.

FIGURE 2. The mechanisms of SOAR: perception and action, the production system (PS), the working memory (WM), the decision procedure, the chunking mechanism and the WM manager.

SOAR is supposed to be an architecture capable of general intelligence, not specifically designed for autonomous robots. Laird et al. (1989) have described an extension of the SOAR architecture, Robo-SOAR, to be used in robotics.

4.1.3 Properties

SOAR is a deliberative architecture built around a global, homogeneous and uniform representation. Learning is done on symbolic world models. A temporal hierarchical organization has been incorporated.

4.2 Blackboard Architecture

4.2.1 Ideas

Blackboard architectures are based on the idea of using distributed sensing, acting and reasoning modules communicating via a common memory. This approach should make the system modular and facilitate parallel design of modules.

4.2.2 Implementation

A blackboard is a central data store within the system that is used by a number of independent knowledge sources (KS). The knowledge sources are subsystems like, for example, a vision system, a sonar system, an obstacle avoidance system or a planning system. The key principle of the blackboard architecture is that all communication between subsystems is handled via the blackboard; all the knowledge sources are independent. The blackboard (BB) is a store of knowledge accessible to all of the knowledge sources. The entries in the blackboard can be pieces of data, requests from a KS or partial results. Different kinds of knowledge representations can be integrated in the blackboard, and some implementations have it divided into sub-blackboards [Pang & Shen (1990)]. In order to activate the different knowledge sources there is a control unit (CU) or scheduler.

FIGURE 3. A simplified illustration of a blackboard architecture: a user interface, a control unit (CU), knowledge sources KS 1 to KS n, and the blackboard (BB).

One early example of a blackboard architecture was presented by Hayes-Roth (1985). It has been further developed in the AIS (Adaptive Intelligent Systems) project at Stanford University.

4.2.3 Properties

Knowledge is centralized, homogeneous but non-uniform. The architecture is non-hierarchical and modular. It could be classified as a deliberative architecture even though it does not use a traditional production system approach.
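
A minimal sketch of the blackboard pattern of section 4.2.2 (the knowledge sources, entries and scheduling policy are invented for illustration and are not taken from Hayes-Roth's system):

```python
# Minimal blackboard sketch: independent knowledge sources communicate
# only via a shared data store, activated by a simple round-robin scheduler.
# The sources and entries are toy examples.

blackboard = {"sonar": 0.4}          # e.g. a range reading in meters

def sonar_interpreter(bb):
    """KS: turns a raw range reading into an obstacle flag."""
    if "sonar" in bb and "obstacle" not in bb:
        bb["obstacle"] = bb["sonar"] < 0.5

def planner(bb):
    """KS: posts a motor request once the obstacle flag is available."""
    if "obstacle" in bb and "command" not in bb:
        bb["command"] = "turn_left" if bb["obstacle"] else "go_forward"

knowledge_sources = [sonar_interpreter, planner]

def control_unit(bb, sources, cycles=3):
    """CU/scheduler: repeatedly gives each KS a chance to contribute."""
    for _ in range(cycles):
        for ks in sources:
            ks(bb)

control_unit(blackboard, knowledge_sources)
print(blackboard)   # {'sonar': 0.4, 'obstacle': True, 'command': 'turn_left'}
```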

4.3 Prodigy

4.3.1 Ideas

The Prodigy architecture, developed by Minton et al. (1989), is an architecture for planning and learning.

4.3.2 Implementation

The architecture is composed of a number of functional modules, such as the problem solver, the knowledge database and the learning modules. Prodigy incorporates EBL and other learning mechanisms; see Carbonell et al. (1992) for details about the architecture.

4.3.3 Properties

Prodigy is a deliberative architecture based on symbolic knowledge representation. It is an architecture for general problem solving concentrating on learning. Knowledge representation is homogeneous and uniform, which allows additional learning or problem-solving modules to be easily added and gives Prodigy a modular design.

4.4 THEO

4.4.1 Ideas

THEO, developed by Mitchell et al. (1989), is a general problem solver.

4.4.2 Implementation

THEO stores all of its data in a large knowledge database. Data is organized symbolically, using entities with a uniform representation that allows all knowledge to be accessed and manipulated. A reactive controller has been added to THEO in order to build a control system for autonomous agents, as described by Mitchell (1990). The reactive controller controls the robot, but when it fails to suggest a course of action THEO creates a plan for the robot.

4.4.3 Properties

THEO is a hybrid architecture built on a centralized structure with focus on learning and reasoning, using homogeneous and uniform centralized knowledge representation.

4.5 NASREM/RCS [1]

4.5.1 Ideas

Albus (1991) presents an architectural framework to describe intelligent systems, both natural and artificial. He defines intelligence as the ability of a system to act appropriately in an uncertain environment. Albus's model is hierarchical: control bandwidth and perceptual resolution decrease, while goals, planning horizons and world models expand in space and time, by about an order of magnitude for every hierarchical level. At each level he identifies four elements of intelligence:

- Behavior Generation (BG), which selects goals and plans and executes tasks. Tasks are recursively decomposed into subtasks.
- World Model (WM), which contains the system's best estimate of the state of the world. The world model also contains simulation capability that generates expectations and predictions.
- Sensory Processing (SP), which performs active perception based on sensory input and predictions from the world model, and updates the world model accordingly.
- Value Judgement (VJ), which calculates costs and risks and evaluates both the observed state of the world and the predicted results of hypothesized plans.

FIGURE 4. A hierarchy with the elements of intelligence according to Albus (1991): levels of SP, WM, VJ and BG between the sensors and actuators and the environment.

[1] NASA Standard Reference Model for Telerobot Control System Architectures / Real-time Control System.
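
To make Albus's per-level decomposition concrete, here is a loose sketch of a single hierarchical level (entirely illustrative; NASREM/RCS is not implemented this way, and the class, method names and toy logic are invented):

```python
# Illustrative sketch of one hierarchical level in an Albus-style node:
# sensory processing updates the world model, value judgement scores
# candidate plans, behavior generation picks and decomposes a task.
# All names and the toy logic are invented for illustration.

class Level:
    def __init__(self, name):
        self.name = name
        self.world_model = {}                        # WM: best estimate of the world

    def sensory_processing(self, observation):      # SP
        self.world_model.update(observation)

    def value_judgement(self, plan):                 # VJ: lower cost is better
        return -len(plan)

    def behavior_generation(self, goal):             # BG: choose the best plan
        candidate_plans = [[goal], ["approach", goal]]
        best = max(candidate_plans, key=self.value_judgement)
        return best                                   # subtasks for the level below

level = Level("task")
level.sensory_processing({"obstacle_ahead": False})
print(level.behavior_generation("dock"))              # ['dock']
```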

4.5.2 Implementation

NASREM/RCS, as described in Albus et al. (1989), is a rather straightforward implementation of these ideas with six hierarchical levels. There is, however, one exception: no value judgement functionality is included, since NASREM/RCS is primarily used in teleoperated systems where the operator interface replaces the value judgement.

FIGURE 5. Implementation of NASREM/RCS: six levels (Servo, Primitive, E-move, Task, Service, Mission), each with SP, WM and BG modules, connected to sensors, actuators, a global memory and an operator interface.

NASREM/RCS has been successfully used in a number of teleoperated and semi-autonomous vehicles in space and undersea applications. Henderson (1990) presents one example, a two-armed robot with seven degrees of freedom in each arm; the robot is teleoperated and has been used to build and maintain the NASA Space Station.

4.5.3 Properties

It is a hierarchical architecture that uses centralized and homogeneous but non-uniform knowledge representation and a sense-model-plan-act structure for each hierarchical level. The hierarchical structure is both a temporal hierarchy and a representational abstraction hierarchy.

4.6 Subsumption Architecture

4.6.1 Ideas

The Subsumption Architecture, first proposed by Brooks (1986), is a layered architecture where each layer is an asynchronous module that deals with its own perception and action control. Higher layers can suppress the output of lower layers, but the lower layers continue to function as higher layers are added.

4.6.2 Implementation

Brooks himself describes the approach as a vertical decomposition of the control system into task-achieving behaviors, compared to the traditional horizontal sense-plan-act type of decomposition into functional modules.

FIGURE 6. Decomposition into functional modules (perception, modeling, planning, task execution, motor control) compared to decomposition into task-achieving behaviors (avoid objects, locomote, explore, build maps, identify objects).

The architecture can be partitioned at any level, and the layers below form a complete operational control architecture in themselves. Each layer connects sensing to action and constitutes a task-achieving behavior generator in its own right. Layers are added incrementally; higher layers may implicitly depend on earlier layers operating successfully, but do not call them explicitly as subroutines. Conflicts between behaviors are resolved by a fixed priority arbitration scheme. With this architecture there is no place for a central model of the world, as each layer extracts only those aspects of the world it finds relevant; nor is there any place for symbolic representation of goals or plans. The Subsumption architecture has been successfully implemented in a large number of robots, described in Brooks (1990).
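
A minimal sketch of the layered, fixed-priority arbitration idea (the behaviors are invented; real Subsumption networks are built from asynchronous augmented finite state machines with suppression wiring, not a sequential loop):

```python
# Sketch of subsumption-style fixed-priority arbitration: each behavior maps
# sensor readings to a command or None, and the highest-priority behavior
# that produces a command suppresses the others. Behaviors are toy examples;
# in Brooks's robots priorities are fixed by suppression wiring between
# asynchronous layers, not by the order of a Python list.

def avoid_obstacles(sensors):
    if sensors["range"] < 0.3:           # something is too close
        return "turn_away"
    return None                          # no opinion: defer to other layers

def wander(sensors):
    return "go_forward"                  # default behavior, always active

layers = [avoid_obstacles, wander]       # highest priority first

def arbitrate(sensors):
    for layer in layers:
        command = layer(sensors)
        if command is not None:          # this layer suppresses those below
            return command
    return "stop"

print(arbitrate({"range": 0.2}))         # turn_away
print(arbitrate({"range": 2.0}))         # go_forward
```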

4.6.3 Properties

This is the purest implementation of the reactive strategy, being completely decentralized and using no symbolic knowledge representation or learning. Any knowledge is distributed, heterogeneous and non-uniform.

4.7 AuRA

4.7.1 Ideas

The Autonomous Robot Architecture (AuRA) was introduced by Arkin (1987). Information on the structure of AuRA and its roots in biology can be found in Arkin & Balch (1994). A schema-theoretic approach was used as a basis for AuRA. A schema is that which produces some behavior; it is both a store of knowledge and a description of a process for applying that knowledge. It may be recursively defined and is independent of its implementation. The concept of schemas has been used in cognitive psychology, neuro-psychology and artificial intelligence. For the purpose of autonomous agent control, the following definition of schema by Arkin (1990) is adequate: the primitives that serve as the basic building blocks of perceptual and motor activity.

A compilation of the schema-theoretic approach to behavior-based robotics is found in Beer et al. (1993). Three different kinds of schemas are defined:

- Motor schemas: the basic unit of motor behavior for a robot, encoding the way a robot responds to environmental stimuli.
- Perceptual schemas: embodying the action-oriented perception paradigm and delivering perceptual information to the motor schemas.
- Signal schemas: monitoring the internal state of the robot and modifying the robot's behavior accordingly.

4.7.2 Implementation

AuRA consists of five major subsystems:

- Perception: collection and filtering of all sensory data.
- Cartographic: world knowledge.
- Planning: a hierarchical planner and the motor schema controller.
- Motor: the interface to the specific robot to be controlled.
- Homeostatic: monitoring internal conditions for dynamic replanning.

The Cartographic subsystem consists of a priori models of the world stored in long-term memory, dynamically acquired world knowledge stored in short-term memory, and models of spatial uncertainty. The Planning subsystem consists of a deliberative component and a reactive component. The deliberative component is a hierarchical planner with three levels: the Mission Planner, the Spatial Reasoner and the Plan Sequencer. The Spatial Reasoner uses the Cartographic world knowledge to choose a number of path legs, which the Plan Sequencer translates into a number of motor schemas. The reactive component is the motor schema controller.

FIGURE 7. A simplified illustration of AuRA: a human interface, the Mission Planner, Spatial Reasoner and Plan Sequencer, the Schema Controller, and the Cartographic, Homeostatic, Perception (sensors) and Motor (actuators) subsystems.

4.7.3 Properties

AuRA is a hybrid architecture that incorporates features from the reactive, deliberative and hierarchical architectures. Modularity, robust performance and the ability to learn are mentioned as the advantages of this architecture and of schema-based systems in general.
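
Motor schemas are commonly realized as potential-field-like vector responses that are summed into a single command. The following sketch illustrates that general idea under that assumption (it is not AuRA's actual code; the gains, radius and schema names are invented):

```python
# Sketch of motor-schema command fusion: each schema returns a 2-D velocity
# vector and the controller simply sums them. Gains and the obstacle's
# sphere of influence are illustrative values.
import math

def move_to_goal(robot, goal, gain=1.0):
    """Motor schema: unit-magnitude attraction towards the goal."""
    dx, dy = goal[0] - robot[0], goal[1] - robot[1]
    dist = math.hypot(dx, dy) or 1.0
    return (gain * dx / dist, gain * dy / dist)

def avoid_obstacle(robot, obstacle, radius=1.0, gain=2.0):
    """Motor schema: repulsion that grows as the obstacle gets closer."""
    dx, dy = robot[0] - obstacle[0], robot[1] - obstacle[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0 or dist >= radius:
        return (0.0, 0.0)                # outside the sphere of influence
    strength = gain * (radius - dist) / radius
    return (strength * dx / dist, strength * dy / dist)

robot, goal, obstacle = (0.0, 0.0), (5.0, 0.0), (0.5, 0.2)
vectors = [move_to_goal(robot, goal), avoid_obstacle(robot, obstacle)]
command = (sum(v[0] for v in vectors), sum(v[1] for v in vectors))
print(command)   # the repulsion pushes the command off the direct path
```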

4.8 ATLANTIS [1]

4.8.1 Ideas

The architecture can be viewed as a concrete implementation of the plans-as-communication theory described by Agre & Chapman (1990), in which they contrast the plan-as-communication view with the plan-as-program view for an autonomous agent. In the plan-as-program view a plan is built from a number of primitives and operators and is executed as a procedure; they believe this is inadequate for an agent situated in an unpredictable world. The plan-as-communication view uses plans for an autonomous agent in a less mechanical and exact but more goal-directed manner, much more like humans use plans in an everyday sense.

4.8.2 Implementation

The ATLANTIS architecture was introduced by Gat (1992) at NASA in cooperation with Caltech. ATLANTIS incorporates three layers: a Lisp-based deliberator, a sequencer and a reactive controller.

- The Deliberative Layer responds to requests from the sequencing layer to perform deliberative computations.
- The Sequencing Layer has a higher-level view of robotic goals than the control layer. It tells the control layer below it when to start and stop actions, and handles failures of the control layer.
- The Control Layer directly reads sensors and sends reactive commands to the effectors based on the readings. The stimulus-response mapping is given to it by the sequencing layer.

4.8.3 Properties

ATLANTIS is a hybrid architecture; knowledge representation is distributed and heterogeneous. The structure is a representational abstraction hierarchy with three layers.

[1] A Three-Layer Architecture for Navigating Through Intricate Situations.

4.9 DAMN

4.9.1 Ideas

The Distributed Architecture for Mobile Navigation (DAMN) was developed by Rosenblatt & Thorpe (1995).

4.9.2 Implementation

The architecture consists of a centralized arbiter and a number of distributed action-producing modules. The action-producing modules are independent processes that can represent either reactive or deliberative types of behaviors. The arbiter receives votes for or against commands from each module and decides upon the course of action depending on the received votes; it performs command fusion (behavior fusion) to select the most appropriate action. The architecture has been successfully used for road following by an autonomously driving car.

4.9.3 Properties

DAMN is a hybrid architecture; knowledge representation is distributed and heterogeneous. The structure is distributed with a centralized arbiter.
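
A minimal sketch of DAMN's vote-based command fusion (the modules, the candidate command set and the weights are invented; the real arbiter votes over a much finer-grained command space):

```python
# Sketch of DAMN-style command fusion: each module votes in [-1, 1] for
# every candidate command; the arbiter picks the command with the highest
# weighted vote sum. Modules, commands and weights are toy examples.

COMMANDS = ["hard_left", "straight", "hard_right"]

def road_follower(votes):
    votes["straight"] += 1.0          # deliberative-style module: stay on road

def obstacle_avoider(votes):
    votes["straight"] -= 1.0          # reactive module: something is ahead
    votes["hard_left"] += 0.8

MODULES = [(road_follower, 1.0), (obstacle_avoider, 1.5)]   # (module, weight)

def arbiter():
    totals = {c: 0.0 for c in COMMANDS}
    for module, weight in MODULES:
        votes = {c: 0.0 for c in COMMANDS}
        module(votes)                                  # collect this module's votes
        for c in COMMANDS:
            totals[c] += weight * votes[c]
    return max(COMMANDS, key=lambda c: totals[c])      # command fusion

print(arbiter())   # hard_left: the weighted obstacle-avoidance votes win
```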

4.10 Beer's Artificial Insect Architecture

4.10.1 Ideas

Animals can be seen as autonomous agents, their nervous systems being the most complex and successful control systems we know. Beer et al. (1990) have studied animals and used an ethological approach in the so-called Artificial Insect Project. The robot is inspired by, and has a number of characteristics in common with, insects (e.g. the American cockroach).

4.10.2 Implementation

Beer et al. (1992) present a fully distributed neural network architecture for the locomotion of a hexapod robot. The architecture is built up of six controller circuits, one for each leg, according to Figure 8. Adjacent leg controllers mutually inhibit each other.

FIGURE 8. Leg control circuit: a command neuron and a pacemaker drive the backward-swing and forward-swing motor neurons and the foot, through excitatory and inhibitory connections, with input from backward and forward angle sensors.

Both in simulations and in an implemented robot, they have shown that this simple architecture is capable of generating a continuous range of statically stable gaits, as well as smooth transitions between these gaits.

4.10.3 Properties

No representation or learning has been included, and the architecture should be viewed as a purely reactive one; no explicit knowledge is used. The architecture, however, only addresses the sequencing of leg movements and thus has a much more limited scope than the other architectures discussed.

4.11 LAURON Architecture

4.11.1 Ideas

The LAURON machine is a six-legged walking machine developed at the University of Karlsruhe. Its control architecture uses both recurrent and feedforward neural networks as basic components. The architecture was proposed by Berns et al. (1995).

4.11.2 Implementation

The control system is hierarchically organized and consists of three layers:

- The Reactive Element, which evaluates sensory input and selects the best leg coordination.
- The Leg Coordination, which generates gait patterns.
- The Leg Controllers, one for each leg, which control the leg movements.

4.11.3 Properties

This is an ANN-based architecture that incorporates RL techniques, as described by Ilg & Berns (1995). The structure is hierarchical with three layers.

5.0 Discussion

The different types of architectures, and the specific architectures described earlier, will now be compared and discussed with respect to a number of important characteristics.

5.1 Modularity

One of the reasons for having a modular architecture is to facilitate the addition of new functionality. This gives a flexible robot that can easily be adapted to different applications and different environments. The use of uniform knowledge representation can facilitate the design of independent modules, as in e.g. Prodigy (section 4.3). There are different ways of obtaining modularity, as seen from the architectures described earlier: the Blackboard architecture (section 4.2) focuses on independent modules, while the Subsumption architecture (section 4.6) focuses on independent behaviors. A problem with reactive architectures with respect to modularity is their inherent unpredictability: even if it is possible to add new behaviors in a modular way, it can be impossible to predict the overall performance of the robot when the behaviors interact.

5.2 Robustness

An agent should be able to continue to function in unexpected situations, and its functionality should degrade smoothly in the face of events that are impossible for the agent to cope with. Reactive and hybrid architectures are generally supposed to be more robust than deliberative ones.

5.3 Fault tolerance

For many applications autonomous robots should be capable of performing tasks in environments that are dangerous or unsuitable for humans (e.g. space or undersea applications). In order for a robot to fulfil its goal in such environments despite component failures, it must be able to continue to function without any possibility of repair. The robot must therefore be able to monitor, detect and compensate for component failures. This important property of an autonomous agent has been addressed by e.g. Ferrell (1994). It is usually difficult to get fault-tolerant performance from a system with a centralized structure and uniform knowledge representation.

5.4 Distribution

Related to the fault tolerance of an architecture is its distribution. For a centralized architecture like THEO (section 4.4) the problem is that the central parts become a bottleneck that slows down the control system. For a distributed architecture, on the other hand, the problem is communication and coordination, which can be difficult. The performance of distributed systems can also often be harder to predict, while a centralized system is usually less robust and fault tolerant. A distributed architecture like DAMN (section 4.9) is tolerant to faults in the distributed modules; there is, however, usually a need for some form of centralized coordination or arbitration, which can be a bottleneck and will be critical for the system.

5.5 Reactivity

In order for a robot to operate in a real, unknown world, one of its most important properties is the ability to act in short and predictable time on any stimulus. This can be a problem for deliberative architectures, which need to do some type of modeling before acting on a stimulus; one of the main advantages of reactive architectures over deliberative ones is their processing speed. Processing speed will often be a problem also for hierarchical architectures like NASREM/RCS (section 4.5), because of the communication delays over many hierarchical levels.

5.6 Adaptability

Since a control architecture will need to be adapted and extended during the lifetime of the robot, making adaptive control architectures is an important research topic. A method of dynamic switching between alternative control loops has been described by Lueth et al. (1995). One of the most important aspects of the adaptability of an agent is its ability to adapt to changes in the environment by dynamic learning. In order for an architecture to qualify as a learning architecture according to Plaza et al. (1993), it must have flexible integration of problem solving and learning in the control system. They mention SOAR (section 4.1), THEO (section 4.4), Prodigy (section 4.3) and a few others as learning architectures.

5.7 Planning

One form of planning ability that is often used is to make the robot simulate itself and its environment in real time; this has been viewed as a simplified form of consciousness for the robot. According to Arbib (1995), one common conception of planning is that it involves:

1. A goal.
2. An internal model of the agent's external world.
3. A search mechanism that exercises the mental model to generate and evaluate hypothetical action sequences.

This seems to imply that no form of planning would be possible in a purely reactive architecture. It has, however, been argued by Agre & Chapman (1990) that it is possible for a reactive system to make plans, and by Maes (1990) that it can have goals.

5.8 Cooperation

One way of increasing modularity, distribution and robustness at the same time is to use cooperating agents: a number of agents cooperating to achieve some task that would be impossible for a single agent to achieve. Cooperation can be achieved by some form of communication between the agents.
Communication of goals, internal states or plans are some possibilities, but cooperation could also be an emergent feature of selfish agents acting in the same environment with no explicit communication. None of the architectures described here is specifically designed for cooperation, but AuRA, for example, has been used for research on cooperating robots, as described in Balch & Arkin (1994).

6.0 Conclusion

There are many ways of designing control systems for autonomous agents, and no single architecture can be said to be the best for every purpose. Just as we would not expect to find a general all-purpose floor plan for houses, we cannot expect to find an all-purpose software architecture for robots. Just as elephants are not made to play chess [Brooks (1990)], not all robots should have an architecture suited for abstract reasoning, nor should all be able to adapt to any environment. It seems important to decide on a purpose for an agent in order to evaluate the different properties of the agent, and from that derive an architecture that can fulfil the purpose. However, some architectural principles, as discussed in this paper, can be expected to be generally valid:

- Deliberative architectures are best suited for purposes where planning and long-term reasoning are essential.
- Reactive architectures are best suited for purposes where the environment is continuously changing and quick responses to outer stimuli are essential.

For an unknown and changing environment it seems likely that no single-technique architecture, neither purely reactive nor purely deliberative, will be able to cope with all the different problems. A successful architecture probably needs to be some type of hybrid, incorporating different aspects of immediate response to outer stimuli, some form of representation, planning and learning.

7.0 Acknowledgments

This research was done within the Centre for Autonomous Systems (CAS) at the Royal Institute of Technology (KTH). CAS is sponsored by the Strategic Research Foundation, and I want to thank the sponsors for making this research project possible. I also want to thank my supervisors and colleagues within CAS for useful ideas and comments on this paper.

8.0 References

Agre & Chapman (1990) P.E. Agre, D. Chapman, What are plans for?, Robotics and Autonomous Systems 6, 1990.
Albus (1991) J.S. Albus, Outline for a Theory of Intelligence, IEEE Transactions on Systems, Man and Cybernetics, May/June 1991.
Albus et al. (1989) J.S. Albus, H.G. McCain, R. Lumia, NASA/NBS Standard Reference Model for Telerobot Control System Architecture.
Arbib (1995) M. Arbib, The Handbook of Brain Theory and Neural Networks, MIT Press, 1995.
Arkin (1987) R.C. Arkin, Motor schema based navigation for a mobile robot: an approach to programming by behavior, IEEE Int. Conf. on Robotics and Automation, March 1987.
Arkin (1990) R.C. Arkin, Integrating Behavioral, Perceptual, and World Knowledge in Reactive Navigation, Robotics and Autonomous Systems 6, 1990.
Arkin & Balch (1994) R.C. Arkin, T. Balch, AuRA: Principles and Practice in Review.
Balch & Arkin (1994) T. Balch, R.C. Arkin, Communication in reactive multiagent robotic systems, Autonomous Robots 1, 1994.
Beer et al. (1990) R.D. Beer, H.J. Chiel, L.S. Sterling, A Biological Perspective on Autonomous Agent Design, Robotics and Autonomous Systems 6, 1990.
Beer et al. (1992) R.D. Beer, H.J. Chiel, R.D. Quinn, K.S. Espenschied, P. Larsson, A Distributed Neural Network Architecture for Hexapod Robot Locomotion, Neural Computation 4, 1992.
Beer et al. (1993) R.D. Beer, R.E. Ritzmann, T.M. McKenna, Biological Neural Networks in Invertebrate Neuroethology and Robotics, Boston Academic Press, 1993.
Berns et al. (1995) K. Berns, R. Dillman, S. Piekenbrock, Neural networks for the control of a six-legged walking machine, Robotics and Autonomous Systems 14, 1995.
Brooks (1986) R.A. Brooks, A Robust Layered Control System for a Mobile Robot, IEEE Journal of Robotics and Automation, March 1986.
Brooks (1990) R.A. Brooks, Elephants Don't Play Chess, Robotics and Autonomous Systems 6, 1990.
Brooks (1991a) R.A. Brooks, Intelligence without representation, Artificial Intelligence 47, 1991.
Brooks (1991b) R.A. Brooks, Intelligence without reason, IJCAI Proceedings, 1991.
Brooks (1991c) R.A. Brooks, New approaches to robotics, Science 253, 1991.
Carbonell et al. (1992) J. Carbonell et al., Prodigy 4.0: The Manual and Tutorial, Technical Report, CMU, 1992.
Causse & Christensen (1995) O. Causse, H.I. Christensen, Hierarchical Control Design based on Petri Net Modeling for an Autonomous Mobile Robot, Intelligent Autonomous Systems, IOS Press, 1995.
Chatila (1995) R. Chatila, Deliberation and reactivity in autonomous mobile robots, Robotics and Autonomous Systems 16, 1995.
Cliff et al. (1993) D. Cliff, I. Harvey, P. Husbands, Explorations in evolutionary robotics, Adaptive Behavior 2, 1993.
Ferrell (1994) C. Ferrell, Failure Recognition and Fault Tolerance of an Autonomous Robot, Adaptive Behavior 2, 1994.
Gat (1992) E. Gat, Integrating planning and reacting in a heterogeneous asynchronous architecture for controlling real-world mobile robots, AAAI-92 Proceedings, AAAI Press, 1992.
Gutowitz (1991) H.A. Gutowitz, Cellular Automata: Theory and Experiment, MIT Press, 1991.
Hayes-Roth (1985) B. Hayes-Roth, A blackboard architecture for control, Artificial Intelligence 26, 1985.
Henderson (1990) T.C. Henderson, Traditional and Non-Traditional Robotic Sensors, Springer Verlag, 1990.
Holland (1975) J.H. Holland, Adaptation in Natural and Artificial Systems, University of Michigan Press, 1975.
Holland (1985) J.H. Holland, Properties of the bucket brigade algorithm, International Conference on Genetic Algorithms and their Applications Proceedings, 1985.
Ilg & Berns (1995) W. Ilg, K. Berns, A learning architecture based on reinforcement learning for adaptive control of the walking machine LAURON, Robotics and Autonomous Systems 15, 1995.
Kaelbling (1990) L.P. Kaelbling, Learning in Embedded Systems, Ph.D. thesis, Stanford University, 1990.
Kodjabachian & Meyer (1995) J. Kodjabachian, J.A. Meyer, Evolution and development of control architectures in animats, Robotics and Autonomous Systems 16, 1995.
Kosecka & Bajcsy (1994) J. Kosecka, R. Bajcsy, Discrete Event Systems for autonomous mobile agents, Robotics and Autonomous Systems 12, 1994.
Laird et al. (1987) J. Laird, A. Newell, P. Rosenbloom, Soar: an architecture for general intelligence, Artificial Intelligence 33, 1987.
Laird et al. (1989) J.E. Laird, E.S. Yager, C.M. Tuck, M. Hucka, Learning in tele-autonomous systems using Soar, NASA Conf. on Space Robotics Proceedings, 1989.
Lueth et al. (1995) T. Lueth, T. Laengle, J. Heinzman, Dynamic Task Mapping for Real-time Controller of Distributed Cooperative Robot Systems, Distributed Cooperative Robot Systems, 1995.
Maes (1990) P. Maes, Situated Agents Can Have Goals, Robotics and Autonomous Systems 6, 1990.
Mataric (1991) M.J. Mataric, A Comparative Analysis of Reinforcement Learning Methods, MIT AI Lab report, 1991.
Mataric (1992) M.J. Mataric, Integration of Representation Into Goal-Driven Behavior-Based Robots, IEEE Transactions on Robotics and Automation, Vol. 8, No. 3, June 1992.
Minsky (1961) M.L. Minsky, Steps toward artificial intelligence, Proceedings of the Institute of Radio Engineers 49, 1961.
Minton et al. (1989) S. Minton, J.G. Carbonell, C.A. Knoblock, D.R. Kuokka, O. Etzioni, Y. Gil, Explanation-based Learning: A problem solving perspective, Artificial Intelligence 40, 1989.