Modular Q-learning based multi-agent cooperation for robot soccer


Robotics and Autonomous Systems 35 (2001)

Modular Q-learning based multi-agent cooperation for robot soccer

Kui-Hong Park, Yong-Jae Kim, Jong-Hwan Kim

Department of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology (KAIST), Kusong-dong, Yusong-gu, Taejon-shi, South Korea

Received 8 August 2000; received in revised form 12 February 2001
Communicated by F.C.A. Groen

Abstract

In a multi-agent system, action selection is important for the cooperation and coordination among agents. As the environment is dynamic and complex, modular Q-learning, one of the reinforcement learning schemes, is employed to assign a proper action to each agent in the multi-agent system. The architecture of modular Q-learning consists of learning modules and a mediator module. The mediator module selects a proper action for the agent based on the Q-values obtained from the learning modules. To obtain better performance, the mediator module also considers state information, along with the Q-value, in the action selection process. A uni-vector field is used for robot navigation. In the robot soccer environment, the effectiveness and applicability of modular Q-learning and the uni-vector field method are verified by real experiments using five micro-robots. © 2001 Elsevier Science B.V. All rights reserved.

Keywords: Multi-agent system; Robot soccer system; Reinforcement learning; Modular Q-learning; Action selection

1. Introduction

Multi-agent systems are expected to perform tasks that are complex and difficult, which requires cooperation and coordination among the agents [3,9]. Developing a multi-agent system amounts to finding a method for implementing an intelligent system composed of multiple agents, each with independent motion control, that cooperate with one another.
* Corresponding author. E-mail addresses: khpark@vivaldi.kaist.ac.kr (K.-H. Park), johkim@vivaldi.kaist.ac.kr (J.-H. Kim).

Multi-agent systems are more flexible and fault tolerant, as several simple robot agents are easier to handle and cheaper to build than a single powerful robot that can carry out different tasks [7]. From the standpoint of multi-agent systems, robot soccer is a good example of a real-world problem that can be moderately well modeled. The soccer game differs from other multi-agent systems in that the robots of one team have to cooperate while competing with the opponent team. The cooperative and competitive strategies used play a major role in a robot soccer system [10]. The related research issues are quite wide, and are associated with hardware configuration, software implementation, agent/robot communication, sensor fusion and learning, to mention a few. The action of the robot is usually selected by considering some conditions in the robot soccer

environment [12]. However, it is not possible to describe all the situations of a robot soccer game by condition statements. Moreover, as the environment under consideration is dynamic and complex, reinforcement learning should be employed for the selection of the proper action. In reality, it is very difficult to obtain a model of the robot soccer game; with reinforcement learning, the agent learns its own actions instead [1,2]. Reinforcement learning is the problem faced by an agent that learns its behavior through trial-and-error interactions with a dynamic environment [5,6]. The agent only knows the possible states and actions, not the transition probabilities or the reward structure [11]. Among the reinforcement learning methods, Q-learning can be used here as it is applicable where no model of the environment is available [8,16]. In this paper, modular Q-learning is applied to improve the performance of the team playing in the NaroSot (Nano-Robot World Cup Soccer Tournament) category of FIRA (Federation of International Robot Soccer Association), where five robots of size 4 cm × 4 cm × 5.5 cm form a team. Modular Q-learning is one of the reinforcement learning schemes, in which the mediator module selects the proper action of a robot based on the Q-value obtained from each learning module. When selecting the proper action, state information, such as the distance between the ball and the robot and the angle between the robot heading and the desired heading, is also considered in the mediator module along with the Q-value to improve the learning performance. The concept of coupled agents is proposed to resolve conflicts between robots when the ball is located in a boundary region. A uni-vector field method is used for the navigation of the robot.
In the robot soccer environment, the effectiveness and applicability of modular Q-learning and the uni-vector field method are verified by real experiments using the five micro-robots of team Y2K2, runner-up at the FIRA Robot World Cup Brazil '99. Section 2 describes the robot soccer system, the structure of the robot, the uni-vector field for navigation and basic actions, and robot soccer strategies. This is followed by modular Q-learning and its implementation for robot soccer in Section 3. The experimental results are presented in Section 4 and concluding remarks are given in Section 5.

2. Robot soccer system

2.1. NaroSot robot soccer system

The micro-robot soccer system, which comprises robots, an overhead vision system and a host computer, is used as a practical test bed for developing multi-agent systems and multiple robot systems. The complexity of the robot soccer system comes from the cooperation with the home team robots, the competition with the opponent team robots, and the fast and precise control of each robot while tracking the ball, which is the passive constituent of the dynamic environment. We now describe the NaroSot (Nano-Robot World Cup Soccer Tournament) system, one of the categories of the FIRA games. In the NaroSot category, each team has five robots of size 4 cm × 4 cm × 5.5 cm. The pitch is 150 cm × 90 cm in size and a ping-pong ball is used. Fig. 1 shows NaroSot robots and a ping-pong ball in the playground. Due to the size limitation, encoders are not used and only vision information is used as feedback; hence, precise and fast robot control is difficult. The host computer receives the vision signals and uses them to compute the strategy routine and the command velocities, which are then sent to the robots. The strategy routine selects a proper action for each robot considering the game situation.
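The vision-strategy-command pipeline just described can be sketched as a single sampling cycle. `get_vision_data`, `strategy`, and the returned wheel commands are hypothetical stand-ins, since the paper gives no API:

```python
from dataclasses import dataclass

@dataclass
class Posture:
    x: float      # cm, global coordinates on the 150 cm x 90 cm pitch
    y: float      # cm
    theta: float  # rad, heading angle

def get_vision_data():
    """Stand-in for the overhead vision system: ball position and the
    five robot postures (normally extracted from camera frames)."""
    ball = (75.0, 45.0)
    robots = [Posture(10.0 + 15.0 * i, 45.0, 0.0) for i in range(5)]
    return ball, robots

def strategy(ball, robots):
    """Stand-in strategy routine: assign each robot a wheel-velocity
    command (V_L, V_R). The real routine runs the role allocation and
    action selection described in Sections 3 and 4."""
    return [(5.0, 5.0) for _ in robots]

def control_cycle():
    """One sampling period of the host computer: vision in, one
    wheel-velocity command per robot out (sent over the RF link)."""
    ball, robots = get_vision_data()
    return strategy(ball, robots)

commands = control_cycle()  # one (V_L, V_R) pair per robot
```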
Fig. 1. The NaroSot robots.

The robot receives the velocity data sent from the host computer through the RF (radio frequency) link and controls its motor velocities using the command data. The developed robots have two centrally aligned wheels, which are easy to control. The width D between the two wheels of the robot is 3.5 cm and the radius R of each wheel is 1.0 cm. Each robot is composed

of four parts: a micro-controller part, an RF communication module, two DC motors with motor driving chips, and a power supply unit. The micro-controller PIC16C73A is used for processing the command data and for computing the motor control using two PWM signals. The RF module is used for communication between the host computer and the robot. The motors have a 6:1 gear ratio and no encoders. Rechargeable 9.6 V cells are used as the power supply, and a regulator supplies the logic power.

Fig. 2. Robot modeling.

Two-wheeled mobile robots are considered under the assumptions of non-slipping and pure rolling [4]. The kinematics can be derived using Fig. 2, where $X$, $Y$ are the global coordinates. The posture $P$ and position $p_c$ of the robot are defined as

$$P = \begin{bmatrix} x_c \\ y_c \\ \theta_c \end{bmatrix}, \qquad p_c = \begin{bmatrix} x_c \\ y_c \end{bmatrix}, \tag{1}$$

where $(x_c, y_c)$ is the position of the robot center and $\theta_c$ the heading angle of the robot with respect to the global coordinates. The velocity vector $S$ is defined as

$$S = \begin{bmatrix} \upsilon \\ \omega \end{bmatrix} = \begin{bmatrix} \dfrac{V_R + V_L}{2} \\[4pt] \dfrac{V_R - V_L}{D} \end{bmatrix} = \begin{bmatrix} \tfrac{1}{2} & \tfrac{1}{2} \\ -\tfrac{1}{D} & \tfrac{1}{D} \end{bmatrix} \begin{bmatrix} V_L \\ V_R \end{bmatrix}, \tag{2}$$

where $\upsilon$ is the translational velocity of the robot center, $\omega$ the rotational velocity about the robot center, $V_L$ the left wheel velocity and $V_R$ the right wheel velocity. The translational and rotational velocities are thus obtained from the two wheel velocities. The velocity vector $S$ and the posture vector $P$ are related through the robot kinematics as

$$\dot P = \begin{bmatrix} \dot x_c \\ \dot y_c \\ \dot\theta_c \end{bmatrix} = \begin{bmatrix} \cos\theta_c & 0 \\ \sin\theta_c & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \upsilon \\ \omega \end{bmatrix} = \begin{bmatrix} \cos\theta_c & 0 \\ \sin\theta_c & 0 \\ 0 & 1 \end{bmatrix} S. \tag{3}$$

2.2. Uni-vector field navigation

Fig. 3 shows the proposed uni-vector field, where each tiny circle with a short dash attached denotes a robot position, the attached straight line representing its heading direction [13]. A slightly bigger version of the same symbol is used in the figure to represent the initial position of each of the five robots.
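The kinematic relations (1)-(3) above can be checked with a short Euler-integration sketch; the 18 ms step is the sampling period reported in Section 4, and the numeric wheel speeds are purely illustrative:

```python
import math

D = 3.5  # cm, distance between the two wheels (Section 2.1)

def wheel_to_body(v_l, v_r):
    """Eq. (2): left/right wheel velocities -> translational velocity v
    of the robot center and rotational velocity w about it."""
    v = (v_r + v_l) / 2.0
    w = (v_r - v_l) / D
    return v, w

def integrate_pose(x, y, theta, v, w, dt):
    """Eq. (3) advanced by one Euler step:
    x' = v cos(theta), y' = v sin(theta), theta' = w."""
    return (x + v * math.cos(theta) * dt,
            y + v * math.sin(theta) * dt,
            theta + w * dt)

# Equal wheel speeds give pure translation along the current heading.
v, w = wheel_to_body(5.0, 5.0)
x, y, th = integrate_pose(0.0, 0.0, 0.0, v, w, 0.018)  # one 18 ms period
```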
A vector field at position $p = (x, y)$ is denoted $F(p)$ or $F(x, y)$. It is assumed that the magnitude of the vector field is unity and is the same at all points [14]. The vector field at robot position $p$ is generated by

$$\angle F(p) = \angle\overrightarrow{pg} - n\phi \quad \text{with} \quad \phi = \angle\overrightarrow{pr} - \angle\overrightarrow{pg}, \tag{4}$$

where $n$ is a positive constant. The larger $n$ is, the smaller $\angle F(p)$ is at the same robot position. Thus, if $n$ increases, the uni-vector field spreads out over a larger area, making the path the robot traverses in reaching its goal longer. The shape of the field and the turning motion of the robots change according to the parameter $n$ and the length of the line $\overline{gr}$. The proposed uni-vector field method is based on (4), through which the vector field at all points can be obtained. In Fig. 3, $g$ represents the target position of the robots. A dummy point $r$ is used for deriving the vector field and is selected heuristically close to the goal point $g$; in practical applications, the point $g$ will be the position of the ball. The following relationships are used to reduce the error in angle between the robot and the field vector:

$$\omega = K_P \theta_e + K_D \dot\theta_e, \qquad \theta_e = \angle F(p) - \theta_c, \qquad \dot\theta_e = \frac{d\theta_e}{dt}, \tag{5}$$

where $F(p)$ is the vector field at position $p$ with unit magnitude, $\theta_e$ the error in angle between the robot

Fig. 3. Uni-vector field method.

heading and the field vector direction, $\dot\theta_e$ the derivative of $\theta_e$, $K_P$ the proportional feedback gain, and $K_D$ the derivative feedback gain. The translational velocity $\upsilon$ is constant. If $\upsilon = 0$, the robot's heading angle turns towards the direction of $F(p)$ without any change in position. As indicated by (5), the robot motion is controlled through its right and left wheel velocities, which are functions of time:

$$V_R = V_C + K_P \theta_e + K_D \dot\theta_e, \qquad V_L = V_C - K_P \theta_e - K_D \dot\theta_e, \tag{6}$$

where $V_C$ is the constant robot center velocity. The robot's vector field is oriented towards the target position, and the associated robot motion is as shown in Fig. 3.

3. Implementation of modular Q-learning

3.1. Modular Q-learning

Q-learning is a reinforcement learning algorithm that does not need a model of the environment and can be used on-line. Q-learning algorithms store the expected reinforcement value associated with each state-action pair, usually in a look-up table. However, in applying Q-learning to a multi-agent system there is a serious difficulty: the dimension of the state space grows exponentially with the number of agents. For example, consider two agents engaged in a joint task, where a joint task implies two agents working together to find an optimal way to kick the ball. If a single agent needs $10^3$ states for learning, then in the joint task the total number of states grows to $10^6$. As actions are needed in every state, multi-agent learning also needs more memory space. Such an application of Q-learning to multi-agent learning problems thus results in an explosion of the state and memory space. To overcome this problem, modular Q-learning is employed.
Fig. 4 shows the architecture of modular Q-learning [15]. The architecture consists of learning modules, whose number equals the number of agents involved in the task, and a mediator module. Each agent carries out Q-learning in the environment through its learning module. Within a learning module, learning concentrates on a single agent; the learning of the other agents is not considered. To achieve the global goal, a mediator module is needed to arbitrate the results of the learning modules. The mediator module makes the final decision and selects the most suitable action based on the Q-value

Fig. 4. Modular Q-learning architecture.

received from each learning module. In [15], the mediator module makes this selection by considering the highest Q-value received from the learning modules; this selection method is called the greatest mass merging strategy. However, in a real experimental environment, convergence to the optimal Q-value within a finite number of iterations is often not possible. It is therefore desirable to select the most suitable action by considering an appropriate function calculated from the Q-value and the state information. In this paper, the following function is used to make the final decision in the mediator module:

$$\arg\max_a f(Q_i(s_i, a), \theta_i, d_i), \tag{7}$$

where $a$ is the action of the agent and $i$ the index of the learning module. The Q-value $Q_i$ is obtained from the learning module, and $\theta_i$ is calculated as $(90 - \theta_e)$. If Robot 1 and Robot 2 form a coupled agent, then with $d_1$ the distance between Robot 1 and the ball and $d_2$ the distance between Robot 2 and the ball, $d_i$ for Robot 1 is computed as $d_i = d_2 - d_1$. Note that $\theta_i$ and $d_i$ are considered in the mediator module when selecting the final action.

In a robot soccer game, robots play the roles of attackers, defenders and goalie. In the robot soccer system implemented in the NaroSot category, there are two attackers, two defenders and a goalie. All attackers and defenders have only two actions: shoot, or follow the uni-vector field. In the uni-vector field, the target point is the position of the ball. The action selection layer, as a coordinator, selects the shoot action when the robot is in a good position to do so. Under normal conditions, robots follow the uni-vector field. A robot following the uni-vector field selects the shoot action when its longitudinal position is within the boundary of the ball.
In the shoot action, the target to which the robot kicks the ball is the center of the opponent goal area. The velocity of a robot in the shoot action is faster than its velocity while following the uni-vector field. The goalie has its own actions within the goal area for defending the goal.

The role allocation layer, at a higher level, selects the role of a robot according to the situation. The implemented robot soccer system uses a relatively fixed role allocation scheme with a (1 goalie, 2 defenders, 2 attackers) formation strategy. The term 'relatively fixed' is used because the zones of the robots, though fixed in most cases, are changed for a short interval in some specific situations. In the zone defense scheme, a conflict can arise among the home robots near the boundary regions; this is classified as a non-blocked situation. A blocked situation, on the other hand, is when a home robot is blocked by opponent robots. With respect to the ball position, there are three cases in the non-blocked situation. Fig. 5 shows the three boundary regions corresponding to the three cases in the non-blocked situation.

3.2. Implementation

Modular Q-learning is employed in the robot soccer system to improve cooperation among the robots of the team so as to carry out zone defense strategies.

Fig. 5. The three learning regions in the non-blocked situation.
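The two-action scheme for attackers and defenders can be sketched as follows. The paper states the shoot condition only qualitatively ("longitudinal position within the boundary of the ball"), so the alignment test and its tolerance here are assumptions:

```python
def select_action(robot_y, ball_y, ball_width):
    """Sketch of the action-selection layer for attackers and defenders:
    follow the uni-vector field toward the ball by default, and switch to
    the (faster) shoot action when the robot is longitudinally lined up
    with the ball. The one-ball-width tolerance is an assumption."""
    aligned = abs(robot_y - ball_y) < ball_width
    return "shoot" if aligned else "follow_field"

print(select_action(45.2, 45.0, 4.0))  # -> shoot
print(select_action(60.0, 45.0, 4.0))  # -> follow_field
```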

Fig. 6. The coupled agent.

To apply modular Q-learning to the robot soccer system, the concept of coupled agents is introduced, as shown in Fig. 6. When the ball is within the boundary region of two robots, both robots will be in a position to kick the ball, which may lead to a collision between them. To solve this problem, a coupled agent composed of these two robots is formed. For example, if the ball is located in Region 1, Attacker 1 and Defender 1 are considered as a coupled agent. The mediator module assigns an action to each robot of a coupled agent based on (7); the action is either to kick the ball or to maintain the current position.

For learning, the initial Q-values are randomly drawn from the range [0, 0.02]. The learning rate α is set to 0.9 for fast convergence during the learning process. The discount factor γ is set to 0.3, a relatively low value, to reduce the possible noise effect, which arises because γ multiplies the maximum Q-value of the next state. The reason for choosing a low discount factor is that in real experiments the robot can kick the ball unexpectedly; in such cases the Q-value is updated as if a reward were earned, which is not desirable for precise learning. As compensation, γ is therefore set to a low value.

3.2.1. Non-blocked situation

First, consider Region 1, where a state in the learning module of an individual agent consists of five components:

1. The robot location: two levels (either Area 1 or Area 3 is occupied by the robot).
2. The difference in distance d: four levels.
3. The angle error between the robot heading direction and the ball: three levels.
4. A binary flag: 1 if $R_x > B_x$ and $(B_y - \tfrac{4}{3}B_w) < R_y < (B_y + \tfrac{4}{3}B_w)$, 0 otherwise.
5. A binary flag: 1 if the other robot of the coupled agent is to kick the ball, 0 otherwise.
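The five components above multiply out to 2 × 4 × 3 × 2 × 2 = 96 states per learning module. A minimal sketch of such a module, using the learning parameters quoted above; the update rule is the standard tabular Q-learning form, which the paper does not spell out, and the example state is arbitrary:

```python
import random

ALPHA, GAMMA = 0.9, 0.3   # learning rate and discount factor (Section 3.2)
LEVELS = (2, 4, 3, 2, 2)  # levels of the five state components
N_ACTIONS = 2             # kick the ball, or keep the current position

def encode_state(levels):
    """Mixed-radix encoding of the five component levels into one index."""
    idx = 0
    for level, base in zip(levels, LEVELS):
        assert 0 <= level < base
        idx = idx * base + level
    return idx

n_states = 1
for base in LEVELS:
    n_states *= base  # 96 states, as stated for Region 1

# Initial Q-values drawn uniformly from [0, 0.02], as in the paper.
Q = [[random.uniform(0.0, 0.02) for _ in range(N_ACTIONS)]
     for _ in range(n_states)]

def q_update(s, a, r, s_next):
    """Standard tabular Q-learning update, run inside each learning module."""
    Q[s][a] += ALPHA * (r + GAMMA * max(Q[s_next]) - Q[s][a])

s = encode_state((1, 2, 0, 1, 0))  # an arbitrary example state
q_update(s, 0, 1.0, encode_state((1, 2, 1, 1, 0)))
```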
With $d_1$ the distance between Robot 1 and the ball and $d_3$ the distance between Robot 3 and the ball, $d$ is computed as the difference $d = d_1 - d_3$; thus $d$ is in fact the negative of $d_i$ in Eq. (7). In the learning module of Robot 1, $R_x$ ($R_y$) is the X (Y) coordinate of the robot, $B_x$ ($B_y$) is the X (Y) coordinate of the ball, and $B_w$ is the width of the ball. In the learning module of Robot 3, $d = d_3 - d_1$. Fig. 7(a)-(e) shows each component of the state for learning in the non-blocked situation.

There are 96 states in each of the three cases of the non-blocked situation. For example, in Region 1, Robot 1 and Robot 3 form the coupled agent and each robot has a learning module with 96 states. The mediator module selects the final action of each robot of the coupled agent. In Region 2, the first component of the state is either Area 1 or Area 2, and the second component is $d = d_1 - d_2$ from the viewpoint of Robot 1 and $d = d_2 - d_1$ from the viewpoint of Robot 2, where $d_1$ is the distance between Robot 1 and the ball and $d_2$ the distance between Robot 2 and the ball. The states of Region 3 are similar to those of Region 1, as the two regions are symmetric.

Table 1 lists the actions which the coupled agent can select in Regions 1 and 2. For example, if Action 1 is selected in Region 1, Robot 1 will be Attacker 1 and Robot 3 will be Defender 1. When the ball moves to Region 3, Robot 2 becomes Attacker 2 and Robot 4 assumes defense, becoming Defender 2. The reward is assigned as

$$r = \frac{a}{t_1 + t_{\mathrm{const1}}} + \frac{b}{t_2 + t_{\mathrm{const2}}}, \tag{8}$$

Fig. 7. State components in Region 1.

Table 1
Actions of the coupled agent in the non-blocked situation

Region 1                          Region 2
Action  Robot 1     Robot 3       Action  Robot 1     Robot 2
1       Attacker 1  Defender 1    1       Attacker 1  Attacker 2
2       Attacker 1  Kicker        2       Attacker 1  Kicker
3       Kicker      Defender 1    3       Kicker      Attacker 2

where $t_1$ is the time taken by the kicking robot to kick the ball and $t_2$ the time taken by the ball to reach the boundary of any other robot situated outside the learning region; $a$, $b$, $t_{\mathrm{const1}}$ and $t_{\mathrm{const2}}$ are constants. $t_{\mathrm{const1}}$ and $t_{\mathrm{const2}}$ prevent the reward from growing without bound as $t_1$ and $t_2$ approach zero.

3.2.2. Blocked situation

The states and the reward in the blocked situation are similar to those of the non-blocked situation. In the zone defense scheme, there are two cases in which the attacker cannot execute its own role because it is blocked. Fig. 8 shows the two cases of the blocked situation. Consider Region 1, where a state in the learning module consists of four components:

1. The robot location: two levels (either Area 2 or Area 3).
2. A binary flag: 1 if $R_y < B_y$, 0 otherwise.
3. The blocking flag, the difference-in-distance level (four levels) and the angle-error level (three levels): 13 levels.
4. A binary flag: 1 if the other robot of the coupled agent is to kick the ball, 0 otherwise.

Here $R_y$ is the Y coordinate of the position of the blocked robot and $B_y$ the Y coordinate of the ball.

Fig. 8. Two cases of the blocked situation.
Fig. 9. State components in the blocked situation.
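The reward (8) and its blocked-situation counterpart, Eq. (9) of Section 3.2.2, can be written directly; the default constants are the values reported with the experiments in Section 4:

```python
def reward_non_blocked(t1, t2, a=12000.0, b=6.0, t_const1=18.0, t_const2=3.0):
    """Eq. (8): the reward is larger the faster the kicker reaches the ball
    (t1) and the faster the kicked ball reaches another robot's boundary
    (t2); t_const1 and t_const2 keep the reward finite as t1, t2 -> 0."""
    return a / (t1 + t_const1) + b / (t2 + t_const2)

def reward_blocked(t, a=20000.0, t_const1=18.0):
    """Eq. (9): single-term variant used in the blocked situation."""
    return a / (t + t_const1)

# A faster kick earns a larger reward in both situations.
assert reward_non_blocked(100.0, 10.0) > reward_non_blocked(500.0, 10.0)
assert reward_blocked(100.0) > reward_blocked(500.0)
```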

Table 2
Actions of the coupled agent in the blocked situation

Region 1                          Region 2
Action  Robot 2     Robot 3       Action  Robot 1     Robot 4
1       Attacker 2  Defender 1    1       Attacker 1  Defender 2
2       Attacker 2  Kicker        2       Attacker 1  Kicker
3       Kicker      Defender 1    3       Kicker      Defender 2

In Region 2, the first component of the state is either Area 1 or Area 4. Fig. 9(a)-(d) shows each component of the state for learning in the blocked situation. There are 104 states in each of the two cases of the blocked situation. Table 2 shows the actions which the coupled agent can take in Regions 1 and 2; in Region 2, the actions are similar to those of the coupled agent in Region 1. The reward is assigned as

$$r = \frac{a}{t + t_{\mathrm{const1}}}, \tag{9}$$

where $t$ is the time taken by the kicking robot to kick the ball, and $a$ and $t_{\mathrm{const1}}$ are constants.

4. Experimental results

The mediator module comes into play when both robots of a coupled agent tend to kick the ball. The action of the coupled agent is selected by considering the Q-values obtained from the learning modules and the state information; the angle error and the distance to the other robot of the coupled agent are used as the state information. In the selection equation (7) of the mediator module, $f(Q_i(s_i, a), \theta_i, d_i)$ is given by

$$f(Q_i(s_i, a), \theta_i, d_i) = \eta_{Q_i} Q_i(s_i, a) + \eta_{\theta_i} \theta_i + \eta_{d_i} d_i, \tag{10}$$

where $\eta_{Q_i}$, $\eta_{\theta_i}$ and $\eta_{d_i}$ are constant coefficients; in the experiments the values 0.5, 0.3 and 0.2, respectively, were used. The mediator module selects the final action of each robot of the coupled agent based on this modified Q-value and the state information. The sampling time used in the real robot soccer system is 18 ms.

4.1. Non-blocked situation

In Eq. (8), the reward constants $a = 12{,}000$, $b = 6$, $t_{\mathrm{const1}} = 18$ and $t_{\mathrm{const2}} = 3$ were used, where $a$ is chosen for the time interval the kicking robot takes to kick the ball, which is in the millisecond range.
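The mediator rule of Eqs. (7) and (10) with the experimental coefficients can be sketched as below. The paper does not specify how $\theta_i$ and $d_i$ are scaled relative to the Q-value, so the magnitudes in the example are purely illustrative:

```python
def mediator_value(q, theta, d, eta_q=0.5, eta_theta=0.3, eta_d=0.2):
    """Eq. (10): modified Q-value combining the learning module's Q-value
    with the state information (angle term theta_i, distance term d_i),
    using the coefficients 0.5, 0.3, 0.2 from the experiments."""
    return eta_q * q + eta_theta * theta + eta_d * d

def mediator_select(candidates):
    """Eq. (7): over the coupled agent's candidate (action, Q, theta_i, d_i)
    tuples, pick the action maximizing f."""
    return max(candidates, key=lambda c: mediator_value(c[1], c[2], c[3]))[0]

# Illustrative numbers only: the second candidate wins on the combined score.
best = mediator_select([("kick", 0.6, 10.0, -2.0), ("hold", 0.5, 20.0, 3.0)])
```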
$b$ was obtained by experiment and $t_{\mathrm{const2}}$ was determined heuristically. In Region 1 of the non-blocked situation, it took 280 trials to obtain a Q-value regarded as suboptimal; in Region 2, it took 210 trials. The Q-values of the third region were the same as those of the first region.

Fig. 10(a) shows the trajectories of the two robots when the ball is in Region 1. After the learning phase, Robot 3 took up the task of kicking the ball; it took 2.790 s (155 steps) to kick the ball. Had Robot 1 assumed this task, it would have taken 3.006 s (167 steps). Instead, Robot 1 took up position in the defense zone left vacant by Robot 3. In Fig. 10(a), the initial positions of the ball, Robot 1 and Robot 3 were (46.50 cm, cm), (86.05 cm, cm) and (22.21 cm, cm), respectively.

Fig. 10(b) shows the trajectories of the two robots when the ball is in Region 2. Robot 1 kicked the ball after learning; it took 1.692 s (94 steps) to kick the ball. If the other robot of the coupled agent had kicked the ball, it would have taken 1.854 s (103 steps). The initial positions of the ball, Robot 1 and Robot 2 were (97.56 cm, cm), (80.99 cm, cm) and (72.02 cm, cm), respectively.

4.2. Blocked situation

For the reward in the blocked situation, $a = 20{,}000$ and $t_{\mathrm{const1}} = 18$ were used. These values were determined similarly to the non-blocked situation.

Fig. 10. Non-blocked situation: (a) Robot 3 kicked the ball after the learning phase in Region 1; (b) Robot 1 kicked the ball after the learning phase in Region 2.

Three hundred trials were needed for convergence in the first region of the blocked situation. The Q-values of the second region were the same as those of the first region because the two regions are symmetric. Fig. 11 shows the trajectories of the two robots in the blocked situation in Region 1. Robot 2 assisted the blocked Robot 1, and it took s to kick the ball (had Robot 3 been the kicker, it would have taken s). The initial positions of the ball, Robot 2 and Robot 3 were (91.83 cm, cm), (70.78 cm, cm) and (21.74 cm, cm), respectively.

Fig. 11. Robot 2 assists Robot 1 after learning in the blocked situation.

4.3. Effect of the modified Q-value in the mediator module

In the above results, the mediator module did not use any of the state information to determine the action of the coupled agent. The effectiveness of the modified Q-value, with which the mediator module makes the final selection for the coupled agent, is brought out in the following real experiment. Fig. 12(a) shows the trajectories of the two robots in the non-blocked situation in Region 2. In this case the mediator module arbitrates the actions of the two robots using only the Q-values. As shown in Fig. 12(a), Robot 2 kicked the ball; it took 100 steps (1.800 s) to do so. Among the Q-values of the learning modules, the Q-value of the kick action was larger than that of the other actions in each robot, and the mediator module selects the final action with the larger Q-value. The initial positions of the ball, Robot 1 and Robot 2 were (95.26 cm, cm), (76.62 cm, cm) and (74.01 cm, cm), respectively.

Fig. 12(b) shows the same situation as Fig. 12(a), but now the mediator module considers both the Q-value received from each learning module and the state information described in (7). In this situation,

Fig. 12. Effect of the modified Q-value: (a) Robot 2 kicked the ball after learning, with only the Q-value; (b) Robot 1 kicked the ball after learning, with the Q-value and state information.

Robot 1 kicked the ball; it took 83 steps (1.494 s) for this action. Note that the time for Robot 1 to kick the ball (Fig. 12(b)) is shorter than the time taken by Robot 2 to do so (Fig. 12(a)).

4.4. Boundary region of four robots

As shown in Fig. 13, in the non-blocked situation it is possible for the ball to lie in the common region of the three regions. In this case the coupled agent includes four robots, and the problem is how to select the right robot for the kicking action. Each of the four robots has two Q-values, obtained in the three regions of the non-blocked situation: $Q_{11}$ and $Q_{13}$ are determined in Region 1, $Q_{21}$ and $Q_{22}$ in Region 2, and $Q_{32}$, $Q_{34}$, $Q_{43}$ and $Q_{44}$ in Regions 3 and 4 (Fig. 14). The Q-values of Regions 3 and 4 are the same as those of Regions 1 and 2. In Regions 1-3 of the non-blocked situation, only two robots need be chosen as the coupled agent for learning. In the situation now considered, four robots form a coupled agent and kicking has to be assigned to one of them. The mediator module arbitrates among the robots whose Q-value for the kick action is larger than their Q-value for retaining position. In such a situation, the average of the Q-values of each robot in the two regions considered becomes the deciding factor in the mediator module. Together with

Fig. 13. The ball is located in the boundary region.
Fig. 14. Q-values of the four robots.

Fig. 15. After the learning phase in the boundary region: (a) Robot 1 kicked the ball, with only the Q-value; (b) Robot 3 kicked the ball, with the Q-value and state information.

this average value, the state information in Eq. (7) is also considered in assigning the kick action. Note that in this situation, $d_i$ in Eq. (7) is computed as the sum of the distances between the $i$th robot and each of the remaining robots of the coupled agent.

Fig. 15(a) shows the trajectories of the four robots when the ball was in the boundary region of the four robots. Considering only the Q-value information received from the learning modules, the kick action was assigned to Robot 1, and it took 3.924 s (218 steps) to kick the ball. The initial positions of the ball, Robot 1, Robot 2, Robot 3 and Robot 4 were (47.86 cm, cm), (83.50 cm, cm), (88.35 cm, cm), (24.62 cm, cm) and (23.92 cm, cm), respectively. The 1, 2, 3, 4 and B in Fig. 15 denote the initial positions of Robots 1, 2, 3, 4 and the ball, respectively; these symbols are used in all other figures. In this initial position, the Q-value of Robot 1 as kicker is greater than its Q-value as

Fig. 16. Other two cases in the boundary region: (a) Attacker 2 kicked the ball; (b) Defender 2 kicked the ball.

Attacker 1. For Robot 3 also, the Q-value for the kick action is greater than its Q-value as Defender 1. For Robot 2, the Q-value for the kick action is smaller than its Q-value as Attacker 2, and for Robot 4, the Q-value as kicker is smaller than its Q-value as Defender 2. Between Robot 1 and Robot 3, Robot 1 has the higher Q-value as kicker. However, the kick assignment is now decided by the mediator module using the modified Q-value, which takes the state information into account; Robot 3, which has the greater modified Q-value, qualifies as the kicker. Fig. 15(b) shows the trajectories of the four robots when Robot 3 kicked the ball; it took 2.214 s (123 steps) to kick the ball. The times that Robot 2 and Robot 4 would have needed to kick the ball were also investigated; these situations are depicted in Fig. 16(a) and (b), respectively. It took 2.574 s (143 steps) for Robot 2 and 2.376 s (132 steps) for Robot 4 to kick the ball. Thus, the experiment confirmed that the robot assigned the kick action by the modified Q-value method takes the least time.

5. Conclusions

This paper proposed an action selection mechanism for the robots in a robot soccer game. The action selection problem of the zone defense scheme is divided into two situations: the non-blocked case and the case of a robot blocked by its opponent. The non-blocked case is a situation of conflict among the home robots near the boundary regions; the blocked case corresponds to a home robot being blocked by opponent robots. The modular Q-learning architecture was used to solve the action selection problem, specifically selecting the robot that needs the least time to kick the ball and assigning the kick to it. The concept of the coupled agent was used to resolve conflicts in action selection among robots.
A uni-vector field method was employed for the navigation of the robots. The mediator module selects the final action of the coupled agent by considering the Q-values received from the learning modules and the state information. The effectiveness of the scheme was demonstrated through real robot soccer experiments.

References

[1] R.S. Sutton, A.G. Barto, Reinforcement Learning: An Introduction, Bradford Books/MIT Press, Cambridge, MA.
[2] C.J.C.H. Watkins, P. Dayan, Q-learning, Machine Learning 8 (1992).
[3] C.R. Kube, H. Zhang, Collective robotics: From social insects to robots, Adaptive Behavior 2 (2) (1993).
[4] G. Campion, G. Bastin, d'Andréa-Novel, Structural properties and classification of kinematic and dynamic models of wheeled mobile robots, IEEE Transactions on Robotics and Automation 12 (1) (1996).
[5] L.P. Kaelbling, M.L. Littman, A.W. Moore, Reinforcement learning: A survey, Journal of Artificial Intelligence Research 4 (1996).
[6] T.W. Sandholm, R.H. Crites, Multiagent reinforcement learning in the Iterated Prisoner's Dilemma, Biosystems 37 (1996).
[7] H.-S. Shim, H.-S. Kim, M.-J. Jung, I.-H. Choi, J.-H. Kim, J.-O. Kim, Designing distributed control architecture for cooperative multi-agent system and its real-time application to soccer robot, Robotics and Autonomous Systems 21 (2) (1997).
[8] P.V.C. Caironi, M. Dorigo, Training and delayed reinforcements in Q-learning agents, International Journal of Intelligent Systems 12 (10) (1997).
[9] L.E. Parker, ALLIANCE: An architecture for fault tolerant multirobot cooperation, IEEE Transactions on Robotics and Automation 14 (2) (1998).
[10] J.-H. Kim, H.-S. Shim, H.-S. Kim, M.-J. Jung, I.-H. Choi, K.-O. Kim, A cooperative multi-agent system and its real time application to robot soccer, in: Proceedings of the IEEE International Conference on Robotics and Automation, Minneapolis, MN, 1996.
[11] C.
Boutilier, Planning, learning and coordination in multiagent decision processes, in: Proceedings of the Sixth Conference on Theoretical Aspects of Rationality and Knowledge, Netherlands, [12] S.H. Lee, J. Bautista, Motion control for micro-robots playing soccer games, in: Proceedings of the IEEE International Conference on Robotics and Automation, Leuven, Belgium, 1998, pp [13] J.-H. Kim, K.-C. Kim, D.-H. Kim, Y.-J. Kim, P. Vadakkepat, Path planning and role selection mechanism for soccer robots, in: Proceedings of the IEEE International Conference on Robotics and Automation, Leuven, Belgium, 1998, pp [14] Y.-J. Kim, D.-H. Kim, J.-H. Kim, Evolutionary programming-based vector field method for fast mobile robot navigation, in: Proceedings of the Second Asia Pacific Conference on Simulations, Evolutions and Learning, [15] N. Ono, K. Fukumoto, Multi-agent reinforcement learning: A modular approach, in: Proceedings of the Second International Conference on Multi-agent Systems, AAAI Press, 1996, pp [16] G.A. Rummery, Problem solving with reinforcement learning, Ph.D. Thesis, Cambridge University, Cambridge, UK, 1995.

Kui-Hong Park received his B.S. degree in Electrical Engineering from Hanyang University, Seoul, South Korea, in 1997, and his M.S. degree in Electrical Engineering from the Korea Advanced Institute of Science and Technology (KAIST), Taejon, South Korea, in 1998. He is currently working towards the Ph.D. degree in Electrical Engineering at KAIST. His main research interests include multi-agent systems and machine intelligence. Mr. Park received second-place awards both in the 1998 Nano-Robot World Cup Soccer Tournament (NaroSot) in France and in NaroSot 99 in Brazil.

Yong-Jae Kim received his B.S. and M.S. degrees in Electrical Engineering from the Korea Advanced Institute of Science and Technology, Taejon, South Korea, in 1996 and 1998, respectively. He is currently working towards the Ph.D. degree in Electrical Engineering at the same institute. His research interests include motion planning of mobile systems and machine intelligence. Mr. Kim is the recipient of the third-place award at MiroSot 97 and the first-place award at the Robot Soccer American Cup.

Jong-Hwan Kim received his B.S., M.S., and Ph.D. degrees in Electronics Engineering from Seoul National University, South Korea, in 1981, 1983, and 1987, respectively. Since 1988, he has been with the Department of Electrical Engineering at the Korea Advanced Institute of Science and Technology, where he is currently a Professor. He was a Visiting Scholar at Purdue University from September 1992 to August. His research interests are in the area of evolutionary multi-agent robotic systems. He is an Associate Editor of the IEEE Transactions on Evolutionary Computation and of the International Journal of Intelligent and Fuzzy Systems. He is one of the co-founders of the Asia Pacific Conference on Simulated Evolution and Learning.
He is the General Chair of the Congress on Evolutionary Computation. His name is included in the Barons 500 Leaders for the New Century as the founder of FIRA (Federation of International Robot Soccer Association) and of IROC for the Robot Olympiad; he is now serving FIRA and IROC as President. He was the Guest Editor of the special issue on MiroSot 96 of the journal Robotics and Autonomous Systems and of the special issue on Soccer Robotics of the Journal of Intelligent Automation and Soft Computing. Dr. Kim is the recipient of the 1988 Choongang Young Investigator Award from the Choongang Memorial Foundation, the LG YonAm Foundation Research Fellowship in 1992, the Korean Presidential Award in 1997, and the SeoAm Foundation Research Fellowship in 1999.

Multi-Agent Control Structure for a Vision Based Robot Soccer System

Multi-Agent Control Structure for a Vision Based Robot Soccer System Multi- Control Structure for a Vision Based Robot Soccer System Yangmin Li, Wai Ip Lei, and Xiaoshan Li Department of Electromechanical Engineering Faculty of Science and Technology University of Macau

More information

Simple Path Planning Algorithm for Two-Wheeled Differentially Driven (2WDD) Soccer Robots

Simple Path Planning Algorithm for Two-Wheeled Differentially Driven (2WDD) Soccer Robots Simple Path Planning Algorithm for Two-Wheeled Differentially Driven (2WDD) Soccer Robots Gregor Novak 1 and Martin Seyr 2 1 Vienna University of Technology, Vienna, Austria novak@bluetechnix.at 2 Institute

More information

Rapid Control Prototyping for Robot Soccer

Rapid Control Prototyping for Robot Soccer Proceedings of the 17th World Congress The International Federation of Automatic Control Rapid Control Prototyping for Robot Soccer Junwon Jang Soohee Han Hanjun Kim Choon Ki Ahn School of Electrical Engr.

More information

Available online at ScienceDirect. Procedia Computer Science 56 (2015 )

Available online at  ScienceDirect. Procedia Computer Science 56 (2015 ) Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 56 (2015 ) 538 543 International Workshop on Communication for Humans, Agents, Robots, Machines and Sensors (HARMS 2015)

More information

Strategy for Collaboration in Robot Soccer

Strategy for Collaboration in Robot Soccer Strategy for Collaboration in Robot Soccer Sng H.L. 1, G. Sen Gupta 1 and C.H. Messom 2 1 Singapore Polytechnic, 500 Dover Road, Singapore {snghl, SenGupta }@sp.edu.sg 1 Massey University, Auckland, New

More information

A Posture Control for Two Wheeled Mobile Robots

A Posture Control for Two Wheeled Mobile Robots Transactions on Control, Automation and Systems Engineering Vol., No. 3, September, A Posture Control for Two Wheeled Mobile Robots Hyun-Sik Shim and Yoon-Gyeoung Sung Abstract In this paper, a posture

More information

Autonomous Stair Climbing Algorithm for a Small Four-Tracked Robot

Autonomous Stair Climbing Algorithm for a Small Four-Tracked Robot Autonomous Stair Climbing Algorithm for a Small Four-Tracked Robot Quy-Hung Vu, Byeong-Sang Kim, Jae-Bok Song Korea University 1 Anam-dong, Seongbuk-gu, Seoul, Korea vuquyhungbk@yahoo.com, lovidia@korea.ac.kr,

More information

Online Evolution for Cooperative Behavior in Group Robot Systems

Online Evolution for Cooperative Behavior in Group Robot Systems 282 International Dong-Wook Journal of Lee, Control, Sang-Wook Automation, Seo, and Systems, Kwee-Bo vol. Sim 6, no. 2, pp. 282-287, April 2008 Online Evolution for Cooperative Behavior in Group Robot

More information

Hierarchical Controller for Robotic Soccer

Hierarchical Controller for Robotic Soccer Hierarchical Controller for Robotic Soccer Byron Knoll Cognitive Systems 402 April 13, 2008 ABSTRACT RoboCup is an initiative aimed at advancing Artificial Intelligence (AI) and robotics research. This

More information

Motion Control of a Three Active Wheeled Mobile Robot and Collision-Free Human Following Navigation in Outdoor Environment

Motion Control of a Three Active Wheeled Mobile Robot and Collision-Free Human Following Navigation in Outdoor Environment Proceedings of the International MultiConference of Engineers and Computer Scientists 2016 Vol I,, March 16-18, 2016, Hong Kong Motion Control of a Three Active Wheeled Mobile Robot and Collision-Free

More information

Behavior generation for a mobile robot based on the adaptive fitness function

Behavior generation for a mobile robot based on the adaptive fitness function Robotics and Autonomous Systems 40 (2002) 69 77 Behavior generation for a mobile robot based on the adaptive fitness function Eiji Uchibe a,, Masakazu Yanase b, Minoru Asada c a Human Information Science

More information

COOPERATIVE STRATEGY BASED ON ADAPTIVE Q- LEARNING FOR ROBOT SOCCER SYSTEMS

COOPERATIVE STRATEGY BASED ON ADAPTIVE Q- LEARNING FOR ROBOT SOCCER SYSTEMS COOPERATIVE STRATEGY BASED ON ADAPTIVE Q- LEARNING FOR ROBOT SOCCER SYSTEMS Soft Computing Alfonso Martínez del Hoyo Canterla 1 Table of contents 1. Introduction... 3 2. Cooperative strategy design...

More information

Robo-Erectus Jr-2013 KidSize Team Description Paper.

Robo-Erectus Jr-2013 KidSize Team Description Paper. Robo-Erectus Jr-2013 KidSize Team Description Paper. Buck Sin Ng, Carlos A. Acosta Calderon and Changjiu Zhou. Advanced Robotics and Intelligent Control Centre, Singapore Polytechnic, 500 Dover Road, 139651,

More information

Randomized Motion Planning for Groups of Nonholonomic Robots

Randomized Motion Planning for Groups of Nonholonomic Robots Randomized Motion Planning for Groups of Nonholonomic Robots Christopher M Clark chrisc@sun-valleystanfordedu Stephen Rock rock@sun-valleystanfordedu Department of Aeronautics & Astronautics Stanford University

More information

Q Learning Behavior on Autonomous Navigation of Physical Robot

Q Learning Behavior on Autonomous Navigation of Physical Robot The 8th International Conference on Ubiquitous Robots and Ambient Intelligence (URAI 211) Nov. 23-26, 211 in Songdo ConventiA, Incheon, Korea Q Learning Behavior on Autonomous Navigation of Physical Robot

More information

CMDragons 2009 Team Description

CMDragons 2009 Team Description CMDragons 2009 Team Description Stefan Zickler, Michael Licitra, Joydeep Biswas, and Manuela Veloso Carnegie Mellon University {szickler,mmv}@cs.cmu.edu {mlicitra,joydeep}@andrew.cmu.edu Abstract. In this

More information

Multi-robot Formation Control Based on Leader-follower Method

Multi-robot Formation Control Based on Leader-follower Method Journal of Computers Vol. 29 No. 2, 2018, pp. 233-240 doi:10.3966/199115992018042902022 Multi-robot Formation Control Based on Leader-follower Method Xibao Wu 1*, Wenbai Chen 1, Fangfang Ji 1, Jixing Ye

More information

Learning and Using Models of Kicking Motions for Legged Robots

Learning and Using Models of Kicking Motions for Legged Robots Learning and Using Models of Kicking Motions for Legged Robots Sonia Chernova and Manuela Veloso Computer Science Department Carnegie Mellon University Pittsburgh, PA 15213 {soniac, mmv}@cs.cmu.edu Abstract

More information

COMPACT FUZZY Q LEARNING FOR AUTONOMOUS MOBILE ROBOT NAVIGATION

COMPACT FUZZY Q LEARNING FOR AUTONOMOUS MOBILE ROBOT NAVIGATION COMPACT FUZZY Q LEARNING FOR AUTONOMOUS MOBILE ROBOT NAVIGATION Handy Wicaksono, Khairul Anam 2, Prihastono 3, Indra Adjie Sulistijono 4, Son Kuswadi 5 Department of Electrical Engineering, Petra Christian

More information

National University of Singapore

National University of Singapore National University of Singapore Department of Electrical and Computer Engineering EE4306 Distributed Autonomous obotic Systems 1. Objectives...1 2. Equipment...1 3. Preparation...1 4. Introduction...1

More information

Multi-Platform Soccer Robot Development System

Multi-Platform Soccer Robot Development System Multi-Platform Soccer Robot Development System Hui Wang, Han Wang, Chunmiao Wang, William Y. C. Soh Division of Control & Instrumentation, School of EEE Nanyang Technological University Nanyang Avenue,

More information

Robocup Electrical Team 2006 Description Paper

Robocup Electrical Team 2006 Description Paper Robocup Electrical Team 2006 Description Paper Name: Strive2006 (Shanghai University, P.R.China) Address: Box.3#,No.149,Yanchang load,shanghai, 200072 Email: wanmic@163.com Homepage: robot.ccshu.org Abstract:

More information

NAVIGATION OF MOBILE ROBOT USING THE PSO PARTICLE SWARM OPTIMIZATION

NAVIGATION OF MOBILE ROBOT USING THE PSO PARTICLE SWARM OPTIMIZATION Journal of Academic and Applied Studies (JAAS) Vol. 2(1) Jan 2012, pp. 32-38 Available online @ www.academians.org ISSN1925-931X NAVIGATION OF MOBILE ROBOT USING THE PSO PARTICLE SWARM OPTIMIZATION Sedigheh

More information

RoboCup. Presented by Shane Murphy April 24, 2003

RoboCup. Presented by Shane Murphy April 24, 2003 RoboCup Presented by Shane Murphy April 24, 2003 RoboCup: : Today and Tomorrow What we have learned Authors Minoru Asada (Osaka University, Japan), Hiroaki Kitano (Sony CS Labs, Japan), Itsuki Noda (Electrotechnical(

More information

Obstacle Avoidance Functions on Robot Mirosot in The Departement of Informatics of UPN Veteran Yogyakarta

Obstacle Avoidance Functions on Robot Mirosot in The Departement of Informatics of UPN Veteran Yogyakarta Proceeding International Conference on Electrical Engineering, Computer Science Informatics (EECSI 2015), Palembang, Indonesia, 19-20 August 2015 Obstacle Avoidance Functions on Robot Mirosot in Departement

More information

Decision Science Letters

Decision Science Letters Decision Science Letters 3 (2014) 121 130 Contents lists available at GrowingScience Decision Science Letters homepage: www.growingscience.com/dsl A new effective algorithm for on-line robot motion planning

More information

Fuzzy Logic for Behaviour Co-ordination and Multi-Agent Formation in RoboCup

Fuzzy Logic for Behaviour Co-ordination and Multi-Agent Formation in RoboCup Fuzzy Logic for Behaviour Co-ordination and Multi-Agent Formation in RoboCup Hakan Duman and Huosheng Hu Department of Computer Science University of Essex Wivenhoe Park, Colchester CO4 3SQ United Kingdom

More information

Team KMUTT: Team Description Paper

Team KMUTT: Team Description Paper Team KMUTT: Team Description Paper Thavida Maneewarn, Xye, Pasan Kulvanit, Sathit Wanitchaikit, Panuvat Sinsaranon, Kawroong Saktaweekulkit, Nattapong Kaewlek Djitt Laowattana King Mongkut s University

More information

ROBOTSOCCER. Peter Kopacek

ROBOTSOCCER. Peter Kopacek Proceedings of the 17th World Congress The International Federation of Automatic Control ROBOTSOCCER Peter Kopacek Intelligent Handling and Robotics (IHRT),Vienna University of Technology Favoritenstr.

More information

The Necessity of Average Rewards in Cooperative Multirobot Learning

The Necessity of Average Rewards in Cooperative Multirobot Learning Carnegie Mellon University Research Showcase @ CMU Institute for Software Research School of Computer Science 2002 The Necessity of Average Rewards in Cooperative Multirobot Learning Poj Tangamchit Carnegie

More information

S.P.Q.R. Legged Team Report from RoboCup 2003

S.P.Q.R. Legged Team Report from RoboCup 2003 S.P.Q.R. Legged Team Report from RoboCup 2003 L. Iocchi and D. Nardi Dipartimento di Informatica e Sistemistica Universitá di Roma La Sapienza Via Salaria 113-00198 Roma, Italy {iocchi,nardi}@dis.uniroma1.it,

More information

A GAME THEORETIC MODEL OF COOPERATION AND NON-COOPERATION FOR SOCCER PLAYING ROBOTS. M. BaderElDen, E. Badreddin, Y. Kotb, and J.

A GAME THEORETIC MODEL OF COOPERATION AND NON-COOPERATION FOR SOCCER PLAYING ROBOTS. M. BaderElDen, E. Badreddin, Y. Kotb, and J. A GAME THEORETIC MODEL OF COOPERATION AND NON-COOPERATION FOR SOCCER PLAYING ROBOTS M. BaderElDen, E. Badreddin, Y. Kotb, and J. Rüdiger Automation Laboratory, University of Mannheim, 68131 Mannheim, Germany.

More information

Cooperative Behavior Acquisition in A Multiple Mobile Robot Environment by Co-evolution

Cooperative Behavior Acquisition in A Multiple Mobile Robot Environment by Co-evolution Cooperative Behavior Acquisition in A Multiple Mobile Robot Environment by Co-evolution Eiji Uchibe, Masateru Nakamura, Minoru Asada Dept. of Adaptive Machine Systems, Graduate School of Eng., Osaka University,

More information

Obstacle Avoidance in Collective Robotic Search Using Particle Swarm Optimization

Obstacle Avoidance in Collective Robotic Search Using Particle Swarm Optimization Avoidance in Collective Robotic Search Using Particle Swarm Optimization Lisa L. Smith, Student Member, IEEE, Ganesh K. Venayagamoorthy, Senior Member, IEEE, Phillip G. Holloway Real-Time Power and Intelligent

More information

Estimation of Absolute Positioning of mobile robot using U-SAT

Estimation of Absolute Positioning of mobile robot using U-SAT Estimation of Absolute Positioning of mobile robot using U-SAT Su Yong Kim 1, SooHong Park 2 1 Graduate student, Department of Mechanical Engineering, Pusan National University, KumJung Ku, Pusan 609-735,

More information

Adaptive Action Selection without Explicit Communication for Multi-robot Box-pushing

Adaptive Action Selection without Explicit Communication for Multi-robot Box-pushing Adaptive Action Selection without Explicit Communication for Multi-robot Box-pushing Seiji Yamada Jun ya Saito CISS, IGSSE, Tokyo Institute of Technology 4259 Nagatsuta, Midori, Yokohama 226-8502, JAPAN

More information

Traffic Control for a Swarm of Robots: Avoiding Target Congestion

Traffic Control for a Swarm of Robots: Avoiding Target Congestion Traffic Control for a Swarm of Robots: Avoiding Target Congestion Leandro Soriano Marcolino and Luiz Chaimowicz Abstract One of the main problems in the navigation of robotic swarms is when several robots

More information

Swarm Intelligence W7: Application of Machine- Learning Techniques to Automatic Control Design and Optimization

Swarm Intelligence W7: Application of Machine- Learning Techniques to Automatic Control Design and Optimization Swarm Intelligence W7: Application of Machine- Learning Techniques to Automatic Control Design and Optimization Learning to avoid obstacles Outline Problem encoding using GA and ANN Floreano and Mondada

More information

FU-Fighters. The Soccer Robots of Freie Universität Berlin. Why RoboCup? What is RoboCup?

FU-Fighters. The Soccer Robots of Freie Universität Berlin. Why RoboCup? What is RoboCup? The Soccer Robots of Freie Universität Berlin We have been building autonomous mobile robots since 1998. Our team, composed of students and researchers from the Mathematics and Computer Science Department,

More information

A Lego-Based Soccer-Playing Robot Competition For Teaching Design

A Lego-Based Soccer-Playing Robot Competition For Teaching Design Session 2620 A Lego-Based Soccer-Playing Robot Competition For Teaching Design Ronald A. Lessard Norwich University Abstract Course Objectives in the ME382 Instrumentation Laboratory at Norwich University

More information

Traffic Control for a Swarm of Robots: Avoiding Group Conflicts

Traffic Control for a Swarm of Robots: Avoiding Group Conflicts Traffic Control for a Swarm of Robots: Avoiding Group Conflicts Leandro Soriano Marcolino and Luiz Chaimowicz Abstract A very common problem in the navigation of robotic swarms is when groups of robots

More information

Introduction to Robotics

Introduction to Robotics Jianwei Zhang zhang@informatik.uni-hamburg.de Universität Hamburg Fakultät für Mathematik, Informatik und Naturwissenschaften Technische Aspekte Multimodaler Systeme 14. June 2013 J. Zhang 1 Robot Control

More information

Behaviour Patterns Evolution on Individual and Group Level. Stanislav Slušný, Roman Neruda, Petra Vidnerová. CIMMACS 07, December 14, Tenerife

Behaviour Patterns Evolution on Individual and Group Level. Stanislav Slušný, Roman Neruda, Petra Vidnerová. CIMMACS 07, December 14, Tenerife Behaviour Patterns Evolution on Individual and Group Level Stanislav Slušný, Roman Neruda, Petra Vidnerová Department of Theoretical Computer Science Institute of Computer Science Academy of Science of

More information

An Intuitional Method for Mobile Robot Path-planning in a Dynamic Environment

An Intuitional Method for Mobile Robot Path-planning in a Dynamic Environment An Intuitional Method for Mobile Robot Path-planning in a Dynamic Environment Ching-Chang Wong, Hung-Ren Lai, and Hui-Chieh Hou Department of Electrical Engineering, Tamkang University Tamshui, Taipei

More information

A Fuzzy-Based Approach for Partner Selection in Multi-Agent Systems

A Fuzzy-Based Approach for Partner Selection in Multi-Agent Systems University of Wollongong Research Online Faculty of Informatics - Papers Faculty of Informatics 07 A Fuzzy-Based Approach for Partner Selection in Multi-Agent Systems F. Ren University of Wollongong M.

More information

MCT Susanoo Logics 2014 Team Description

MCT Susanoo Logics 2014 Team Description MCT Susanoo Logics 2014 Team Description Satoshi Takata, Yuji Horie, Shota Aoki, Kazuhiro Fujiwara, Taihei Degawa Matsue College of Technology 14-4, Nishiikumacho, Matsue-shi, Shimane, 690-8518, Japan

More information

A New Analytical Representation to Robot Path Generation with Collision Avoidance through the Use of the Collision Map

A New Analytical Representation to Robot Path Generation with Collision Avoidance through the Use of the Collision Map International A New Journal Analytical of Representation Control, Automation, Robot and Path Systems, Generation vol. 4, no. with 1, Collision pp. 77-86, Avoidance February through 006 the Use of 77 A

More information

Evolving High-Dimensional, Adaptive Camera-Based Speed Sensors

Evolving High-Dimensional, Adaptive Camera-Based Speed Sensors In: M.H. Hamza (ed.), Proceedings of the 21st IASTED Conference on Applied Informatics, pp. 1278-128. Held February, 1-1, 2, Insbruck, Austria Evolving High-Dimensional, Adaptive Camera-Based Speed Sensors

More information

Learning Reactive Neurocontrollers using Simulated Annealing for Mobile Robots

Learning Reactive Neurocontrollers using Simulated Annealing for Mobile Robots Learning Reactive Neurocontrollers using Simulated Annealing for Mobile Robots Philippe Lucidarme, Alain Liégeois LIRMM, University Montpellier II, France, lucidarm@lirmm.fr Abstract This paper presents

More information

Hybrid LQG-Neural Controller for Inverted Pendulum System

Hybrid LQG-Neural Controller for Inverted Pendulum System Hybrid LQG-Neural Controller for Inverted Pendulum System E.S. Sazonov Department of Electrical and Computer Engineering Clarkson University Potsdam, NY 13699-570 USA P. Klinkhachorn and R. L. Klein Lane

More information

Learning Attentive-Depth Switching while Interacting with an Agent

Learning Attentive-Depth Switching while Interacting with an Agent Learning Attentive-Depth Switching while Interacting with an Agent Chyon Hae Kim, Hiroshi Tsujino, and Hiroyuki Nakahara Abstract This paper addresses a learning system design for a robot based on an extended

More information

A Novel Fuzzy Neural Network Based Distance Relaying Scheme

A Novel Fuzzy Neural Network Based Distance Relaying Scheme 902 IEEE TRANSACTIONS ON POWER DELIVERY, VOL. 15, NO. 3, JULY 2000 A Novel Fuzzy Neural Network Based Distance Relaying Scheme P. K. Dash, A. K. Pradhan, and G. Panda Abstract This paper presents a new

More information

An Improved Path Planning Method Based on Artificial Potential Field for a Mobile Robot

An Improved Path Planning Method Based on Artificial Potential Field for a Mobile Robot BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 15, No Sofia 015 Print ISSN: 1311-970; Online ISSN: 1314-4081 DOI: 10.1515/cait-015-0037 An Improved Path Planning Method Based

More information

The Haptic Impendance Control through Virtual Environment Force Compensation

The Haptic Impendance Control through Virtual Environment Force Compensation The Haptic Impendance Control through Virtual Environment Force Compensation OCTAVIAN MELINTE Robotics and Mechatronics Department Institute of Solid Mechanicsof the Romanian Academy ROMANIA octavian.melinte@yahoo.com

More information

Optic Flow Based Skill Learning for A Humanoid to Trap, Approach to, and Pass a Ball

Optic Flow Based Skill Learning for A Humanoid to Trap, Approach to, and Pass a Ball Optic Flow Based Skill Learning for A Humanoid to Trap, Approach to, and Pass a Ball Masaki Ogino 1, Masaaki Kikuchi 1, Jun ichiro Ooga 1, Masahiro Aono 1 and Minoru Asada 1,2 1 Dept. of Adaptive Machine

More information

Trajectory Generation for a Mobile Robot by Reinforcement Learning

Trajectory Generation for a Mobile Robot by Reinforcement Learning 1 Trajectory Generation for a Mobile Robot by Reinforcement Learning Masaki Shimizu 1, Makoto Fujita 2, and Hiroyuki Miyamoto 3 1 Kyushu Institute of Technology, Kitakyushu, Japan shimizu-masaki@edu.brain.kyutech.ac.jp

More information

APPLICATION OF FUZZY BEHAVIOR COORDINATION AND Q LEARNING IN ROBOT NAVIGATION

APPLICATION OF FUZZY BEHAVIOR COORDINATION AND Q LEARNING IN ROBOT NAVIGATION APPLICATION OF FUZZY BEHAVIOR COORDINATION AND Q LEARNING IN ROBOT NAVIGATION Handy Wicaksono 1, Prihastono 2, Khairul Anam 3, Rusdhianto Effendi 4, Indra Adji Sulistijono 5, Son Kuswadi 6, Achmad Jazidie

More information

A Reconfigurable Guidance System

A Reconfigurable Guidance System Lecture tes for the Class: Unmanned Aircraft Design, Modeling and Control A Reconfigurable Guidance System Application to Unmanned Aerial Vehicles (UAVs) y b right aileron: a2 right elevator: e 2 rudder:

More information

Reinforcement Learning Approach to Generate Goal-directed Locomotion of a Snake-Like Robot with Screw-Drive Units

Reinforcement Learning Approach to Generate Goal-directed Locomotion of a Snake-Like Robot with Screw-Drive Units Reinforcement Learning Approach to Generate Goal-directed Locomotion of a Snake-Like Robot with Screw-Drive Units Sromona Chatterjee, Timo Nachstedt, Florentin Wörgötter, Minija Tamosiunaite, Poramate

More information

DEVELOPMENT OF A ROBOID COMPONENT FOR PLAYER/STAGE ROBOT SIMULATOR

DEVELOPMENT OF A ROBOID COMPONENT FOR PLAYER/STAGE ROBOT SIMULATOR Proceedings of IC-NIDC2009 DEVELOPMENT OF A ROBOID COMPONENT FOR PLAYER/STAGE ROBOT SIMULATOR Jun Won Lim 1, Sanghoon Lee 2,Il Hong Suh 1, and Kyung Jin Kim 3 1 Dept. Of Electronics and Computer Engineering,

More information

Micro Robot Hockey Simulator Game Engine Design

Micro Robot Hockey Simulator Game Engine Design Micro Robot Hockey Simulator Game Engine Design Wayne Y. Chen Experimental Robotics Laboratory School of Engineering Science Simon Fraser University, Burnaby, BC, Canada waynec@fas.sfu.ca Shahram Payandeh

More information

CS594, Section 30682:

CS594, Section 30682: CS594, Section 30682: Distributed Intelligence in Autonomous Robotics Spring 2003 Tuesday/Thursday 11:10 12:25 http://www.cs.utk.edu/~parker/courses/cs594-spring03 Instructor: Dr. Lynne E. Parker ½ TA:

More information

CMDragons 2006 Team Description

CMDragons 2006 Team Description CMDragons 2006 Team Description James Bruce, Stefan Zickler, Mike Licitra, and Manuela Veloso Carnegie Mellon University Pittsburgh, Pennsylvania, USA {jbruce,szickler,mlicitra,mmv}@cs.cmu.edu Abstract.

More information

APPLICATION OF FUZZY BEHAVIOR COORDINATION AND Q LEARNING IN ROBOT NAVIGATION

APPLICATION OF FUZZY BEHAVIOR COORDINATION AND Q LEARNING IN ROBOT NAVIGATION APPLICATION OF FUZZY BEHAVIOR COORDINATION AND Q LEARNING IN ROBOT NAVIGATION Handy Wicaksono 1,2, Prihastono 1,3, Khairul Anam 4, Rusdhianto Effendi 2, Indra Adji Sulistijono 5, Son Kuswadi 5, Achmad

More information

Particle Swarm Optimization-Based Consensus Achievement of a Decentralized Sensor Network

Particle Swarm Optimization-Based Consensus Achievement of a Decentralized Sensor Network , pp.162-166 http://dx.doi.org/10.14257/astl.2013.42.38 Particle Swarm Optimization-Based Consensus Achievement of a Decentralized Sensor Network Hyunseok Kim 1, Jinsul Kim 2 and Seongju Chang 1*, 1 Department

More information

Wheeled Mobile Robot Obstacle Avoidance Using Compass and Ultrasonic

Wheeled Mobile Robot Obstacle Avoidance Using Compass and Ultrasonic Universal Journal of Control and Automation 6(1): 13-18, 2018 DOI: 10.13189/ujca.2018.060102 http://www.hrpub.org Wheeled Mobile Robot Obstacle Avoidance Using Compass and Ultrasonic Yousef Moh. Abueejela

More information

Estimation and Control of Lateral Displacement of Electric Vehicle Using WPT Information

Estimation and Control of Lateral Displacement of Electric Vehicle Using WPT Information Estimation and Control of Lateral Displacement of Electric Vehicle Using WPT Information Pakorn Sukprasert Department of Electrical Engineering and Information Systems, The University of Tokyo Tokyo, Japan

More information

Behaviour-Based Control. IAR Lecture 5 Barbara Webb

Behaviour-Based Control. IAR Lecture 5 Barbara Webb Behaviour-Based Control IAR Lecture 5 Barbara Webb Traditional sense-plan-act approach suggests a vertical (serial) task decomposition Sensors Actuators perception modelling planning task execution motor

More information

Towards Quantification of the need to Cooperate between Robots

Towards Quantification of the need to Cooperate between Robots PERMIS 003 Towards Quantification of the need to Cooperate between Robots K. Madhava Krishna and Henry Hexmoor CSCE Dept., University of Arkansas Fayetteville AR 770 Abstract: Collaborative technologies

More information

IMPLEMENTATION OF NEURAL NETWORK IN ENERGY SAVING OF INDUCTION MOTOR DRIVES WITH INDIRECT VECTOR CONTROL

IMPLEMENTATION OF NEURAL NETWORK IN ENERGY SAVING OF INDUCTION MOTOR DRIVES WITH INDIRECT VECTOR CONTROL IMPLEMENTATION OF NEURAL NETWORK IN ENERGY SAVING OF INDUCTION MOTOR DRIVES WITH INDIRECT VECTOR CONTROL * A. K. Sharma, ** R. A. Gupta, and *** Laxmi Srivastava * Department of Electrical Engineering,

More information

SIMULTANEOUS OBSTACLE DETECTION FOR MOBILE ROBOTS AND ITS LOCALIZATION FOR AUTOMATIC BATTERY RECHARGING

SIMULTANEOUS OBSTACLE DETECTION FOR MOBILE ROBOTS AND ITS LOCALIZATION FOR AUTOMATIC BATTERY RECHARGING SIMULTANEOUS OBSTACLE DETECTION FOR MOBILE ROBOTS AND ITS LOCALIZATION FOR AUTOMATIC BATTERY RECHARGING *Sang-Il Gho*, Jong-Suk Choi*, *Ji-Yoon Yoo**, Mun-Sang Kim* *Department of Electrical Engineering

More information

Implementation of Proportional and Derivative Controller in a Ball and Beam System

Implementation of Proportional and Derivative Controller in a Ball and Beam System Implementation of Proportional and Derivative Controller in a Ball and Beam System Alexander F. Paggi and Tooran Emami United States Coast Guard Academy Abstract This paper presents a design of two cascade

More information

The Autonomous Performance Improvement of Mobile Robot using Type-2 Fuzzy Self-Tuning PID Controller

The Autonomous Performance Improvement of Mobile Robot using Type-2 Fuzzy Self-Tuning PID Controller , pp.182-187 http://dx.doi.org/10.14257/astl.2016.138.37 The Autonomous Performance Improvement of Mobile Robot using Type-2 Fuzzy Self-Tuning PID Controller Sang Hyuk Park 1, Ki Woo Kim 1, Won Hyuk Choi

More information

SRV02-Series Rotary Experiment # 3. Ball & Beam. Student Handout

SRV02-Series Rotary Experiment # 3. Ball & Beam. Student Handout SRV02-Series Rotary Experiment # 3 Ball & Beam Student Handout SRV02-Series Rotary Experiment # 3 Ball & Beam Student Handout 1. Objectives The objective in this experiment is to design a controller for

More information

Test Plan. Robot Soccer. ECEn Senior Project. Real Madrid. Daniel Gardner Warren Kemmerer Brandon Williams TJ Schramm Steven Deshazer

Test Plan. Robot Soccer. ECEn Senior Project. Real Madrid. Daniel Gardner Warren Kemmerer Brandon Williams TJ Schramm Steven Deshazer Test Plan Robot Soccer ECEn 490 - Senior Project Real Madrid Daniel Gardner Warren Kemmerer Brandon Williams TJ Schramm Steven Deshazer CONTENTS Introduction... 3 Skill Tests Determining Robot Position...

More information

PWM MOTOR DRIVE CIRCUIT WITH WIRELESS COMMUNICATION TO A MICROCOMPUTER FOR SMALL PLAYING SOCCER ROBOTS

PWM MOTOR DRIVE CIRCUIT WITH WIRELESS COMMUNICATION TO A MICROCOMPUTER FOR SMALL PLAYING SOCCER ROBOTS PWM MOTOR DRIVE CIRCUIT WITH WIRELESS COMMUNICATION TO A MICROCOMPUTER FOR SMALL PLAYING SOCCER ROBOTS EWALDO L. M. MEHL, ANDERSON C. ZANI, JACKSON KÜNTZE, VILSON R. MOGNON Departamento de Engenharia Elétrica,

More information

Sensor Data Fusion Using Kalman Filter

Sensor Data Fusion Using Kalman Filter Sensor Data Fusion Using Kalman Filter J.Z. Sasiade and P. Hartana Department of Mechanical & Aerospace Engineering arleton University 115 olonel By Drive Ottawa, Ontario, K1S 5B6, anada e-mail: jsas@ccs.carleton.ca

More information

Field Rangers Team Description Paper. Yusuf Pranggonoh, Buck Sin Ng, Tianwu Yang, Ai Ling Kwong, Pik Kong Yue, Changjiu Zhou. Advanced Robotics and Intelligent Control Centre (ARICC), Singapore Polytechnic.

Evolving CAM-Brain to control a mobile robot. Sung-Bae Cho and Geum-Beom Song, Department of Computer Science, Yonsei University. Applied Mathematics and Computation 111 (2000) 147-162. www.elsevier.nl/locate/amc.

Using a Fuzzy Logic Control System for an Xpilot Combat Agent. Andrew Hubley and Gary Parker, Department of Computer Science, Connecticut College, New London, CT. World Automation Congress 21, TSI Press.

Reinforcement Learning Simulations and Robotics.

Internet Control of Personal Robot between KAIST and UC Davis. Kuk-Hyun Han, Yong-Jae Kim, Jong-Hwan Kim (Department of Electrical Engineering and Computer Science, KAIST) and Steve Hsia.

Cyclic Genetic Algorithms for Evolving Multi-Loop Control Programs. Gary B. Parker, Connecticut College, USA, parker@conncoll.edu; Ivo I. Parashkevov, Connecticut College, USA, iipar@conncoll.edu; H. Joseph

Soccer Server: a simulator of RoboCup. Noda Itsuki, Electrotechnical Laboratory, 1-1-4 Umezono, Tsukuba, 305 Japan. noda@etl.go.jp.

A Neural Controller for On Board Tracking Platform. Octavian Grigore-Müler. Key words: airborne warning and control systems (AWACS), incremental motion controller, DC servomotors with low inertia.

Improving the Kicking Accuracy in a Soccer Robot. Ricardo Dias (ricardodias@ua.pt), Bernardo Cunha (mbc@det.ua.pt), João Silva (joao.m.silva@ua.pt), António J. R. Neves (an@ua.pt), José Luis Azevedo (jla@ua.pt), Nuno

An Autonomous Simulation Based System for Robotic Services in Partially Known Environments. Eva Cipi, PhD in Computer Engineering, University of Vlora, Albania.

KMUTT Kickers: Team Description Paper. Thavida Maneewarn, Xye, Korawit Kawinkhrue, Amnart Butsongka, Nattapong Kaewlek. King Mongkut's University of Technology Thonburi, Institute of Field Robotics (FIBO).

A Study of Optimal Spatial Partition Size and Field of View in Massively Multiplayer Online Game Server. Youngsik Kim, Department of Game and Multimedia Engineering, Korea Polytechnic University.

CAMBADA 2015: Team Description Paper. B. Cunha, A. J. R. Neves, P. Dias, J. L. Azevedo, N. Lau, R. Dias, F. Amaral, E. Pedrosa, A. Pereira, J. Silva, J. Cunha and A. Trifan.

Dr. Wenjie Dong. The University of Texas Rio Grande Valley, Department of Electrical Engineering. (956) 665-2200. Email: wenjie.dong@utrgv.edu. Education: PhD, University of California, Riverside, 2009.

Developing Frogger Player Intelligence Using NEAT and a Score Driven Fitness Function. Davis Ancona and Jake Weiner.

Learning and Using Models of Kicking Motions for Legged Robots. Sonia Chernova and Manuela Veloso, Computer Science Department, Carnegie Mellon University, Pittsburgh, PA 15213. {soniac, mmv}@cs.cmu.edu.

An Experimental Comparison of Path Planning Techniques for Teams of Mobile Robots. Maren Bennewitz and Wolfram Burgard, Department of Computer Science, University of Freiburg, Germany.

Robots in the Loop: Supporting an Incremental Simulation-based Design Process. Xiaolin Hu, Computer Science Department, Georgia State University, Atlanta, GA, USA. xhu@cs.gsu.edu.

Biologically Inspired Embodied Evolution of Survival. Stefan Elfwing, Eiji Uchibe, Kenji Doya, Henrik I. Christensen. Centre for Autonomous Systems, Numerical Analysis and Computer Science.

Fuzzy-Heuristic Robot Navigation in a Simulated Environment. S. K. Deshpande, M. Blumenstein and B. Verma, School of Information Technology, Griffith University-Gold Coast, PMB 50, GCMC, Bundall, QLD 9726.

Navigation of Transport Mobile Robot in Bionic Assembly System. Aleksandar Lazinica, Intelligent Manufacturing Systems, IFT, Karlsplatz 13/311, A-1040 Vienna. Tel: +43-1-58801-311141, Fax: +43-1-58801-31199.

Optimum Rate Allocation for Two-Class Services in CDMA Smart Antenna Systems. Il-Min Kim and Hyung-Myung Kim. IEEE Transactions on Communications, vol. 51, no. 5, May 2003, p. 810.

About Doppler-Fizeau effect on radiated noise from a rotating source in cavitation tunnel. Proceedings of the 22nd International Congress on Acoustics, Signal Processing in Acoustics (others): Paper ICA2016-111.