Modular Q-learning based multi-agent cooperation for robot soccer
Robotics and Autonomous Systems 35 (2001)

Modular Q-learning based multi-agent cooperation for robot soccer

Kui-Hong Park, Yong-Jae Kim, Jong-Hwan Kim
Department of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology (KAIST), Kusong-dong, Yusong-gu, Taejon-shi, South Korea

Received 8 August 2000; received in revised form 12 February 2001
Communicated by F.C.A. Groen

Abstract

In a multi-agent system, action selection is important for the cooperation and coordination among agents. Since the environment is dynamic and complex, modular Q-learning, a reinforcement learning scheme, is employed to assign a proper action to each agent in the multi-agent system. The architecture of modular Q-learning consists of learning modules and a mediator module. The mediator module selects a proper action for the agent based on the Q-values obtained from the learning modules. To obtain better performance, the mediator module also considers state information, along with the Q-values, in the action selection process. A uni-vector field is used for robot navigation. In the robot soccer environment, the effectiveness and applicability of modular Q-learning and the uni-vector field method are verified by real experiments using five micro-robots. © 2001 Elsevier Science B.V. All rights reserved.

Keywords: Multi-agent system; Robot soccer system; Reinforcement learning; Modular Q-learning; Action selection

1. Introduction

Multi-agent systems are expected to perform tasks that are complex and difficult, which requires cooperation and coordination among the agents [3,9]. Developing a multi-agent system amounts to searching for a method of implementing an intelligent system composed of multiple agents, each with independent motion control, that cooperate with each other.
Multi-agent systems are more flexible and fault tolerant, as several simple robot agents are easier to handle and cheaper to build than a single powerful robot that can carry out different tasks [7]. (Corresponding author. E-mail addresses: khpark@vivaldi.kaist.ac.kr (K.-H. Park), johkim@vivaldi.kaist.ac.kr (J.-H. Kim).) From the standpoint of multi-agent systems, robot soccer is a good example of a real-world problem that can be moderately modeled. The soccer game differs from other multi-agent systems in that the robots of one team have to cooperate while competing with the opponent team. The cooperative and competitive strategies used play a major role in a robot soccer system [10]. The related research issues are quite wide, and they are associated with hardware configuration, software implementation, agent/robot communication, sensor fusion and learning, to mention a few. The action of a robot is usually selected by considering some conditions in the robot soccer
environment [12]. However, it is not possible to describe all the situations of a robot soccer game by condition statements. Moreover, as the environment under consideration is dynamic and complex, reinforcement learning should be employed for the selection of the proper action. In reality, it is very difficult to obtain a model of the robot soccer game, so the agent learns its own action by reinforcement learning [1,2]. Reinforcement learning is the problem faced by an agent that learns its behavior through trial-and-error interactions with a dynamic environment [5,6]. The agent only knows the possible states and actions, not the transition probabilities or the reward structure [11]. Among the reinforcement learning methods, Q-learning can be used as it is applicable where no model of the environment is available [8,16]. In this paper, modular Q-learning is applied to improve the performance of a team playing in the NaroSot (Nano-Robot World Cup Soccer Tournament) category of FIRA (Federation of International Robot Soccer Association), where five robots of size 4 cm × 4 cm × 5.5 cm form a team. Modular Q-learning is a reinforcement learning scheme in which a mediator module selects the proper action of a robot based on the Q-values obtained from the learning modules. When selecting the proper action, state information, such as the distance between the ball and the robot and the angle between the robot heading angle and the desired angle, is also considered in the mediator module along with the Q-value to improve the learning performance. The concept of coupled agents is proposed to resolve conflicts between robots when the ball is located in a boundary region. A uni-vector field method is used for the navigation of the robot.
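For reference, the one-step tabular Q-learning backup underlying the scheme described above can be sketched as follows, using the learning rate and discount factor reported later in Section 3; the states, actions and reward here are illustrative placeholders, not the soccer states.

```python
from collections import defaultdict

def q_update(Q, s, a, r, s_next, actions, alpha=0.9, gamma=0.3):
    """One Q-learning backup:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]

# Illustrative states (0, 1) and actions; not the paper's soccer states.
Q = defaultdict(float)
actions = ["kick", "hold"]
val = q_update(Q, 0, "kick", 1.0, 1, actions)
# With an empty table, Q[(0, "kick")] becomes 0.9 * (1.0 + 0.3 * 0 - 0) = 0.9
```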
In the robot soccer environment, the effectiveness and applicability of modular Q-learning and the uni-vector field method are verified by real experiments using five micro-robots of the team Y2K2, ranked as a runner-up at the FIRA Robot World Cup Brazil '99. Section 2 describes the robot soccer system, the structure of the robot, the uni-vector field for navigation, basic actions, and robot soccer strategies. This is followed by modular Q-learning and its implementation for robot soccer in Section 3. The experimental results are presented in Section 4 and concluding remarks are given in Section 5.

2. Robot soccer system

2.1. NaroSot robot soccer system

The micro-robot soccer system, which comprises robots, an overhead vision system and a host computer, is being used as a practical test bed for developing multi-agent systems and multiple robot systems. The complexity of the robot soccer system comes from the cooperation with the home team robots, the competition with the opponent team robots, and the fast and precise control of each robot while tracking the ball, which is the passive constituent of the dynamic environment. We now describe the NaroSot (Nano-Robot World Cup Soccer Tournament) system, one of the categories of the FIRA games. In the NaroSot category, each team has five robots of size 4 cm × 4 cm × 5.5 cm. The pitch is 150 cm × 90 cm in size and a ping-pong ball is used. Fig. 1 shows NaroSot robots and a ping-pong ball in the playground. Due to the size limitation, encoders are not used and only vision information is used as feedback; hence, precise and fast robot control is difficult. The host computer receives the vision signals and uses them to compute the strategy routine and the command velocities, which are then sent to the robots. The strategy routine selects a proper action for each robot considering the game situation.
The robot receives the velocity data sent from the host computer through an RF (radio frequency) transmitter and controls the motor velocities using the command data. The developed robots have two centrally aligned wheels, which are easier to control. The width D between the two wheels of the robot is 3.5 cm and the radius R of each wheel is 1.0 cm.

Fig. 1. The NaroSot robots.

Each robot is composed
of four parts: a micro-controller part, an RF communication module, two DC motors with motor driving chips, and a power supply unit. The micro-controller PIC16C73A is used for processing the command data and for computing the motor control using two PWM signals. The RF module is used for communication between the host computer and the robot. The motors have a 6:1 gear ratio and no encoders. Rechargeable 9.6 V cells are used as the power supply and a regulator provides the logic power supply.

Two-wheeled mobile robots are considered under the assumptions of non-slipping and pure rolling [4]. The kinematics can be derived using Fig. 2, where X, Y are the global coordinates.

Fig. 2. Robot modeling.

Posture P and position p_c of the robot are defined as

P = [x_c \;\; y_c \;\; \theta_c]^T, \quad p_c = [x_c \;\; y_c]^T, \qquad (1)

where (x_c, y_c) is the position of the robot center and θ_c the heading angle of the robot with respect to the global coordinates. The velocity vector S is defined as

S = \begin{bmatrix} \upsilon \\ \omega \end{bmatrix}
  = \begin{bmatrix} (V_R + V_L)/2 \\ (V_R - V_L)/D \end{bmatrix}
  = \begin{bmatrix} 1/2 & 1/2 \\ -1/D & 1/D \end{bmatrix}
    \begin{bmatrix} V_L \\ V_R \end{bmatrix}, \qquad (2)

where υ is the translational velocity of the robot center, ω the rotational velocity with respect to the robot center, V_L the left wheel velocity and V_R the right wheel velocity. The translational and rotational velocities are thus obtained from the two wheel velocities. The velocity vector S and the posture vector P are related through the robot kinematics as

\dot{P} = \begin{bmatrix} \dot{x}_c \\ \dot{y}_c \\ \dot{\theta}_c \end{bmatrix}
        = \begin{bmatrix} \cos\theta_c & 0 \\ \sin\theta_c & 0 \\ 0 & 1 \end{bmatrix} S
        = \begin{bmatrix} \cos\theta_c & 0 \\ \sin\theta_c & 0 \\ 0 & 1 \end{bmatrix}
          \begin{bmatrix} \upsilon \\ \omega \end{bmatrix}. \qquad (3)

2.2. Uni-vector field navigation

Fig. 3 shows the proposed uni-vector field, where each tiny circle with a small dash attached to it denotes a robot position and heading: the circle represents the robot position and the short line attached to it represents its heading direction [13]. A slightly bigger version of the same symbol is used in the figure to represent the initial position of each of the five robots.
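A minimal sketch of the wheel-to-body map of Eq. (2) and the posture kinematics of Eq. (3), assuming wheel velocities given in cm/s (the units are an assumption; the text gives D = 3.5 cm):

```python
import math

D = 3.5  # wheel base [cm], as given in the paper

def wheel_to_body(v_left, v_right):
    """Eq. (2): translational and rotational velocity from wheel velocities."""
    v = (v_right + v_left) / 2.0   # translational velocity of the robot center
    w = (v_right - v_left) / D     # rotational velocity about the robot center
    return v, w

def posture_rate(theta_c, v, w):
    """Eq. (3): (x_dot, y_dot, theta_dot) in global coordinates."""
    return v * math.cos(theta_c), v * math.sin(theta_c), w

# Equal wheel speeds give straight-line motion: v = 10.0, w = 0.0
v, w = wheel_to_body(10.0, 10.0)
```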
A vector field at position p = (x, y) is denoted F(p) or F(x, y). It is assumed that the magnitude of the vector field is unity and the same at all points [14]. The vector field at robot position p is generated by

F(p) = \angle\overline{pg} - n\phi \quad \text{with} \quad \phi = \angle\overline{pr} - \angle\overline{pg}, \qquad (4)

where n is a positive constant. The larger n is, the smaller F(p) is at the same robot position. Thus, if n increases, the uni-vector field spreads out over a larger area, making the path traversed by the robot in reaching its goal longer. The shape of the field and the turning motion of the robots change according to the parameter n and the length of the line \overline{gr}. The proposed uni-vector field method is based on (4), through which the vector field at all points can be obtained. In Fig. 3, g represents the target position of the robots. A dummy point r, selected heuristically close to the goal point g, is used for deriving the vector field. In practical applications, the point g will be the position of the ball. The following relationships are used to reduce the error in angle between the robots and the field vectors:

\omega = K_P \theta_e + K_D \dot{\theta}_e, \quad \theta_e = F(p) - \theta_c, \quad \dot{\theta}_e = \frac{d\theta_e}{dt}, \qquad (5)

where F(p) is the vector field at position p with unit magnitude, θ_e the error in angle between the robot
heading and the field vector direction, \dot{\theta}_e the derivative of θ_e, K_P the proportional feedback gain, and K_D the derivative feedback gain.

Fig. 3. Uni-vector field method.

The translational velocity υ is constant. If υ = 0, the robot's heading angle turns towards the direction of F(p) without any change in position. As indicated by (5), the robot motion is controlled through its right and left wheel velocities, which are functions of time:

V_R = V_C + K_P \theta_e + K_D \dot{\theta}_e, \quad V_L = V_C - K_P \theta_e - K_D \dot{\theta}_e, \qquad (6)

where V_C is the constant robot center velocity. The robot's vector field will be oriented towards the target position and the associated angle of the robot motion is as shown in Fig. 3.

3. Implementation of modular Q-learning

3.1. Modular Q-learning

Q-learning is a reinforcement learning algorithm that does not need a model of the environment and can be used on-line. Q-learning algorithms store the expected reinforcement value associated with each state-action pair, usually in a look-up table. However, in applying Q-learning to a multi-agent system, there are some difficult problems because the dimension of the state space grows exponentially with the number of agents. For example, consider two agents engaged in a joint task, where a joint task implies two agents working together to find an optimal method to kick the ball. Assume that 10^3 states are needed for a single agent to learn; then, for the joint task mentioned above, the total number of states grows to 10^6. As actions are needed in every state, multi-agent learning needs correspondingly more memory. Such an application of Q-learning to this kind of multi-agent learning problem results in the explosion of the state and memory space. To overcome this problem, modular Q-learning is employed.

Fig. 4 shows the architecture of modular Q-learning [15]. The architecture consists of learning modules, equal in number to the agents involved in the task, and a mediator module. Each agent in a learning module carries out Q-learning in the environment. In each learning module, learning concentrates on a single agent and the learning of the other agents is not considered. To accomplish the global goal, a mediator module is needed to arbitrate the results of the learning modules. The mediator module makes the final decision and selects the most suitable action based on the Q-value
received from each learning module.

Fig. 4. Modular Q-learning architecture.

In [15], the mediator module makes this selection by considering the highest Q-value received from the learning modules. This selection method is called the greatest mass merging strategy. However, in a real experimental environment, convergence to the optimal Q-values within a finite number of iterations is often not possible. So, it is desirable to select the most suitable action by considering an appropriate function of the Q-value and the state information. In this paper, the following function is used to make the final decision in the mediator module:

\arg\max_a f(Q_i(s_i, a), \theta_i, d_i), \qquad (7)

where a is an action of the agent and i the index of the learning module. The Q-value Q_i is obtained from the learning module and θ_i is calculated as (90° − θ_e). If Robot 1 and Robot 2 form a coupled agent, then, with d_1 the distance between Robot 1 and the ball and d_2 the distance between Robot 2 and the ball, d_i for Robot 1 is computed as d_i = d_2 − d_1. Note that θ_i and d_i are considered in the mediator module to select the final action.

In a robot soccer game, robots play the roles of attackers, defenders and goalie. In the robot soccer system implemented for the NaroSot category, there are two attackers, two defenders and a goalie. All attackers and defenders have only two actions: either to shoot or to follow the uni-vector field. In the uni-vector field, the target point is the position of the ball. The action selection layer, as a coordinator, selects the shoot action when the robot is in a good position to do so. Under normal conditions, robots follow the uni-vector field. A robot following the uni-vector field selects the shoot action when its longitudinal position is within the boundary of the ball.
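The arbitration of Eq. (7) can be sketched as follows; the linear form of f mirrors the weighted sum used later in the experiments, but the candidate numbers and the scales of the θ and d terms below are purely illustrative.

```python
def mediator_score(q, theta, d, eta_q=0.5, eta_theta=0.3, eta_d=0.2):
    """Illustrative linear form of f(Q_i(s_i, a), theta_i, d_i) in Eq. (7)."""
    return eta_q * q + eta_theta * theta + eta_d * d

def select_action(candidates):
    """candidates: {(robot, action): (q, theta, d)}; arg max of the score."""
    return max(candidates, key=lambda k: mediator_score(*candidates[k]))

# Hypothetical coupled agent: favorable state information (alignment,
# distance) can override a slightly higher raw Q-value.
candidates = {
    ("robot1", "kick"): (0.9, 10.0, -5.0),   # score 0.45 + 3.0 - 1.0 = 2.45
    ("robot2", "kick"): (0.8, 40.0, 5.0),    # score 0.40 + 12.0 + 1.0 = 13.4
}
best = select_action(candidates)  # ("robot2", "kick")
```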
In the shoot action, the target to which the robot kicks the ball is the center of the opponent goal area. The velocity of a robot in the shoot action is faster than its velocity while following the uni-vector field. The goalie has its own actions within the goal area for defending the goal. The role allocation layer, at a higher level, selects the role of a robot according to the situation. The implemented robot soccer system uses a relative fixed role allocation scheme with a (1 goalie, 2 defenders, 2 attackers) formation strategy. The term "relative fixed role allocation" is used because the zones of the robots, though fixed in most cases, are changed for short intervals in some specific situations.

In the zone defense scheme, conflicts can arise among the home robots near the boundary regions; this is classified as the non-blocked situation. On the other hand, a blocked situation arises when a home robot is blocked by opponent robots. With respect to the ball position, there are three cases in the non-blocked situation. Fig. 5 shows the three boundary regions corresponding to the three cases in the non-blocked situation.

Fig. 5. The three learning regions in the non-blocked situation.

3.2. Implementation

Modular Q-learning is employed in the robot soccer system to improve cooperation among the robots in the team so as to carry out zone defense strategies. To apply modular Q-learning to the robot soccer system, the concept of coupled agents is introduced, as shown in Fig. 6. When the ball is within the boundary region of two robots, both robots will be in a position to kick the ball, which may lead to their collision. To solve this problem, a coupled agent composed of these two robots is formed. For example, if the ball is located in Region 1, Attacker 1 and Defender 1 are considered as a coupled agent. The mediator module assigns an action to each robot in a coupled agent based on (7). The action is either to kick the ball or to maintain its current position.

Fig. 6. The coupled agent.

For learning, the initial Q-values are randomly chosen in the range [0, 0.02]. The learning rate α is set to 0.9 for fast convergence during the learning process. The discount factor γ is set to 0.3, a relatively low value, to reduce the possible noise effect. The noise effect arises because γ multiplies the maximum Q-value of the next state: in real experiments it is possible for the robot to kick the ball unexpectedly, in which case the Q-value is updated as if rewarded, which is not desirable for precise learning. To compensate for this, the discount factor γ is set to a low value.

3.2.1. Non-blocked situation

First, consider Region 1, where a state in the learning module of an individual agent consists of five components:

1. The robot location: two levels (either Area 1 or Area 3 occupied by the robot).
2. The difference in distance d: four levels.
3. The angle error between the robot heading direction and the ball: three levels.
4. A binary flag: 1 if R_x > B_x and (B_y − (4/3)B_w) < R_y < (B_y + (4/3)B_w), 0 otherwise.
5. A binary flag: 1 if the other robot in the coupled agent is to kick the ball, 0 otherwise.
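The five components above (2 × 4 × 3 × 2 × 2 = 96 states) can be packed into a single look-up-table index, for example with a mixed-radix encoding; the component ordering below is an assumption for illustration.

```python
# Level counts per component: location, distance difference, angle error,
# position flag, other-robot-kicks flag (2 x 4 x 3 x 2 x 2 = 96 states).
LEVELS = (2, 4, 3, 2, 2)

def encode(components):
    """Mixed-radix encoding of the component levels into one state index."""
    idx = 0
    for value, base in zip(components, LEVELS):
        assert 0 <= value < base
        idx = idx * base + value
    return idx

n_states = 1
for base in LEVELS:
    n_states *= base
# n_states == 96; the maximal component tuple maps to index 95:
# encode((1, 3, 2, 1, 1)) == 95
```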
With d_1 the distance between Robot 1 and the ball and d_3 the distance between Robot 3 and the ball, d is computed in the learning module of Robot 1 as the difference d = d_1 − d_3; from the learning module of Robot 3, d = d_3 − d_1. So, d is in fact the negative of d_i in Eq. (7). In the learning module of Robot 1, R_x (R_y) is the X (Y) coordinate of the robot, B_x (B_y) is the X (Y) coordinate of the ball, and B_w is the width of the ball. Fig. 7(a)-(e) shows each component of the state for learning in the non-blocked situation. There are 96 states in each of the three cases of the non-blocked situation. For example, in Region 1, Robot 1 and Robot 3 form the coupled agent and each robot has a learning module with 96 states. The mediator module selects the final action of each robot of the coupled agent. In Region 2, the first component of the state is either Area 1 or Area 2, and the second component is d = d_1 − d_2 from the viewpoint of Robot 1 and d = d_2 − d_1 from the viewpoint of Robot 2, where d_1 is the distance between Robot 1 and the ball and d_2 the distance between Robot 2 and the ball. The states of Region 3 are similar to those of Region 1, as it is symmetric. Table 1 lists the actions which the coupled agent can select in Regions 1 and 2. For example, if Action 1 is selected in Region 1, Robot 1 is Attacker 1 and Robot 3 is Defender 1. When the ball moves to Region 3, Robot 2 becomes Attacker 2 and Robot 4 assumes defense as Defender 2. The reward is assigned as

r = \frac{a}{t_1 + t_{\text{const1}}} + \frac{b}{t_2 + t_{\text{const2}}}, \qquad (8)
Fig. 7. State components in Region 1.

Table 1
Actions of the coupled agent in the non-blocked situation

  Region 1                              Region 2
  Action  Robot 1     Robot 3           Action  Robot 1     Robot 2
  1       Attacker 1  Defender 1        1       Attacker 1  Attacker 2
  2       Attacker 1  Kicker            2       Attacker 1  Kicker
  3       Kicker      Defender 1        3       Kicker      Attacker 2
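Table 1 can be kept as a simple lookup structure, for instance (the dictionary layout is an illustrative choice, not from the paper):

```python
# Table 1 as lookup tables: joint actions of the coupled agents in the
# non-blocked situation, keyed by action number.
ACTIONS_REGION1 = {
    1: {"Robot 1": "Attacker 1", "Robot 3": "Defender 1"},
    2: {"Robot 1": "Attacker 1", "Robot 3": "Kicker"},
    3: {"Robot 1": "Kicker", "Robot 3": "Defender 1"},
}
ACTIONS_REGION2 = {
    1: {"Robot 1": "Attacker 1", "Robot 2": "Attacker 2"},
    2: {"Robot 1": "Attacker 1", "Robot 2": "Kicker"},
    3: {"Robot 1": "Kicker", "Robot 2": "Attacker 2"},
}
```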
where t_1 is the time taken by the kicking robot to kick the ball and t_2 the time taken by the ball to reach the boundary of any other robot situated outside the learning region; a, b, t_const1 and t_const2 are constants, with t_const1 and t_const2 preventing the reward value from growing without bound.

Fig. 8. Two cases of the blocked situation.

3.2.2. Blocked situation

The states and the reward in the blocked situation are similar to those of the non-blocked situation. In the zone defense scheme, there are two cases where the attacker cannot execute its own role because of blocking. Fig. 8 shows the two cases of the blocked situation. Consider Region 1, where a state in the learning module consists of four components:

1. The robot location: two levels (either Area 2 or Area 3).
2. A binary flag: 1 if R_y < B_y, 0 otherwise.
3. The blocking flag, the difference-in-distance level (four levels) and the angle-error level (three levels), combined: 13 levels.
4. A binary flag: 1 if the other robot in the coupled agent is to kick the ball, 0 otherwise.

Here R_y is the Y coordinate of the position of the blocked robot and B_y the Y coordinate of the ball.

Fig. 9. State components in the blocked situation.
In Region 2, the first component of the state is either Area 1 or Area 4. Fig. 9(a)-(d) shows each component of the state for learning in the blocked situation. There are 104 states in each of the two cases of the blocked situation. Table 2 shows the actions which the coupled agent can take in Regions 1 and 2; the actions in Region 2 are similar to those of the coupled agent in Region 1. The reward is assigned as

r = \frac{a}{t + t_{\text{const1}}}, \qquad (9)

where t is the time taken by the kicking robot to kick the ball, and a and t_const1 are constants.

Table 2
Actions of the coupled agent in the blocked situation

  Region 1                              Region 2
  Action  Robot 2     Robot 3           Action  Robot 1     Robot 4
  1       Attacker 2  Defender 1        1       Attacker 1  Defender 2
  2       Attacker 2  Kicker            2       Attacker 1  Kicker
  3       Kicker      Defender 1        3       Kicker      Defender 2

4. Experimental results

The mediator module comes into play when both robots in a coupled agent tend to kick the ball. The action of the coupled agent is selected by considering the Q-values obtained from the learning modules and the state information; the angle error and the distance to the other robot in the coupled agent are used as the state information. In the selection equation (7) of the mediator module, f(Q_i(s_i, a), θ_i, d_i) is given by

f(Q_i(s_i, a), \theta_i, d_i) = \eta_{Q_i} Q_i(s_i, a) + \eta_{\theta_i} \theta_i + \eta_{d_i} d_i, \qquad (10)

where η_{Q_i}, η_{θ_i} and η_{d_i} are constant coefficients; in the experiments, the values 0.5, 0.3 and 0.2, respectively, were used. The mediator module selects the final action of each robot of the coupled agent based on this modified Q-value and the state information. The sampling time used in the real robot soccer system is 18 ms.

4.1. Non-blocked situation

In Eq. (8), a = 12,000, b = 6, t_const1 = 18 and t_const2 = 3 were used for the reward, where a corresponds to the time interval selected for the kicking robot to kick the ball, which is limited to the millisecond range.
b was obtained by experiment and t_const2 was determined heuristically. In Region 1 of the non-blocked situation, it took 280 trials to obtain the Q-values, which are considered suboptimal; in Region 2, it took 210 trials. The Q-values of the third region are the same as those of the first region. Fig. 10(a) shows the trajectories of the two robots when the ball is in Region 1. After the learning phase, Robot 3 took up the task of kicking the ball; it took 2.790 s (155 steps) to kick the ball. Had Robot 1 assumed this task, it would have taken 3.006 s (167 steps). Instead, Robot 1 took up position in the defense zone left vacant by Robot 3. In Fig. 10(a), the initial positions of the ball, Robot 1 and Robot 3 were (46.50 cm, cm), (86.05 cm, cm) and (22.21 cm, cm), respectively. Fig. 10(b) shows the trajectories of the two robots when the ball is in Region 2. Robot 1 kicked the ball after learning; it took 1.692 s (94 steps). If the other robot in the coupled agent had kicked the ball, it would have taken 1.854 s (103 steps). The initial positions of the ball, Robot 1 and Robot 2 were (97.56 cm, cm), (80.99 cm, cm) and (72.02 cm, cm), respectively.

4.2. Blocked situation

For the reward in the blocked situation, a = 20,000 and t_const1 = 18 were used. These values were determined similarly to the non-blocked situation.
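With the constants above, the rewards of Eqs. (8) and (9) can be sketched as follows; the time units follow the experiment (t_1 limited to the millisecond range, per the text).

```python
# Rewards of Eqs. (8) and (9) with the constants reported in Section 4
# (non-blocked: a = 12000, b = 6, t_const1 = 18, t_const2 = 3;
# blocked: a = 20000, t_const1 = 18). t1 is the time for the kicking robot
# to kick the ball, t2 the time for the ball to reach another robot's
# boundary; t_const1 and t_const2 keep the reward bounded as times go to 0.
def reward_non_blocked(t1, t2, a=12_000, b=6, tc1=18, tc2=3):
    return a / (t1 + tc1) + b / (t2 + tc2)        # Eq. (8)

def reward_blocked(t, a=20_000, tc1=18):
    return a / (t + tc1)                          # Eq. (9)
```

Faster kicks (smaller t_1, t_2, t) yield larger rewards, which matches the experimental criterion of assigning the kick to the robot needing the least time.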
Fig. 10. Non-blocked situation: (a) Robot 3 kicked the ball after the learning phase in Region 1; (b) Robot 1 kicked the ball after the learning phase in Region 2.

Three hundred trials were needed for convergence in the first region of the blocked situation. The Q-values of the second region are the same as those of the first region because the regions are symmetric. Fig. 11 shows the trajectories of the two robots in the blocked situation in Region 1. Robot 2 assisted the blocked Robot 1 and it took s to kick the ball (if Robot 3 had been the kicker, it would have taken s). The initial positions of the ball, Robot 2 and Robot 3 were (91.83 cm, cm), (70.78 cm, cm) and (21.74 cm, cm), respectively.

Fig. 11. Robot 2 assists Robot 1 after learning in the blocked situation.

4.3. Effect of the modified Q-value in the mediator module

In the above results, the mediator module did not use any of the state information to determine the action of the coupled agent. The effectiveness of the modified Q-value in the mediator module, which makes the final selection for the coupled agent, is brought out in the real experiment. Fig. 12(a) shows the trajectories of the two robots in the non-blocked situation in Region 2. In this case, the mediator module arbitrates the actions of the two robots. As shown in Fig. 12(a), Robot 2 kicked the ball; it took 100 steps (1.800 s) to do so. Considering the Q-values of the learning modules, the Q-value of the kick action was larger than that of the other actions for each robot, and the mediator module selected the final action with the larger Q-value. The initial positions of the ball, Robot 1 and Robot 2 were (95.26 cm, cm), (76.62 cm, cm) and (74.01 cm, cm), respectively. Fig. 12(b) shows the same situation as Fig. 12(a), but here the mediator module considers the Q-value received from each learning module together with the state information described in (7). In this situation,
Fig. 12. Effect of the modified Q-value: (a) Robot 2 kicked the ball after learning, with only the Q-value; (b) Robot 1 kicked the ball after learning, with the Q-value and state information.

Robot 1 kicked the ball. It took 83 steps (1.494 s) for this action. Note that the time for Robot 1 to kick the ball (Fig. 12(b)) is shorter than the time taken by Robot 2 to do so (Fig. 12(a)).

4.4. Boundary region of four robots

As shown in Fig. 13, in the non-blocked situation it is possible for the ball to be in the region common to the three regions. In this case, the coupled agent includes four robots, and the problem is how to select the right robot for the kicking action. All four robots have two Q-values, obtained from the regions of the non-blocked situation: Q_11 and Q_13 are determined in Region 1, and Q_21 and Q_22 are obtained in Region 2. Q_32, Q_34, Q_43 and Q_44 are decided in Region 3 and Region 4 (Fig. 14); the Q-values of Regions 3 and 4 are the same as those of Regions 1 and 2.

Fig. 13. The ball is located in the boundary region.

Fig. 14. Q-values of the four robots.

In Regions 1-3 of the non-blocked situation, only two robots need be chosen as the coupled agent for learning. In the situation now considered, four robots form a coupled agent and the kick has to be assigned to one of them. The mediator module arbitrates between the robots when the Q-value of each robot for the kick action is larger than its Q-value for retaining its position. In such a situation, the average of the Q-values of each robot in the two regions considered becomes the deciding factor in the mediator module. Together with
Fig. 15. After the learning phase in the boundary region: (a) Robot 1 kicked the ball, with only the Q-value; (b) Robot 3 kicked the ball, with the Q-value and state information.

Fig. 16. The other two cases in the boundary region: (a) Attacker 2 kicked the ball; (b) Defender 2 kicked the ball.

this average value, the state information in Eq. (7) is also considered in assigning the kick action. Note that in this situation, d_i in Eq. (7) is computed as the sum of the distances between the ith robot and each of the remaining robots in the coupled agent. Fig. 15(a) shows the trajectories of the four robots when the ball was in the boundary region of the four robots. Considering only the Q-value information received from the learning modules, the kick action was assigned to Robot 1, and it took 3.924 s (218 steps) to kick the ball. The initial positions of the ball, Robot 1, Robot 2, Robot 3 and Robot 4 were (47.86 cm, cm), (83.50 cm, cm), (88.35 cm, cm), (24.62 cm, cm) and (23.92 cm, cm), respectively. The labels 1, 2, 3, 4 and B in Fig. 15 denote the initial positions of Robots 1, 2, 3, 4 and the ball, respectively; these symbols are used in all other figures. In this initial position, the Q-value of Robot 1 as a kicker is greater than its Q-value as
Attacker 1. For Robot 3 also, its Q-value for the kick action is greater than its Q-value as Defender 1. For Robot 2, the Q-value for the kick action is smaller than its Q-value as Attacker 2, and for Robot 4, its Q-value as a kicker is smaller than its Q-value as Defender 2. Between Robot 1 and Robot 3, Robot 1 has the higher Q-value as a kicker. However, the kick assignment is decided by the mediator module using the modified Q-value, which is obtained by considering the state information; Robot 3, which has the greater modified Q-value, qualifies as the kicker. Fig. 15(b) shows the trajectories of the four robots when Robot 3 kicked the ball. It took 2.214 s (123 steps) to kick the ball. The times that would have been taken by Robot 2 and Robot 4 to kick the ball were also investigated; these situations are depicted in Fig. 16(a) and (b), respectively. It took 2.574 s (143 steps) for Robot 2 and 2.376 s (132 steps) for Robot 4 to kick the ball. Thus, the experiment confirmed that the robot assigned the kick action by the modified Q-value method takes the least time.

5. Conclusions

This paper proposed an action selection mechanism for the robots in a robot soccer game. The action selection problem of the zone defense scheme is divided into two situations: the non-blocked case and the case of blocking by opponents. The non-blocked case is a situation of conflict among the home robots near the boundary regions; the blocked case corresponds to a home robot being blocked by opponent robots. The modular Q-learning architecture was used to solve the action selection problem, specifically to select the robot that needs the least time to kick the ball and assign it this task. The concept of the coupled agent was used to resolve conflicts in action selection among robots.
A uni-vector field method was employed for navigation of the robots. The mediator module selects the final action of the coupled agent by considering the Q-values received from the learning modules and the state information. The effectiveness of the scheme was demonstrated through real robot soccer experiments.

References

[1] R.S. Sutton, A.G. Barto, Reinforcement Learning: An Introduction, Bradford Books/MIT Press, Cambridge, MA, 1998.
[2] C.J.C.H. Watkins, P. Dayan, Q-learning, Machine Learning 8 (1992).
[3] C.R. Kube, H. Zhang, Collective robotics: From social insects to robots, Adaptive Behavior 2 (2) (1993).
[4] G. Campion, G. Bastin, B. d'Andréa-Novel, Structural properties and classification of kinematic and dynamic models of wheeled mobile robots, IEEE Transactions on Robotics and Automation 12 (1) (1996).
[5] L.P. Kaelbling, M.L. Littman, A.W. Moore, Reinforcement learning: A survey, Journal of Artificial Intelligence Research 4 (1996).
[6] T.W. Sandholm, R.H. Crites, Multiagent reinforcement learning in the Iterated Prisoner's Dilemma, Biosystems 37 (1996).
[7] H.-S. Shim, H.-S. Kim, M.-J. Jung, I.-H. Choi, J.-H. Kim, J.-O. Kim, Designing distributed control architecture for cooperative multi-agent system and its real-time application to soccer robot, Robotics and Autonomous Systems 21 (2) (1997).
[8] P.V.C. Caironi, M. Dorigo, Training and delayed reinforcements in Q-learning agents, International Journal of Intelligent Systems 12 (10) (1997).
[9] L.E. Parker, ALLIANCE: An architecture for fault tolerant multirobot cooperation, IEEE Transactions on Robotics and Automation 14 (2) (1998).
[10] J.-H. Kim, H.-S. Shim, H.-S. Kim, M.-J. Jung, I.-H. Choi, K.-O. Kim, A cooperative multi-agent system and its real time application to robot soccer, in: Proceedings of the IEEE International Conference on Robotics and Automation, Minneapolis, MN, 1996.
[11] C. Boutilier, Planning, learning and coordination in multiagent decision processes, in: Proceedings of the Sixth Conference on Theoretical Aspects of Rationality and Knowledge, Netherlands, 1996.
[12] S.H. Lee, J. Bautista, Motion control for micro-robots playing soccer games, in: Proceedings of the IEEE International Conference on Robotics and Automation, Leuven, Belgium, 1998.
[13] J.-H. Kim, K.-C. Kim, D.-H. Kim, Y.-J. Kim, P. Vadakkepat, Path planning and role selection mechanism for soccer robots, in: Proceedings of the IEEE International Conference on Robotics and Automation, Leuven, Belgium, 1998.
[14] Y.-J. Kim, D.-H. Kim, J.-H. Kim, Evolutionary programming-based vector field method for fast mobile robot navigation, in: Proceedings of the Second Asia-Pacific Conference on Simulated Evolution and Learning.
[15] N. Ono, K. Fukumoto, Multi-agent reinforcement learning: A modular approach, in: Proceedings of the Second International Conference on Multi-agent Systems, AAAI Press, 1996.
[16] G.A. Rummery, Problem solving with reinforcement learning, Ph.D. Thesis, Cambridge University, Cambridge, UK, 1995.
Kui-Hong Park received his B.S. degree in Electrical Engineering from Hanyang University, Seoul, South Korea, in 1997 and his M.S. degree in Electrical Engineering from the Korea Advanced Institute of Science and Technology (KAIST), Taejon, South Korea, in 1998. He is currently working towards the Ph.D. degree in Electrical Engineering at KAIST. His main research interests include multi-agent systems and machine intelligence. Mr. Park received second-place awards both in the 1998 Nano-Robot World Cup Soccer Tournament (NaroSot) in France and in NaroSot 99 in Brazil.

Yong-Jae Kim received his B.S. and M.S. degrees in Electrical Engineering from the Korea Advanced Institute of Science and Technology, Taejon, South Korea, in 1996 and 1998, respectively. He is currently working towards the Ph.D. degree in Electrical Engineering at the same institute. His research interests include motion planning of mobile systems and machine intelligence. Mr. Kim is the recipient of the third award at MiroSot 97 and the first award at the Robot Soccer American Cup.

Jong-Hwan Kim received his B.S., M.S., and Ph.D. degrees in Electronics Engineering from Seoul National University, South Korea, in 1981, 1983, and 1987, respectively. Since 1988, he has been with the Department of Electrical Engineering at the Korea Advanced Institute of Science and Technology, where he is currently a Professor. He was a Visiting Scholar at Purdue University from September 1992 to August 1993. His research interests are in the areas of evolutionary multi-agent robotic systems. He is an Associate Editor of the IEEE Transactions on Evolutionary Computation and of the International Journal of Intelligent and Fuzzy Systems. He is one of the co-founders of the Asia Pacific Conference on Simulated Evolution and Learning, and the General Chair of the Congress on Evolutionary Computation 2001. His name is included in the Barons 500 Leaders for the New Century as the Founder of FIRA (Federation of International Robot-Soccer Association) and of IROC for the Robot Olympiad. He is now serving FIRA and IROC as President. He was the Guest Editor of the special issue on MiroSot 96 of the journal Robotics and Autonomous Systems and of the special issue on Soccer Robotics of the Journal of Intelligent Automation and Soft Computing. Dr. Kim is the recipient of the 1988 Choongang Young Investigator Award from the Choongang Memorial Foundation, the LG YonAm Foundation Research Fellowship in 1992, the Korean Presidential Award in 1997, and the SeoAm Foundation Research Fellowship in 1999.