Learning to Acquire Whole-Body Humanoid Center of Mass Movements to Achieve Dynamic Tasks


Advanced Robotics 22 (2008) 1125-1142

Full paper

Learning to Acquire Whole-Body Humanoid Center of Mass Movements to Achieve Dynamic Tasks

Takamitsu Matsubara a,b,*, Jun Morimoto b,c, Jun Nakanishi b,c, Sang-Ho Hyon b,c, Joshua G. Hale b,c and Gordon Cheng b,c

a Nara Institute of Science and Technology, Takayama-cho, Ikoma, Nara, Japan
b ATR Computational Neuroscience Laboratories, Hikaridai, Seika-cho, Soraku-gun, Kyoto, Japan
c ICORP Computational Brain Project, Japan Science and Technology Agency, Honcho, Kawaguchi, Saitama, Japan

Received 4 March 2008; accepted 19 March 2008

Abstract
This paper presents a novel approach for acquiring dynamic whole-body movements on humanoid robots, focused on learning a control policy for the center of mass (CoM). In our approach, we combine a model-based CoM controller and a model-free reinforcement learning (RL) method to acquire dynamic whole-body movements in humanoid robots. (i) To cope with high dimensionality, we use a model-based CoM controller as a basic controller that derives joint angular velocities from the desired CoM velocity; the balancing issue can also be considered in the controller. (ii) The RL method is used to acquire a controller that generates the desired CoM velocity based on the current state. To demonstrate the effectiveness of our approach, we apply it to a ball-punching task on a simulated humanoid robot model. The acquired whole-body punching movement was also demonstrated on Fujitsu's Hoap-2 humanoid robot.
© Koninklijke Brill NV, Leiden and The Robotics Society of Japan, 2008

Keywords: Reinforcement learning, humanoid robot, whole-body movement, policy-gradient method

1. Introduction

Since their physical structure resembles that of humans, humanoid robots can be expected to help us with many tasks in our normal living environment, without specifically needing additional environmental customization. Therefore, interest continues to grow in the development of humanoid robots and their control methods to achieve whole-body dynamic movements in these systems [1-4].

* To whom correspondence should be addressed. E-mail: takam-m@is.naist.jp

In particular, over the last decade, a number of methods for achieving various tasks on a humanoid robot have been explored, mainly for biped walking and balancing [5-8]. Even though a number of real humanoid robots have demonstrated whole-body dynamic movements with these existing methods, it remains impossible to introduce humanoid robots into our living spaces to help us in our daily lives. This is in large part caused by their inability to adapt to new environments as easily as humans and animals, i.e., due to a lack of motor learning ability.

One candidate solution for granting motor learning skills to humanoid robots is reinforcement learning (RL), a promising method because it requires no expert teachers or idealized desired behavior to improve skills. RL is a framework for improving the control rules of an agent, i.e., a robot, through iterative interaction with the environment based on a trial-and-error paradigm, without using an explicit model of the environment [9, 10]. However, with increasing dimensionality of the state and action spaces, RL often requires not only a large number of iterations but also a large computational cost, especially for learning a complex control policy for motor learning. Although many researchers have attempted to apply RL methods to several robots, in simulations and on real hardware systems, to acquire desired movements, so far most of the robots to which learning has been successfully applied have only a small number of d.o.f., far fewer than typically offered by humanoid robots [11-15]. To the best of our knowledge, only one attempt has successfully learned desired movements on a small humanoid robot [16]; that work focused on learning biped walking.

In this paper, we present a novel approach for acquiring dynamic whole-body movements on humanoid robots by focusing on learning a control policy for the center of mass (CoM). The CoM is one of the most important features of a humanoid robot because it approximately represents the whole-body motion of the robot. Moreover, as suggested by experimental studies such as Ref. [17], it can also be considered a control variable that humans use during functional tasks to overcome the curse of dimensionality. Due to its low dimensionality, learning a CoM movement for a given task can be simpler than directly learning all joint movements. Therefore, we propose combining a model-based CoM controller with a model-free RL approach. A drawback of the model-based CoM controller is that it considers only highly approximated dynamics, comprised of the CoM and the zero moment point (ZMP), to design a CoM controller. This approximation can cause poor tracking performance for a given desired trajectory, and model-based approaches are always affected by modeling errors. However, with a model-based CoM controller we can derive joint angular velocities from the desired CoM velocity and can also explicitly consider balancing. On the other hand, RL methods are applicable to improving the performance of controllers without using physical models and parameters. However, as described above, their drawback is that RL is generally not applicable to high-dimensional systems due to the curse of dimensionality [9]. Therefore, we cannot expect improvement of controllers for humanoid robots with many d.o.f. within a realistic amount of time by naive application of RL.

In our approach, we combine a model-based CoM controller and an RL method to acquire dynamic whole-body movements on humanoid robots. (i) To cope with high dimensionality, we use a model-based CoM controller as a basic CoM controller that derives joint angular velocities from the desired CoM velocity; the balancing issue can also be considered in the controller [5-8]. (ii) The RL method is used to acquire a controller that generates the desired CoM velocity based on the current state. While RL methods generally do not require the physical model and parameters of the robot, the learning system needs to be a Markov decision process for most standard approaches based on value functions, such as Q-learning. Since we only consider the CoM position and time as state variables, the learning system becomes a partially observable Markov decision process (POMDP). Therefore, we use a policy-gradient method that can be applied to POMDPs. We demonstrate that our proposed approach efficiently acquires appropriate policies for a ball-punching task on a numerically simulated humanoid robot model of Fujitsu's Hoap-2. The acquired whole-body punching movement is demonstrated on a real hardware system as well as in simulations.

The paper is organized as follows. In Section 2, we briefly describe our approach for learning desired whole-body movements on a humanoid robot by focusing on its CoM. In Section 3, we briefly introduce the ZMP and its equation, then describe how the CoM can be controlled by manipulating the ZMP based on the ZMP equations. In Section 4, we present the policy-gradient method for learning an appropriate control policy for the desired full-body movements of a humanoid robot. In Section 5, we present a concrete example of the learning system in a ball-punching task with a humanoid robot. In Section 6, we describe the results achieved by applying the proposed method in numerical simulations. In Section 7, we demonstrate the acquired whole-body punching movement on a real robot.

2. Learning a Desired Whole-Body Movement on a Humanoid Robot: Focused on Learning CoM Movements

In this section, we briefly describe our approach for learning desired whole-body movements on a humanoid robot. The approach focuses on learning a CoM movement suitable for achieving the task on a humanoid robot. Figure 1 shows a rough sketch of our proposed approach. $x$ is a state variable carrying (partial) information about the robot and $a$ is the control output for learning. In this paper, we focus on learning the CoM movement, i.e., the control output is the desired velocity of the CoM, $a = \dot{r}^{\mathrm{CoM}}_{\mathrm{ref}}$. $c(x)$ is a reward function that evaluates each control decision. $\pi(x, a; w)$ is a control policy whose parameter $w$ is learned to maximize the accumulated reward. $\dot{q}$ is the vector of desired joint angular velocities. As long as both the CoM and the ZMP are inside the support polygon during CoM control, the robot can be prevented from falling over [18].

Figure 1. Learning system to acquire desired whole-body movements on a humanoid robot. $x$ is the state variable and $a$ is the control output for learning. We focus on learning the CoM movement, i.e., the control output is the desired velocity of the CoM, $a = \dot{r}^{\mathrm{CoM}}_{\mathrm{ref}}$. $c(x)$ is a reward function that evaluates each control decision. $\pi(x, a; w)$ is a control policy whose parameter $w$ is learned to maximize the accumulated reward. $\dot{q}$ is the vector of desired joint angular velocities.

This characteristic makes the under-actuated robot system behave approximately like a fully actuated system, which simplifies the motor learning task and increases its tractability. Thus, our approach incorporates such a ZMP manipulation method; i.e., policy-gradient-type reinforcement learning is applied to learn a CoM controller on top of the ZMP manipulation method. The acquired controller is expected to implicitly account for the dynamics of the robot, e.g., friction and inertia, and for information about the task, neither of which is explicitly considered in the model-based CoM controller. A CoM Jacobian-based redundancy resolution technique is utilized to compute the angular velocities of all joints so as to achieve a whole-body movement consistent with a desired CoM movement [7]. We use a manually tuned weighting matrix in the weighted pseudo-inverse computation to achieve a desirable joint configuration and avoid joint limits. Thus, our learning system is composed of two components, introduced in the following two sections: (i) CoM control based on the ZMP and distribution of the CoM movement into joint space, and (ii) RL for the CoM movement. A minimal sketch of how these two components interact is given below.
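To make the structure of Fig. 1 concrete, the following Python sketch runs one learning episode of the two-component loop. It is illustrative only: `robot`, `policy` and `com_controller` are hypothetical objects standing in for the simulator, the stochastic policy $\pi(x, a; w)$ of Section 4 and the model-based CoM controller of Section 3; none of these names or interfaces comes from the paper.

```python
import numpy as np

def learning_episode(robot, policy, com_controller, T=4.0, dt=0.05):
    """One episode of the two-component loop in Fig. 1 (hypothetical API).

    robot          -- simulator exposing observe()/apply()/reward()
    policy         -- stochastic policy pi(x, a; w) with sample()/update()
    com_controller -- model-based CoM controller of Section 3
    """
    total_reward = 0.0
    x = robot.observe()                    # partial state, e.g. (r_x^CoM, t)
    for _ in np.arange(0.0, T, dt):
        a = policy.sample(x)               # desired CoM velocity (action)
        qdot = com_controller.joint_velocities(a)  # Section 3 mapping
        robot.apply(qdot, dt)              # drive the robot for one interval
        x_next = robot.observe()
        c = robot.reward()                 # reward c(x), Section 5.3
        policy.update(x, a, c, x_next)     # policy-gradient step, Section 4
        total_reward += c
        x = x_next
    return total_reward
```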

3. CoM Controller Based on a ZMP Equation

This section describes a method for controlling the CoM based on ZMP manipulation [5, 7]. ZMP compensation control is a method that compensates the current ZMP toward an objective point [5]. A PID controller is used to calculate the objective ZMP based on an analogy between the inverted pendulum and the CoM-ZMP dynamics of a mass-concentrated humanoid robot model [7]. By integrating the two components, presented in Sections 3.1 and 3.2, the CoM can be controlled by manipulating the ZMP. The CoM Jacobian-based redundancy resolution technique, described in Section 3.3, is utilized to calculate motion in the full joint space consistent with the desired CoM movement.

3.1. ZMP Compensation Control

According to Nagasaka [5], assuming a mass-concentrated model, the relationship between the moment acting on the ZMP and the objective ZMP is given as:

$n^{\mathrm{ZMP}} = n^{\mathrm{OZMP}} + (r^{\mathrm{OZMP}} - r^{\mathrm{ZMP}}) \times f^{\mathrm{CoM}}$,  (1)
$n^{\mathrm{OZMP}} = (r^{\mathrm{CoM}} - r^{\mathrm{OZMP}}) \times f^{\mathrm{CoM}}$,  (2)

where $n^{\mathrm{ZMP}} \in \mathbb{R}^3$ and $n^{\mathrm{OZMP}} \in \mathbb{R}^3$ are the ZMP and objective ZMP moments, respectively, $r^{\mathrm{ZMP}} \in \mathbb{R}^3$ and $r^{\mathrm{OZMP}} \in \mathbb{R}^3$ are the position vectors of the ZMP and objective ZMP from the origin, respectively, and $f^{\mathrm{CoM}} \in \mathbb{R}^3$ is the force acting on the CoM. From the definition of the ZMP, which is a point such that the horizontal components of the moment acting at that point are zero, we can derive a control law that compensates the ZMP toward the objective ZMP by kinematically manipulating the CoM as follows:

$\Delta r^{\mathrm{CoM}}_{x,i+1} = K (r^{\mathrm{ZMP}}_x - r^{\mathrm{OZMP}}_x) + (r^{\mathrm{CoM}}_{x,i} - r^{\mathrm{CoM}}_{x,i-1}) + K \, \Delta r^{\mathrm{CoM}}_{x,i}$,  (3)
$\Delta r^{\mathrm{CoM}}_{y,i+1} = K (r^{\mathrm{ZMP}}_y - r^{\mathrm{OZMP}}_y) + (r^{\mathrm{CoM}}_{y,i} - r^{\mathrm{CoM}}_{y,i-1}) + K \, \Delta r^{\mathrm{CoM}}_{y,i}$,  (4)

where $K = f^{\mathrm{CoM}}_{z,i} \, \Delta t^2 / (r^{\mathrm{CoM}}_{z,i} - r^{\mathrm{OZMP}}_{z,i})$, $\Delta t$ is the discrete time step and $\Delta r$ is the deviation of position during $\Delta t$. The desired velocity of the CoM can be straightforwardly approximated as

$\dot{r}^{\mathrm{CoM}}_x \approx \Delta r^{\mathrm{CoM}}_{x,i+1} / \Delta t$,  (5)
$\dot{r}^{\mathrm{CoM}}_y \approx \Delta r^{\mathrm{CoM}}_{y,i+1} / \Delta t$.  (6)

Under such control, the robot can be regarded as an inverted pendulum with its supporting point at the objective ZMP.
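As an illustration of eqs (3)-(6), the following Python sketch computes one discrete compensation step. The sign conventions follow the reconstruction of (3)-(4) above and should be verified against Ref. [5]; the variable names are ours, not the paper's.

```python
import numpy as np

def zmp_compensation_step(r_zmp, r_ozmp, r_com, r_com_prev, d_com,
                          f_com_z, dt):
    """One step of the ZMP compensation law, eqs (3)-(6).

    r_zmp, r_ozmp, r_com, r_com_prev -- 3-vectors: current ZMP, objective
    ZMP, current and previous CoM positions; d_com -- previous CoM
    deviation Delta r; f_com_z -- vertical CoM force; dt -- time step.
    """
    K = f_com_z * dt**2 / (r_com[2] - r_ozmp[2])   # gain of eqs (3)-(4)
    d_com_next = np.zeros(3)
    for i in (0, 1):                               # x and y components
        d_com_next[i] = (K * (r_zmp[i] - r_ozmp[i])
                         + (r_com[i] - r_com_prev[i])
                         + K * d_com[i])
    rdot_com_des = d_com_next / dt                 # eqs (5)-(6)
    return d_com_next, rdot_com_des
```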

3.2. Calculating the Reference ZMP Based on an Inverted Pendulum Model

As mentioned above, since the horizontal components of the moment on the ZMP are zero, the mass-concentrated model of the humanoid robot can be regarded as an inverted pendulum. Based on this analogy, we apply a simple PID controller to control the CoM by manipulating the ZMP, as described in Ref. [7]. The dynamics of the mass-concentrated model, approximately linearized around an equilibrium point, are given as:

$\ddot{r}^{\mathrm{CoM}}_x = \omega^2 (r^{\mathrm{CoM}}_x - r^{\mathrm{ZMP}}_x)$,  (7)
$\ddot{r}^{\mathrm{CoM}}_y = \omega^2 (r^{\mathrm{CoM}}_y - r^{\mathrm{ZMP}}_y)$,  (8)

where $\omega = \sqrt{(\ddot{r}^{\mathrm{CoM}}_z + g)/(r^{\mathrm{CoM}}_z - r^{\mathrm{ZMP}}_z)}$. The above dynamics equations represent the horizontal movement of the CoM. Due to the symmetry of the x and y components, we can focus on the x component in the following derivation without loss of generality. By differentiating (7) and ignoring the change in $\omega$, i.e., assuming that $\ddot{r}^{\mathrm{CoM}}_z = 0$ and that $r^{\mathrm{CoM}}_z$ and $r^{\mathrm{ZMP}}_z$ are constant, the following equation can be derived:

$\dddot{r}^{\mathrm{CoM}}_x = \omega^2 (\dot{r}^{\mathrm{CoM}}_x - \dot{r}^{\mathrm{ZMP}}_x)$.  (9)

To control the CoM $r^{\mathrm{CoM}}_x$ with reference $r^{\mathrm{CoM}}_{x,\mathrm{ref}}$ as a target, we can apply the following controller:

$\dot{r}^{\mathrm{ZMP}}_x(t) = K_P (\dot{r}^{\mathrm{CoM}}_{x,\mathrm{ref}} - \dot{r}^{\mathrm{CoM}}_x) + K_I \int (\dot{r}^{\mathrm{CoM}}_{x,\mathrm{ref}} - \dot{r}^{\mathrm{CoM}}_x) \, \mathrm{d}t + K_D (r^{\mathrm{CoM}}_{x,\mathrm{ref}} - r^{\mathrm{CoM}}_x)$,  (10)
$\dot{r}^{\mathrm{CoM}}_{x,\mathrm{ref}} = K_C (r^{\mathrm{CoM}}_{x,\mathrm{ref}} - r^{\mathrm{CoM}}_x)$.  (11)

$K_P$, $K_I$, $K_D$ and $K_C$ are gains. By the final-value theorem, it can be proven that $r^{\mathrm{CoM}}_x$ converges to $r^{\mathrm{CoM}}_{x,\mathrm{ref}}$ with appropriate settings of the gains. By integrating the two components presented in this section, the CoM can be controlled by manipulating the ZMP. In the next section, we describe a CoM Jacobian-based redundancy resolution technique to achieve whole-body movement consistent with the desired CoM movement.
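The controller (10)-(11) can be written compactly in discrete time. The sketch below is a minimal single-axis Python implementation assuming fixed gains and an explicitly integrated error state; the class structure and names are ours.

```python
class ComPidController:
    """CoM control through ZMP manipulation, eqs (10)-(11), x-axis only."""

    def __init__(self, K_P, K_I, K_D, K_C, dt):
        self.K_P, self.K_I, self.K_D, self.K_C = K_P, K_I, K_D, K_C
        self.dt = dt
        self.integral = 0.0   # running integral of the CoM velocity error

    def step(self, r_com_ref, r_com, rdot_com):
        # eq (11): desired CoM velocity from the CoM position error
        rdot_com_ref = self.K_C * (r_com_ref - r_com)
        err = rdot_com_ref - rdot_com
        self.integral += err * self.dt
        # eq (10): rate of the reference ZMP
        rdot_zmp = (self.K_P * err
                    + self.K_I * self.integral
                    + self.K_D * (r_com_ref - r_com))
        return rdot_zmp, rdot_com_ref
```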

3.3. Distributing the CoM Movement Into Joint Space

In this section, we present a CoM Jacobian-based redundancy resolution technique to achieve whole-body movement consistent with the desired CoM movement [7]. We also present the CoM controller used in our framework, which is based on the CoM Jacobian.

3.3.1. Distributing CoM Movements Through the CoM Jacobian

Sugihara et al. [7] utilized a calculation method for the CoM Jacobian of legged systems that was originally proposed by Boulic et al. [19]. The CoM Jacobian relates the CoM velocity to the angular velocities of all joints as:

$\dot{r}^{\mathrm{CoM}} = J_C(q) \, \dot{q}$,  (12)

where $J_C(q) \in \mathbb{R}^{3 \times n}$ is the CoM Jacobian and $n$ is the number of d.o.f. of the robot. Using the CoM Jacobian and a weighted pseudo-inverse, we can distribute the CoM velocity to the angular velocities of all the joints by weighted sum-squared minimization over the joint angular velocities as follows:

$\dot{q} = J_C^{+} \, \dot{r}^{\mathrm{CoM}} + (I - J_C^{+} J_C) \, k$,  (13)

where

$J_C^{+} = W^{-1} J_C^{T} (J_C W^{-1} J_C^{T})^{-1}$,  (14)

$W = \mathrm{diag}\{w_i\}$ $(i = 1, \ldots, n)$, and $k \in \mathbb{R}^n$ is an arbitrary vector. $I \in \mathbb{R}^{n \times n}$ is the identity matrix. The above redundancy resolution technique with a weighting matrix determines whole-body motion consistent with the desired CoM movements.

Figure 2. Definition of the variables.

3.3.2. CoM Jacobian-Based Redundancy Resolution in the Double-Support Case

We used the following mapping to control the CoM through all joints:

$\dot{q} = J^{+} \, \dot{r} + (I - J^{+} J) \, k$,  (15)

where $\dot{r} \in \mathbb{R}^6 = [\dot{r}_C - \dot{r}_{rl}, \ \dot{r}_C - \dot{r}_{ll}]^T$ and $J(q) \in \mathbb{R}^{6 \times n} = [J_C(q) - J_{rl}(q), \ J_C(q) - J_{ll}(q)]^T$, and $k \in \mathbb{R}^n$ is an arbitrary vector. $r_C$ is the position vector of the CoM from the base-link defined at the waist, and $r_{ll}$ and $r_{rl}$ are the position vectors of the left and right feet from the base-link, respectively. $\dot{r}$ and $J(q)$ are the corresponding velocity vector and the Jacobian of each $r$ defined above, respectively. The variables are defined in Fig. 2. The desired $\dot{r}$ to control the CoM based on the desired trajectory is given by (5), (6) and (10).
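Equations (13)-(15) amount to a weighted least-squares distribution with a null-space term. A minimal numpy sketch, assuming a generic task Jacobian (the CoM Jacobian of (13), or the stacked CoM/feet Jacobian of (15)); function names are ours:

```python
import numpy as np

def weighted_pinv(J, W):
    """Weighted pseudo-inverse, eq (14): J+ = W^-1 J^T (J W^-1 J^T)^-1."""
    W_inv = np.linalg.inv(W)
    return W_inv @ J.T @ np.linalg.inv(J @ W_inv @ J.T)

def resolve_joint_velocities(J, rdot, W, k):
    """Distribute a task velocity to the joints, eqs (13)/(15).

    J    -- (m, n) task Jacobian; rdot -- (m,) desired task velocity;
    W    -- (n, n) joint weighting matrix, diag{w_i};
    k    -- (n,) arbitrary vector projected into the null space of J.
    """
    J_plus = weighted_pinv(J, W)
    n = J.shape[1]
    return J_plus @ rdot + (np.eye(n) - J_plus @ J) @ k
```

The choice of W (see the concrete weights in Section 6.1) determines how strongly each joint participates in realizing the task.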

4. RL for CoM Movement

In this section, we present the RL method used in the proposed learning framework. For learning the CoM movement, we use a policy-gradient method, a kind of RL method that maximizes the average reward with respect to the parameters of the action rule, known as the policy [11, 20, 21]. Compared with most standard value function-based RL methods, such a method has features particularly suited to robotic applications. First, the policy-gradient method is applicable to POMDPs [22]. Considering all possible states of the robot is almost impossible, because even with a complete set of sensors there will be a certain degree of noise; it is also possible to consider a partial set of states as input to an RL system. Second, the policy-gradient method is a stochastic gradient-descent scheme; the policy can therefore be improved with every update. In this section, we briefly describe a framework for RL with the policy-gradient method.

4.1. RL With a Policy-Gradient Method

Assuming a Markov decision process, the average reward, the discounted cumulative reward and the value functions are defined as:

$\eta(\theta) = \lim_{T \to \infty} E\Big[\frac{1}{T} \sum_{t=0}^{T} c(x_t)\Big]$,  (16)
$\eta_\beta(\theta) = \lim_{T \to \infty} E\Big[\frac{1}{T} \sum_{t=0}^{T} \beta^t c(x_t)\Big]$,  (17)
$V^{\pi}_{\beta}(x) = E\Big[\sum_{k=0}^{\infty} \beta^k c_{t+k+1} \,\Big|\, x_t = x\Big]$,  (18)
$Q^{\pi}_{\beta}(x, a) = E\Big[\sum_{k=0}^{\infty} \beta^k c_{t+k+1} \,\Big|\, x_t = x, a_t = a\Big]$,  (19)

where $x \in S$ is the state and $c(x): S \to \mathbb{R}$ is the immediate reward. $\eta(\theta)$ is the average reward and $\eta_\beta(\theta)$ is the discounted cumulative reward. $V^{\pi}_{\beta}(x)$ and $Q^{\pi}_{\beta}(x, a)$ are the state-value and action-value functions, respectively [9]. $x$ is the state, $a$ is the action and $\theta$ is the parameter of the stochastic policy. $\beta$ is a discounting factor.

The goal of RL is to maximize the average reward. If we calculate the gradient of $\eta(\theta)$ with respect to the policy parameters $\theta$, we can search for a locally optimal policy in the policy parameter space by updating the parameters as $\theta \leftarrow \theta + \alpha \nabla \eta(\theta)$, where $\nabla \eta(\theta)$ is the gradient of $\eta(\theta)$ with respect to $\theta$. Various derivations and algorithms have been proposed to estimate this gradient based on sampling through interaction with the environment. According to Kimura and Kobayashi [23], the gradient is given by:

$\nabla \eta = (1 - \beta) \nabla \eta_\beta$  (20)
$= (1 - \beta) \int\!\!\int d(x) \, \pi(a, x) \Big[\nabla \log d(x) + \frac{1}{1 - \beta} \nabla \log \pi(a, x)\Big] Q^{\pi}_{\beta}(x, a) \, \mathrm{d}a \, \mathrm{d}x$  (21)
$= \int\!\!\int d(x) \, \pi(a, x) \big[(1 - \beta) \nabla \log d(x) + \nabla \log \pi(a, x)\big] \big\{Q^{\pi}_{\beta}(x, a) - V^{\pi}_{\beta}(x)\big\} \, \mathrm{d}a \, \mathrm{d}x$  (22)
$= \lim_{T \to \infty, \beta \to 1} \frac{1}{T} \sum_{t=0}^{T} \nabla \log \pi(a_t, x_t) \sum_{s=t}^{T} \beta^{s-t} \delta(x_s, a_s) = \lim_{T \to \infty, \beta \to 1} \frac{1}{T} \sum_{t=0}^{T} \delta(x_t, a_t) \sum_{s=0}^{t} \beta^{t-s} \nabla \log \pi(a_s, x_s)$.  (23)

Here, $\pi(x, a; \theta) = P(a \mid x; \theta)$ is a stochastic policy that maps state $x$ to action $a$ stochastically, and $\nabla \pi(x, a; \theta)$ denotes the gradient of $\pi(x, a; \theta)$ with respect to $\theta$.

$d(x)$ is the stationary distribution of $x$. $\delta(x, a)$ is the TD error, defined as

$\delta(x_t, a_t) = c(x_t) + \beta \int p(x_{t+1} \mid x_t, a_t) \, V^{\pi}_{\beta}(x_{t+1}) \, \mathrm{d}x_{t+1} - V^{\pi}_{\beta}(x_t)$.

Equation (20) is presented in Ref. [24] as Theorem 1 and (21) is derived in Ref. [25]. The derivation of (22) relies on $\int \nabla \pi(x, a) \, V^{\pi}_{\beta}(x) \, \mathrm{d}a = 0$. If we neglect $V^{\pi}_{\beta}(x)$, the algorithm is identical to the GPOMDP algorithm developed in Ref. [21]. As pointed out in Ref. [21], the discounting factor $\beta$ controls a bias-variance trade-off in the policy gradient estimated by sampling.

In practice, we update the policy parameters by the rule $\theta_{t+1} = \theta_t + \alpha D_t \delta(x_t, a_t)$, where $D_t$ is updated by $D_t = \beta D_{t-1} + \nabla \log \pi(x_t, a_t)$. However, to compute the TD error $\delta(x_t, a_t)$, we need the state-value function $V^{\pi}_{\beta}(x)$. In this paper, we simultaneously approximate it by a function approximator $\hat{V}^{\pi}_{\beta}(x; w)$ with parameter $w$, using a simple TD learning method: $w \leftarrow w + \alpha \delta_t \, \partial \hat{V}^{\pi}_{\beta}(x; w)/\partial w$. The TD error $\delta(x_t, a_t)$ is then approximately calculated as $\delta(x_t, a_t) = c(x_t) + \beta \hat{V}^{\pi}_{\beta}(x_{t+1}) - \hat{V}^{\pi}_{\beta}(x_t)$. Note that $\beta$ must satisfy $0 \leq \beta < 1$ to prevent the state-value function from diverging.
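The update rules above form a simple actor-critic. A minimal Python sketch, assuming a linear value function $\hat{V}(x) = w^T \phi(x)$ and a differentiable log-policy (both as in Section 5.2); the hyperparameter values and class interface are ours:

```python
import numpy as np

class PolicyGradientActorCritic:
    """Eligibility-trace policy-gradient update of Section 4.1 (sketch)."""

    def __init__(self, theta, w, phi, grad_log_pi, alpha=0.01, beta=0.95):
        self.theta = theta             # policy parameters
        self.w = w                     # value-function parameters
        self.phi = phi                 # basis functions phi(x)
        self.grad_log_pi = grad_log_pi # d log pi(x, a; theta) / d theta
        self.alpha, self.beta = alpha, beta
        self.D = np.zeros_like(theta)  # eligibility trace D_t

    def update(self, x, a, c, x_next):
        # TD error: delta = c(x_t) + beta*V(x_{t+1}) - V(x_t)
        v, v_next = self.w @ self.phi(x), self.w @ self.phi(x_next)
        delta = c + self.beta * v_next - v
        # trace: D_t = beta*D_{t-1} + grad log pi(x_t, a_t)
        self.D = self.beta * self.D + self.grad_log_pi(x, a, self.theta)
        # actor: theta_{t+1} = theta_t + alpha * D_t * delta
        self.theta = self.theta + self.alpha * delta * self.D
        # critic: TD learning of the value-function parameters
        self.w = self.w + self.alpha * delta * self.phi(x)
        return delta
```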

5. Application to Learning of a Dynamic Task: Ball-Punching

In the previous sections, we presented our learning approach, which focuses on CoM movement to achieve whole-body movement. We applied the proposed approach to learning a whole-body movement on a humanoid robot and selected a ball-punching task. The goal was to strengthen the punch through a learning process focused on CoM movement. In this section, the details of the learning settings are described; the numerical simulations and the experimental results in a real environment are presented in the following two sections.

5.1. Learning CoM Movement for Whole-Body Dynamic Punching

In this paper, we focus on controlling the x-axis component of the CoM, i.e., the policy output is the target velocity of the x-axis component of the CoM, $\dot{r}^{\mathrm{CoM}}_{x,\mathrm{ref}}$. Thus, the action of the control policy for learning is defined as $a = \dot{r}^{\mathrm{CoM}}_{x,\mathrm{ref}}$. To simplify the task, we constrained the desired CoM to a one-dimensional movement. The policy output is distributed to the x- and y-axis components of the CoM as $\dot{r}^{\mathrm{CoM}}_x = \sin(\psi) \, \dot{r}^{\mathrm{CoM}}_{x,\mathrm{ref}}$ and $\dot{r}^{\mathrm{CoM}}_y = \cos(\psi) \, \dot{r}^{\mathrm{CoM}}_{x,\mathrm{ref}}$, where $\psi$ is the angle measured clockwise from the y-axis to the x-axis and $\psi = \pi/3$, as depicted in Fig. 3. This setting makes good use of the area of the support polygon, because a diagonal of the polygon is longer than its x- and y-axis extents.

The state space was simply defined as $x = (r^{\mathrm{CoM}}_x, t)$. Note that the state of the humanoid robot dynamics to which the learning is applied is not such a low-dimensional variable, even though the inverted pendulum-based controller simplifies it, as explained in Section 3. However, the position of the CoM remains one of the most dominant variables, and the time $t$ is also important for coordinating the timing of the pre-designed punching motion. Thus, with the above notion and the applicability of the policy-gradient method to such partially observable cases [21], we simply designed the state space for the above learning.

Figure 3. One-dimensional CoM movement controlled by a policy, shown by the grey line; the solid lines are the feet and the dashed line is the support polygon.

5.2. Gaussian Policy and Function Approximator for the State-Value Function

We implemented the following Gaussian policy as a stochastic policy for controlling the CoM:

$\pi(x, a; \theta) = \frac{1}{\sqrt{2\pi}\sigma} \exp\Big(-\frac{(a - \mu(x; \theta))^2}{2\sigma^2}\Big)$,  (24)

where $\mu(x; \theta) = \theta^T \phi(x)$, $x$ is the state and $a$ is the action. In this study, a radial basis function network is used as the model of the feedback controller. Since it is almost impossible to design all the network parameters manually, the policy-gradient method is useful for optimizing them. We located Gaussian basis functions $\phi(x)$ on a grid with even intervals in each dimension of the observation space, as in Refs [10, 15]. The function approximator for the state-value function is modeled likewise as $\hat{V}^{\pi}_{\beta}(x) = w^T \phi(x)$. We allocated 100 (= 10 x 10) basis functions $\phi(x)$ in the state space ($-1.0 < r^{\mathrm{CoM}}_x < 0.0$, $0.5 < t < 4.0$) to represent the mean of the policy, $\mu(x)$.
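A minimal Python sketch of the Gaussian policy (24) over grid-allocated radial basis functions. The basis width and the exploration noise $\sigma$ below are illustrative assumptions; the paper does not report these values.

```python
import numpy as np

class GaussianRbfPolicy:
    """Gaussian policy of eq (24) with an RBF mean mu(x) = theta^T phi(x)."""

    def __init__(self, centers, width, sigma):
        self.centers = centers        # (N, d) grid of basis centers
        self.width = width            # common basis width (assumed)
        self.sigma = sigma            # exploration noise (assumed)
        self.theta = np.zeros(len(centers))

    def phi(self, x):
        d2 = np.sum((self.centers - np.asarray(x)) ** 2, axis=1)
        return np.exp(-d2 / (2.0 * self.width ** 2))

    def sample(self, x):
        # a ~ N(mu(x; theta), sigma^2)
        return np.random.normal(self.theta @ self.phi(x), self.sigma)

    def grad_log_pi(self, x, a, theta=None):
        # d log pi / d theta = (a - mu(x)) phi(x) / sigma^2
        th = self.theta if theta is None else theta
        ph = self.phi(x)
        return (a - th @ ph) * ph / self.sigma ** 2

# e.g., a 10 x 10 grid over (-1.0 < r_x^CoM < 0.0, 0.5 < t < 4.0):
rs, ts = np.meshgrid(np.linspace(-1.0, 0.0, 10), np.linspace(0.5, 4.0, 10))
policy = GaussianRbfPolicy(np.column_stack([rs.ravel(), ts.ravel()]),
                           width=0.2, sigma=0.05)
```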

5.3. Reward Function

The purpose of the ball-punching task is to strengthen the punch as much as possible. We designed the reward function based on this objective as:

$c = -(t - t_b) \, v_b^T v_b$,  (25)

because the velocity $v_b$ of the punched ball is proportional to its momentum. The term associated with time $t$ is incorporated in the reward function to avoid local-minimum motions in which the robot falls forward and ignores the timing of the punch. $t_b$ is a bias that distributes the reward between positive and negative values. A negative reward ($-5$) is given when both feet leave the ground, to avoid acquiring a punching motion with jumping.

5.4. Punching Motion Projected onto the Null Space of the CoM Controller

A punching motion was straightforwardly implemented by tracking a target trajectory in task space. In this study, we achieved tracking control in the null space of the CoM controller by introducing the following vector as the arbitrary vector in (15):

$k = \tilde{J}_{ra}^{+} (\dot{r}_{ra} - J_{ra} J^{+} \dot{r})$,  (26)

where $J_{ra} \in \mathbb{R}^{3 \times n}$ is the Jacobian relating the right-hand velocity in task space $\dot{r}_{ra}$ to $\dot{q}$ as $\dot{r}_{ra} = J_{ra} \dot{q}$, and $\tilde{J}_{ra} = J_{ra}(I - J^{+} J)$. Introducing this vector yields target tracking with the right hand in the null space of the CoM controller [26].
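A numpy sketch of the projection (26). For brevity it uses plain (unweighted) pseudo-inverses via `np.linalg.pinv`, whereas the paper uses weighted ones; variable names are ours.

```python
import numpy as np

def null_space_arm_vector(J, rdot, J_ra, rdot_ra):
    """Arbitrary vector k of eq (26) realizing right-hand tracking.

    J, rdot       -- stacked CoM/feet Jacobian and task velocity, eq (15);
    J_ra, rdot_ra -- right-hand Jacobian and its target task velocity.
    """
    J_plus = np.linalg.pinv(J)
    n = J.shape[1]
    # right-hand Jacobian restricted to the null space of the CoM task
    J_ra_tilde = J_ra @ (np.eye(n) - J_plus @ J)
    # eq (26): k = J~+_ra (rdot_ra - J_ra J+ rdot)
    return np.linalg.pinv(J_ra_tilde) @ (rdot_ra - J_ra @ J_plus @ rdot)
```

Substituting this k into (15) tracks the right-hand target to the extent that it does not conflict with the higher-priority CoM/feet task.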

6. Numerical Simulations

6.1. Settings and Results

We applied the proposed approach to the acquisition of a strong punching movement on Fujitsu's Hoap-2 humanoid robot (see Fig. 4) in numerical simulation. The ball was modeled as a simple point mass (0.1 kg), and the contact between the robot and the ball was simulated by a spring-damper model. A spring-damper model was also used to model the floor. The integration time step for the robot was 0.2 ms and the time interval for learning was 50 ms.

For the CoM and right-arm controllers, a weighting matrix suitable for this task must be set in (14) to appropriately achieve whole-body motion. To avoid using the d.o.f. of the right arm (which are used for the punching motion) for the CoM controller, the weights of the right-arm joints were set smaller (0.01) than those of the other joints (1.0) in the CoM controller described by (13). For the right-arm controller described by (26), to achieve a punching motion mainly using the right arm, we set the weights of the body joint larger (3.0) than those of the other joints (1.0). The target trajectory for the right-arm controller to achieve a punching motion was designed as $r_{ra_x,\mathrm{ref}} = p \sin(2\pi f (t - t_a)) + q$ for $t \geq t_a$, and we set the parameters, considering Hoap-2's physical model, to amplitude $p = 0.03$ m, bias $q = 0.21$ m, frequency $f = 1.5$ Hz and bias $t_a = 3.5$ s. While $0 < t < 3.5$ s, $r_{ra_x,\mathrm{ref}}$ is held constant.

Figure 4. Fujitsu humanoid robot Hoap-2 (21 d.o.f.): 6 d.o.f. for each leg, 4 d.o.f. for each arm and 1 d.o.f. for the waist. Total weight is about 7 kg and height is about 0.4 m.

Figure 5. Acquired reward at each episode. The learning curve was averaged over five experiments and smoothed by taking a 50-episode moving average.

Figure 6. Acquired control policy for the x-axis component of the CoM.

Figure 5 shows the reward at each episode obtained with the policy-gradient method. The curve shows that a locally optimal punching motion with maximal reward was acquired after around 2000 episodes. Figure 6 shows the acquired policy for controlling the x-axis component of the CoM, and Fig. 7 presents the whole-body punching motion generated by the control policy. While keeping the CoM at the initial point, the punching motion produced a ball momentum of about … kg m/s. The acquired punching motion without any probabilistic factors produced an average ball momentum of about … kg m/s (standard deviation was 0.005), which means the ball momentum generated by the learned policy was about 2.3 times larger than the initial performance.

Figure 7. Acquired whole-body punching movement. Snapshots correspond to 0.0, 0.85, 1.40, 2.16 and 2.33 s, respectively. The grey bar on the foot denotes the ground reaction force.

Figure 8. CoM trajectories generated with the learned control policy from various initial CoM positions.

Note that the acquired control policy is not a simple trajectory. Figure 8 presents the x-axis CoM trajectories under the acquired control policy from various initial conditions. To achieve a strong punching motion, the x-axis CoM position must be about -0.02 m to guarantee that the right arm can kinematically reach the ball, and when the robot hits the ball, the CoM also requires a high velocity. From various initial conditions, the acquired policy tends to move the CoM backward, away from the ball, at the beginning. It then accelerates and propels the CoM forward, achieving a high velocity when its position is about -0.02 m, coordinated with the pre-designed right-arm movement for a strong punch. Thus, the acquired control policy is a complex feedback controller that achieves strong punching.

6.2. Robustness of Learning Against Modeling Error

As presented in the previous sections, our approach requires robot information such as the mass, length and CoM position of each link to calculate the position of the CoM and its Jacobian.

Even though perfectly accurate parameters would be desirable, our approach is robust to estimation errors in such parameters, because the control policy for the CoM is acquired through iterative interaction with the environment. To investigate this robustness, we applied the learning in simulations with the following settings: (i) the mass of the right arm's tip was over-estimated at double its true value, and (ii) the position of the body mass was biased by 0.01 m in the x-axis direction. In both cases, an appropriate control policy for the CoM was acquired, as in the normal settings. The resulting rewards with the acquired policies for (i) and (ii) after 2000 trials were 1.57 and 1.89, respectively, averaged over five experiments and smoothed by taking a 50-episode moving average. These results suggest robustness to modeling errors.

7. Experiments on a Real Hardware System

In this section, we implement the proposed controller on Hoap-2, a real humanoid robot. We implemented the CoM trajectories generated in simulation with the acquired control policy for the CoM. To show the effectiveness of the learned punching motion, we set a toy car in front of the robot as a punching target; the distance the toy car travels after being punched measures the effectiveness of the initial and learned punches. Figure 9 provides sequential snapshots of the car being hit. The upper and lower sequences are the initial and learned movements, respectively. The results suggest that the punching motion, i.e., the acquired cooperative whole-body movement, is effective even in a real environment.

Figure 9. Sequential snapshots of the punching motion with (a) the initial (car speed was 0.42 m/s) and (b) the learned (car speed was 0.71 m/s) control policies. Each picture corresponds to 0.0, 0.67, 1.67 and 2.0 s from the timing of impact. The car's movement after being punched shows that the learned punch had a significantly stronger impact.

8. Conclusions

This paper presented an approach for acquiring dynamic whole-body movements on humanoid robots, focused on learning a control policy for the CoM to produce dynamic movements for achieving tasks. We applied the framework to the learning of a dynamic ball-punching motion on a Hoap-2 model in numerical simulations. As a result, we demonstrated that dynamic punching motions can be acquired through learning with our approach, and the task was achieved with significantly fewer trials considering the original complexity of the task and robot. The acquired cooperative whole-body punching movement was also demonstrated on a real hardware platform. As future work, we wish to explore on-line learning in a real environment, because the proposed framework is also suitable for such situations.

References

1. K. Hirai, M. Hirose, Y. Haikawa and T. Takenaka, The development of Honda humanoid robot, in: Proc. IEEE Int. Conf. on Robotics and Automation, Leuven (1998).
2. Y. Kuroki, T. Ishida, J. Yamaguchi, M. Ujita and T. Doi, A small biped entertainment robot, in: Proc. IEEE-RAS Int. Conf. on Humanoid Robots, Tokyo (2001).
3. J. Morimoto, G. Endo, J. Nakanishi, S. Hyon, G. Cheng, D. Bentivegna and C. Atkeson, Modulation of simple sinusoidal patterns by a coupled oscillator model for biped walking, in: Proc. IEEE Int. Conf. on Robotics and Automation, Orlando, FL (2006).
4. S. Hyon and G. Cheng, Passivity-based whole-body motion control for humanoids: gravity compensation, balancing and walking, in: Proc. IEEE Int. Conf. on Intelligent Robots and Systems, Beijing (2006).
5. K. Nagasaka, The whole-body motion generation of humanoid robot using dynamics filter (in Japanese), PhD Thesis, University of Tokyo (2000).
6. S. Kagami, F. Kanehiro, Y. Tamiya, M. Inaba and H. Inoue, AutoBalancer: an online dynamic balance compensation scheme for humanoid robots, in: Algorithmic and Computational Robotics: New Directions, B. R. Donald, K. Lynch and D. Rus (Eds). A. K. Peters, Wellesley, MA (2001).
7. T. Sugihara and Y. Nakamura, Whole-body cooperative balancing of humanoid robot using COG Jacobian, in: Proc. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, Lausanne (2002).
8. S. Kajita, F. Kanehiro, K. Kaneko, K. Fujiwara, K. Harada, K. Yokoi and H. Hirukawa, Resolved momentum control: humanoid motion planning based on the linear and angular momentum, in: Proc. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, Las Vegas, NV (2003).
9. R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA (1998).
10. K. Doya, Reinforcement learning in continuous time and space, Neural Comput. 12 (2000).
11. H. Kimura, K. Miyazaki and S. Kobayashi, Reinforcement learning in POMDPs with function approximation, in: Proc. 14th Int. Conf. on Machine Learning, Nashville, TN (1997).

12. H. Kimura, T. Yamashita and S. Kobayashi, Reinforcement learning of walking behavior for a four-legged robot, in: Proc. IEEE Conf. on Decision and Control, Orlando, FL (2001).
13. J. Morimoto and K. Doya, Acquisition of stand-up behavior by a real robot using hierarchical reinforcement learning, Robotics Autonomous Systems 36 (2001).
14. R. Tedrake, T. W. Zhang and H. S. Seung, Stochastic policy gradient reinforcement learning on a simple 3D biped, in: Proc. IEEE Int. Conf. on Intelligent Robots and Systems, Sendai (2004).
15. T. Matsubara, J. Morimoto, J. Nakanishi, M. Sato and K. Doya, Learning sensory feedback to CPG for biped locomotion with policy gradient, in: Proc. IEEE Int. Conf. on Robotics and Automation, Barcelona (2005).
16. G. Endo, J. Morimoto, T. Matsubara, J. Nakanishi and G. Cheng, Learning CPG sensory feedback with policy gradient for biped locomotion for a full body humanoid, in: Proc. 12th Natl. Conf. on Artificial Intelligence, Pittsburgh, PA (2005).
17. J. Scholz and G. Schoner, The uncontrolled manifold concept: identifying control variables for a functional task, Exp. Brain Res. 126 (1999).
18. M. Vukobratović and B. Borovac, Zero-moment point - thirty five years of its life, Int. J. Humanoid Robotics 1 (2004).
19. R. Boulic, R. Mas and D. Thalmann, Inverse kinetics for center of mass position control and posture optimization, in: Proc. Eur. Workshop on Combined Real and Synthetic Image Processing for Broadcast and Video Production, Hamburg (1994).
20. R. J. Williams, Simple statistical gradient-following algorithms for connectionist reinforcement learning, Machine Learn. 8 (1992).
21. J. Baxter and P. L. Bartlett, Infinite-horizon policy-gradient estimation, J. Artif. Intell. Res. 15 (2001).
22. D. A. Aberdeen, Policy-gradient algorithms for partially observable Markov decision processes, PhD Thesis, Australian National University (2003).
23. H. Kimura and S. Kobayashi, An analysis of actor/critic algorithms using eligibility traces: reinforcement learning with imperfect value function, in: Proc. Int. Conf. on Machine Learning, Madison, WI (1998).
24. J. Baxter and P. L. Bartlett, Direct gradient-based reinforcement learning: I. Gradient estimation algorithms, Technical Report, Australian National University (1999).
25. R. S. Sutton, D. McAllester, S. Singh and Y. Mansour, Policy gradient methods for reinforcement learning with function approximation, Adv. Neural Information Proc. Syst. 12 (2000).
26. T. Yoshikawa, Foundations of Robotics: Analysis and Control. MIT Press, Cambridge, MA (1990).

About the Authors

Takamitsu Matsubara received the BE in Electrical and Electronic Systems Engineering from Osaka Prefecture University, Japan, in 2003, the ME in Information Science from the Nara Institute of Science and Technology, Nara, in 2005, and the PhD in Information Science from the Nara Institute of Science and Technology, Nara. From 2005 to 2007, he was a Research Fellow (DC1) of the Japan Society for the Promotion of Science. He is currently an Assistant Professor at the Nara Institute of Science and Technology and a Visiting Researcher at ATR Computational Neuroscience Laboratories, Kyoto. His research interests include reinforcement learning, machine learning and robotics.

Jun Morimoto is a Senior Researcher at ATR Computational Neuroscience Laboratories and with the Computational Brain Project, ICORP, JST. He received the PhD in Information Science from the Nara Institute of Science and Technology, Nara. He was a Research Assistant with the Kawato Dynamic Brain Project, ERATO, JST, from 1999 to 2001. From 2001 to 2002, he was a Postdoctoral Fellow at the Robotics Institute, Carnegie Mellon University, Pittsburgh, PA. He joined ATR in 2002 and subsequently joined JST, ICORP.

Jun Nakanishi received the BE and ME degrees, both in Mechanical Engineering, from Nagoya University, Nagoya, in 1995 and 1997, respectively, and the PhD degree in Engineering from Nagoya University. He also studied in the Department of Electrical Engineering and Computer Science at the University of Michigan, Ann Arbor, MI. He was a Research Associate at the Department of Micro System Engineering, Nagoya University, from 2000 to 2001, and a Presidential Postdoctoral Fellow at the Computer Science Department, University of Southern California, Los Angeles, CA, beginning in 2001. He then joined ATR Human Information Science Laboratories, Kyoto. He is currently a Researcher at ATR Computational Neuroscience Laboratories and with the Computational Brain Project, ICORP, Japan Science and Technology Agency. His research interests include motor learning and control in robotic systems. He received the IEEE ICRA 2002 Best Paper Award.

Sang-Ho Hyon received the MS degree in Mechanical Engineering from Waseda University in 1998 and the PhD degree in Control Engineering from the Tokyo Institute of Technology. He was a Research Associate and Assistant Professor at Tohoku University, where he developed various legged robots and their controllers, performing dynamic locomotion experiments such as jumping, running, walking and somersaulting. He is currently a Researcher at ATR Computational Neuroscience Laboratories, Japan. From 2005 to 2007, he was a Researcher at the JST International Cooperative Research Project, Computational Brain Project. He was a 1999 ICRA Best Paper Award Finalist. His primary research interests are legged locomotion, nonlinear oscillation and nonlinear control. He is a member of the RSJ and the IEEE Robotics and Automation Society.

Joshua G. Hale received the BA (Hons 1st) degree in Computation from the University of Oxford in 1997, the MS (Dist.) degree in Computer Science from the University of Edinburgh in 1998, the MA degree in Computation from the University of Oxford in 2002, and the PhD degree on biomimetic motion synthesis from the University of Glasgow. He has worked as a Research Engineer with the Hardware Compilation Group at the University of Oxford and as a Research Assistant at the Computer Vision and Graphics Laboratory at the University of Glasgow, and is currently employed as a Researcher at the Humanoid Robotics and Computational Neuroscience Laboratory at ATR in Japan. His research interests include dynamic simulation, humanoid robotics and robot skill acquisition, computer graphics and three-dimensional modelling, and human motion production and perception.

Gordon Cheng received the BS and MS degrees in Computer Science from the University of Wollongong, Wollongong, NSW, and the PhD degree in Systems Engineering from the Department of Systems Engineering, Australian National University, Acton, ACT. His current research interests include humanoid robotics, cognitive systems, biomimetics of human vision, computational neuroscience of vision, action understanding, human-robot interaction, active vision, mobile robot navigation and object-oriented software construction. He is on the Editorial Board of the International Journal of Humanoid Robotics. He is a Senior Member of the IEEE Robotics and Automation Society and the IEEE Computer Society.


More information

Cooperative Transportation by Humanoid Robots Learning to Correct Positioning

Cooperative Transportation by Humanoid Robots Learning to Correct Positioning Cooperative Transportation by Humanoid Robots Learning to Correct Positioning Yutaka Inoue, Takahiro Tohge, Hitoshi Iba Department of Frontier Informatics, Graduate School of Frontier Sciences, The University

More information

A Passive System Approach to Increase the Energy Efficiency in Walk Movements Based in a Realistic Simulation Environment

A Passive System Approach to Increase the Energy Efficiency in Walk Movements Based in a Realistic Simulation Environment A Passive System Approach to Increase the Energy Efficiency in Walk Movements Based in a Realistic Simulation Environment José L. Lima, José A. Gonçalves, Paulo G. Costa and A. Paulo Moreira Abstract This

More information

Autonomous Stair Climbing Algorithm for a Small Four-Tracked Robot

Autonomous Stair Climbing Algorithm for a Small Four-Tracked Robot Autonomous Stair Climbing Algorithm for a Small Four-Tracked Robot Quy-Hung Vu, Byeong-Sang Kim, Jae-Bok Song Korea University 1 Anam-dong, Seongbuk-gu, Seoul, Korea vuquyhungbk@yahoo.com, lovidia@korea.ac.kr,

More information

Team Description for Humanoid KidSize League of RoboCup Stephen McGill, Seung Joon Yi, Yida Zhang, Aditya Sreekumar, and Professor Dan Lee

Team Description for Humanoid KidSize League of RoboCup Stephen McGill, Seung Joon Yi, Yida Zhang, Aditya Sreekumar, and Professor Dan Lee Team DARwIn Team Description for Humanoid KidSize League of RoboCup 2013 Stephen McGill, Seung Joon Yi, Yida Zhang, Aditya Sreekumar, and Professor Dan Lee GRASP Lab School of Engineering and Applied Science,

More information

Shuguang Huang, Ph.D Research Assistant Professor Department of Mechanical Engineering Marquette University Milwaukee, WI

Shuguang Huang, Ph.D Research Assistant Professor Department of Mechanical Engineering Marquette University Milwaukee, WI Shuguang Huang, Ph.D Research Assistant Professor Department of Mechanical Engineering Marquette University Milwaukee, WI 53201 huangs@marquette.edu RESEARCH INTEREST: Dynamic systems. Analysis and physical

More information

Stabilize humanoid robot teleoperated by a RGB-D sensor

Stabilize humanoid robot teleoperated by a RGB-D sensor Stabilize humanoid robot teleoperated by a RGB-D sensor Andrea Bisson, Andrea Busatto, Stefano Michieletto, and Emanuele Menegatti Intelligent Autonomous Systems Lab (IAS-Lab) Department of Information

More information

Swarm Intelligence W7: Application of Machine- Learning Techniques to Automatic Control Design and Optimization

Swarm Intelligence W7: Application of Machine- Learning Techniques to Automatic Control Design and Optimization Swarm Intelligence W7: Application of Machine- Learning Techniques to Automatic Control Design and Optimization Learning to avoid obstacles Outline Problem encoding using GA and ANN Floreano and Mondada

More information

Active Stabilization of a Humanoid Robot for Impact Motions with Unknown Reaction Forces

Active Stabilization of a Humanoid Robot for Impact Motions with Unknown Reaction Forces 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems October 7-12, 2012. Vilamoura, Algarve, Portugal Active Stabilization of a Humanoid Robot for Impact Motions with Unknown Reaction

More information

Reinforcement Learning for CPS Safety Engineering. Sam Green, Çetin Kaya Koç, Jieliang Luo University of California, Santa Barbara

Reinforcement Learning for CPS Safety Engineering. Sam Green, Çetin Kaya Koç, Jieliang Luo University of California, Santa Barbara Reinforcement Learning for CPS Safety Engineering Sam Green, Çetin Kaya Koç, Jieliang Luo University of California, Santa Barbara Motivations Safety-critical duties desired by CPS? Autonomous vehicle control:

More information

CB: a humanoid research platform for exploring neuroscience

CB: a humanoid research platform for exploring neuroscience Advanced Robotics, Vol. 21, No. 10, pp. 1097 1114 (2007) VSP and Robotics Society of Japan 2007. Also available online - www.brill.nl/ar Full paper CB: a humanoid research platform for exploring neuroscience

More information

A Compact Model for the Compliant Humanoid Robot COMAN

A Compact Model for the Compliant Humanoid Robot COMAN The Fourth IEEE RAS/EMBS International Conference on Biomedical Robotics and Biomechatronics Roma, Italy. June 24-27, 212 A Compact for the Compliant Humanoid Robot COMAN Luca Colasanto, Nikos G. Tsagarakis,

More information

Learning and Using Models of Kicking Motions for Legged Robots

Learning and Using Models of Kicking Motions for Legged Robots Learning and Using Models of Kicking Motions for Legged Robots Sonia Chernova and Manuela Veloso Computer Science Department Carnegie Mellon University Pittsburgh, PA 15213 {soniac, mmv}@cs.cmu.edu Abstract

More information

Optimal Control System Design

Optimal Control System Design Chapter 6 Optimal Control System Design 6.1 INTRODUCTION The active AFO consists of sensor unit, control system and an actuator. While designing the control system for an AFO, a trade-off between the transient

More information

CONTROL IMPROVEMENT OF UNDER-DAMPED SYSTEMS AND STRUCTURES BY INPUT SHAPING

CONTROL IMPROVEMENT OF UNDER-DAMPED SYSTEMS AND STRUCTURES BY INPUT SHAPING CONTROL IMPROVEMENT OF UNDER-DAMPED SYSTEMS AND STRUCTURES BY INPUT SHAPING Igor Arolovich a, Grigory Agranovich b Ariel University of Samaria a igor.arolovich@outlook.com, b agr@ariel.ac.il Abstract -

More information

Introduction to Robotics

Introduction to Robotics Jianwei Zhang zhang@informatik.uni-hamburg.de Universität Hamburg Fakultät für Mathematik, Informatik und Naturwissenschaften Technische Aspekte Multimodaler Systeme 14. June 2013 J. Zhang 1 Robot Control

More information

IN MOST human robot coordination systems that have

IN MOST human robot coordination systems that have IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, VOL. 54, NO. 2, APRIL 2007 699 Dance Step Estimation Method Based on HMM for Dance Partner Robot Takahiro Takeda, Student Member, IEEE, Yasuhisa Hirata, Member,

More information

DATA ACQUISITION FOR STOCHASTIC LOCALIZATION OF WIRELESS MOBILE CLIENT IN MULTISTORY BUILDING

DATA ACQUISITION FOR STOCHASTIC LOCALIZATION OF WIRELESS MOBILE CLIENT IN MULTISTORY BUILDING DATA ACQUISITION FOR STOCHASTIC LOCALIZATION OF WIRELESS MOBILE CLIENT IN MULTISTORY BUILDING Tomohiro Umetani 1 *, Tomoya Yamashita, and Yuichi Tamura 1 1 Department of Intelligence and Informatics, Konan

More information

Biomimetic Design of Actuators, Sensors and Robots

Biomimetic Design of Actuators, Sensors and Robots Biomimetic Design of Actuators, Sensors and Robots Takashi Maeno, COE Member of autonomous-cooperative robotics group Department of Mechanical Engineering Keio University Abstract Biological life has greatly

More information

Development of a Simulator of Environment and Measurement for Autonomous Mobile Robots Considering Camera Characteristics

Development of a Simulator of Environment and Measurement for Autonomous Mobile Robots Considering Camera Characteristics Development of a Simulator of Environment and Measurement for Autonomous Mobile Robots Considering Camera Characteristics Kazunori Asanuma 1, Kazunori Umeda 1, Ryuichi Ueda 2, and Tamio Arai 2 1 Chuo University,

More information

The Humanoid Robot ARMAR: Design and Control

The Humanoid Robot ARMAR: Design and Control The Humanoid Robot ARMAR: Design and Control Tamim Asfour, Karsten Berns, and Rüdiger Dillmann Forschungszentrum Informatik Karlsruhe, Haid-und-Neu-Str. 10-14 D-76131 Karlsruhe, Germany asfour,dillmann

More information

Adaptive Inverse Control with IMC Structure Implementation on Robotic Arm Manipulator

Adaptive Inverse Control with IMC Structure Implementation on Robotic Arm Manipulator Adaptive Inverse Control with IMC Structure Implementation on Robotic Arm Manipulator Khalid M. Al-Zahrani echnical Support Unit erminal Department, Saudi Aramco P.O. Box 94 (Najmah), Ras anura, Saudi

More information

Mekanisme Robot - 3 SKS (Robot Mechanism)

Mekanisme Robot - 3 SKS (Robot Mechanism) Mekanisme Robot - 3 SKS (Robot Mechanism) Latifah Nurahmi, PhD!! latifah.nurahmi@gmail.com!! C.250 First Term - 2016/2017 Velocity Rate of change of position and orientation with respect to time Linear

More information

Navigation of Transport Mobile Robot in Bionic Assembly System

Navigation of Transport Mobile Robot in Bionic Assembly System Navigation of Transport Mobile obot in Bionic ssembly System leksandar Lazinica Intelligent Manufacturing Systems IFT Karlsplatz 13/311, -1040 Vienna Tel : +43-1-58801-311141 Fax :+43-1-58801-31199 e-mail

More information

Adaptive Dynamic Simulation Framework for Humanoid Robots

Adaptive Dynamic Simulation Framework for Humanoid Robots Adaptive Dynamic Simulation Framework for Humanoid Robots Manokhatiphaisan S. and Maneewarn T. Abstract This research proposes the dynamic simulation system framework with a robot-in-the-loop concept.

More information

Hardware Experiments of Humanoid Robot Safe Fall Using Aldebaran NAO

Hardware Experiments of Humanoid Robot Safe Fall Using Aldebaran NAO Hardware Experiments of Humanoid Robot Safe Fall Using Aldebaran NAO Seung-Kook Yun and Ambarish Goswami Abstract Although the fall of a humanoid robot is rare in controlled environments, it cannot be

More information

Active Stabilization of a Humanoid Robot for Real-Time Imitation of a Human Operator

Active Stabilization of a Humanoid Robot for Real-Time Imitation of a Human Operator 2012 12th IEEE-RAS International Conference on Humanoid Robots Nov.29-Dec.1, 2012. Business Innovation Center Osaka, Japan Active Stabilization of a Humanoid Robot for Real-Time Imitation of a Human Operator

More information

Biologically Inspired Embodied Evolution of Survival

Biologically Inspired Embodied Evolution of Survival Biologically Inspired Embodied Evolution of Survival Stefan Elfwing 1,2 Eiji Uchibe 2 Kenji Doya 2 Henrik I. Christensen 1 1 Centre for Autonomous Systems, Numerical Analysis and Computer Science, Royal

More information

Active sway control of a gantry crane using hybrid input shaping and PID control schemes

Active sway control of a gantry crane using hybrid input shaping and PID control schemes Home Search Collections Journals About Contact us My IOPscience Active sway control of a gantry crane using hybrid input shaping and PID control schemes This content has been downloaded from IOPscience.

More information

Chapter 2 Introduction to Haptics 2.1 Definition of Haptics

Chapter 2 Introduction to Haptics 2.1 Definition of Haptics Chapter 2 Introduction to Haptics 2.1 Definition of Haptics The word haptic originates from the Greek verb hapto to touch and therefore refers to the ability to touch and manipulate objects. The haptic

More information

REINFORCEMENT LEARNING (DD3359) O-03 END-TO-END LEARNING

REINFORCEMENT LEARNING (DD3359) O-03 END-TO-END LEARNING REINFORCEMENT LEARNING (DD3359) O-03 END-TO-END LEARNING RIKA ANTONOVA ANTONOVA@KTH.SE ALI GHADIRZADEH ALGH@KTH.SE RL: What We Know So Far Formulate the problem as an MDP (or POMDP) State space captures

More information

Teaching Mechanical Students to Build and Analyze Motor Controllers

Teaching Mechanical Students to Build and Analyze Motor Controllers Teaching Mechanical Students to Build and Analyze Motor Controllers Hugh Jack, Associate Professor Padnos School of Engineering Grand Valley State University Grand Rapids, MI email: jackh@gvsu.edu Session

More information

Jane Li. Assistant Professor Mechanical Engineering Department, Robotic Engineering Program Worcester Polytechnic Institute

Jane Li. Assistant Professor Mechanical Engineering Department, Robotic Engineering Program Worcester Polytechnic Institute Jane Li Assistant Professor Mechanical Engineering Department, Robotic Engineering Program Worcester Polytechnic Institute (6 pts )A 2-DOF manipulator arm is attached to a mobile base with non-holonomic

More information

HMM-based Error Recovery of Dance Step Selection for Dance Partner Robot

HMM-based Error Recovery of Dance Step Selection for Dance Partner Robot 27 IEEE International Conference on Robotics and Automation Roma, Italy, 1-14 April 27 ThA4.3 HMM-based Error Recovery of Dance Step Selection for Dance Partner Robot Takahiro Takeda, Yasuhisa Hirata,

More information

Using Policy Gradient Reinforcement Learning on Autonomous Robot Controllers

Using Policy Gradient Reinforcement Learning on Autonomous Robot Controllers Using Policy Gradient Reinforcement on Autonomous Robot Controllers Gregory Z. Grudic Department of Computer Science University of Colorado Boulder, CO 80309-0430 USA Lyle Ungar Computer and Information

More information

Identification of a Piecewise Controller of Lateral Human Standing Based on Returning Recursive-Least-Square Method

Identification of a Piecewise Controller of Lateral Human Standing Based on Returning Recursive-Least-Square Method IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) November -,. Tokyo, Japan Identification of a Piecewise Controller of Lateral Human Standing Based on Returning Recursive-Least-Square

More information

Traffic Control for a Swarm of Robots: Avoiding Group Conflicts

Traffic Control for a Swarm of Robots: Avoiding Group Conflicts Traffic Control for a Swarm of Robots: Avoiding Group Conflicts Leandro Soriano Marcolino and Luiz Chaimowicz Abstract A very common problem in the navigation of robotic swarms is when groups of robots

More information

Multi-robot Formation Control Based on Leader-follower Method

Multi-robot Formation Control Based on Leader-follower Method Journal of Computers Vol. 29 No. 2, 2018, pp. 233-240 doi:10.3966/199115992018042902022 Multi-robot Formation Control Based on Leader-follower Method Xibao Wu 1*, Wenbai Chen 1, Fangfang Ji 1, Jixing Ye

More information

AGENT PLATFORM FOR ROBOT CONTROL IN REAL-TIME DYNAMIC ENVIRONMENTS. Nuno Sousa Eugénio Oliveira

AGENT PLATFORM FOR ROBOT CONTROL IN REAL-TIME DYNAMIC ENVIRONMENTS. Nuno Sousa Eugénio Oliveira AGENT PLATFORM FOR ROBOT CONTROL IN REAL-TIME DYNAMIC ENVIRONMENTS Nuno Sousa Eugénio Oliveira Faculdade de Egenharia da Universidade do Porto, Portugal Abstract: This paper describes a platform that enables

More information

Speed Control of a Pneumatic Monopod using a Neural Network

Speed Control of a Pneumatic Monopod using a Neural Network Tech. Rep. IRIS-2-43 Institute for Robotics and Intelligent Systems, USC, 22 Speed Control of a Pneumatic Monopod using a Neural Network Kale Harbick and Gaurav S. Sukhatme! Robotic Embedded Systems Laboratory

More information

Robot Joint Angle Control Based on Self Resonance Cancellation Using Double Encoders

Robot Joint Angle Control Based on Self Resonance Cancellation Using Double Encoders Robot Joint Angle Control Based on Self Resonance Cancellation Using Double Encoders Akiyuki Hasegawa, Hiroshi Fujimoto and Taro Takahashi 2 Abstract Research on the control using a load-side encoder for

More information

Nao Devils Dortmund. Team Description for RoboCup Matthias Hofmann, Ingmar Schwarz, and Oliver Urbann

Nao Devils Dortmund. Team Description for RoboCup Matthias Hofmann, Ingmar Schwarz, and Oliver Urbann Nao Devils Dortmund Team Description for RoboCup 2014 Matthias Hofmann, Ingmar Schwarz, and Oliver Urbann Robotics Research Institute Section Information Technology TU Dortmund University 44221 Dortmund,

More information

ECE 517: Reinforcement Learning in Artificial Intelligence

ECE 517: Reinforcement Learning in Artificial Intelligence ECE 517: Reinforcement Learning in Artificial Intelligence Lecture 17: Case Studies and Gradient Policy October 29, 2015 Dr. Itamar Arel College of Engineering Department of Electrical Engineering and

More information

Simple Path Planning Algorithm for Two-Wheeled Differentially Driven (2WDD) Soccer Robots

Simple Path Planning Algorithm for Two-Wheeled Differentially Driven (2WDD) Soccer Robots Simple Path Planning Algorithm for Two-Wheeled Differentially Driven (2WDD) Soccer Robots Gregor Novak 1 and Martin Seyr 2 1 Vienna University of Technology, Vienna, Austria novak@bluetechnix.at 2 Institute

More information

Kalman Filtering, Factor Graphs and Electrical Networks

Kalman Filtering, Factor Graphs and Electrical Networks Kalman Filtering, Factor Graphs and Electrical Networks Pascal O. Vontobel, Daniel Lippuner, and Hans-Andrea Loeliger ISI-ITET, ETH urich, CH-8092 urich, Switzerland. Abstract Factor graphs are graphical

More information

Steering a humanoid robot by its head

Steering a humanoid robot by its head University of Wollongong Research Online Faculty of Engineering and Information Sciences - Papers: Part B Faculty of Engineering and Information Sciences 2009 Steering a humanoid robot by its head Manish

More information

Perception. Read: AIMA Chapter 24 & Chapter HW#8 due today. Vision

Perception. Read: AIMA Chapter 24 & Chapter HW#8 due today. Vision 11-25-2013 Perception Vision Read: AIMA Chapter 24 & Chapter 25.3 HW#8 due today visual aural haptic & tactile vestibular (balance: equilibrium, acceleration, and orientation wrt gravity) olfactory taste

More information

Dynamic analysis and control of a Hybrid serial/cable driven robot for lower-limb rehabilitation

Dynamic analysis and control of a Hybrid serial/cable driven robot for lower-limb rehabilitation Dynamic analysis and control of a Hybrid serial/cable driven robot for lower-limb rehabilitation M. Ismail 1, S. Lahouar 2 and L. Romdhane 1,3 1 Mechanical Laboratory of Sousse (LMS), National Engineering

More information

Model-based Fall Detection and Fall Prevention for Humanoid Robots

Model-based Fall Detection and Fall Prevention for Humanoid Robots Model-based Fall Detection and Fall Prevention for Humanoid Robots Thomas Muender 1, Thomas Röfer 1,2 1 Universität Bremen, Fachbereich 3 Mathematik und Informatik, Postfach 330 440, 28334 Bremen, Germany

More information