Real-time control and learning using neuro-controller via simultaneous perturbation for flexible arm system

Yutaka Maeda
Department of Electrical Engineering, Kansai University
3-3-35 Yamate-cho, Suita, Osaka, 564-868 JAPAN
PHONE : +81-6-6368-932, FAX : +81-6-6388-8843
E-MAIL: maedayut@kansai-u.ac.jp

Abstract
This paper describes the details of real-time control and real-time learning of a neuro-controller for a flexible arm system using the simultaneous perturbation optimization method. The simultaneous perturbation optimization method is especially useful when the number of parameters to be adjusted is large, which makes it well suited to neural networks. When the ordinary gradient method is used as the learning rule of a neuro-controller, the Jacobian of the plant is essential. In contrast, the learning rule via simultaneous perturbation does not require the Jacobian of the objective plant, so the neural network uses only the outputs of the objective system. Results of actual real-time control and real-time learning on a real flexible arm system are described to confirm the feasibility of the proposed method.

Keywords : Simultaneous perturbation, Neural networks, Neuro-controller, Real-time, Flexible arm

1. Introduction
Neural networks (NNs) have recently been used in many fields. Their non-linear information processing capability is particularly attractive. In the field of control, NNs are a promising tool. A neuro-controller (NC) in a direct inverse control scheme (see Fig.1) is one promising approach to non-linear control problems. In order to use an NC as a direct controller, the NC must be an inverse system of the objective plant; that is, the NC must learn an inverse system in so-called indirect inverse modeling. The learning rule then plays a practical role, because it affects the arrangement of the overall system.
In the case of indirect inverse modeling, we generally need a plant model or a sensitivity function of the plant to acquire the derivatives needed for learning rules such as the back-propagation (BP) method, because the error function is usually defined not by the output of the NN but by that of the plant (see Fig.1(a)).

[Fig.1 A basic scheme for a direct neuro-controller. (a) Learning by the back-propagation, which uses the Jacobian of the plant. (b) Learning by the SP method.]

In this paper, we propose an NC using the simultaneous perturbation learning rule for a flexible arm system. This learning rule does not require a derivative of the error function but only values of the error function itself. Therefore, we can design a direct neuro-controller without knowing the Jacobian of the objective arm system. The neuro-controller can then also learn changes in the environment of the system.

Ordinarily, an error function J(w) is defined by the squared error of the plant output. When we use a usual gradient method as the learning rule of the NC, we must know the quantity $\partial J(w)/\partial w$. Then we have

$$ \frac{\partial J(w)}{\partial w} = \frac{\partial J(w)}{\partial y}\,\frac{\partial y}{\partial w} = (y - y_d)\,\frac{\partial f(u)}{\partial u}\,\frac{\partial u}{\partial w} \tag{1} $$

where $y_d$ denotes the desired output of the plant. Therefore, $(y - y_d)$ is known. Moreover, we can calculate $\partial u/\partial w$ as in the back-propagation learning rule. However, we do not know the sensitivity function $\partial f(u)/\partial u$ of the plant. As a result, we cannot obtain proper modifying quantities for the weights in the NC.

On the other hand, we can introduce the idea of the simultaneous perturbation learning rule. In this case, there is no need to know the sensitivity function of the unknown plant. Using only the values of an error function, the learning rule can estimate the gradient of the error function with respect to the adjustable parameters, which are the weights of the NC in this case. Moreover, the algorithm of the simultaneous perturbation is very simple. This implies easy implementation and real-time learning of NCs. In fact, real-time learning of the NC was achieved in this research. The NC learns an inverse system of the objective plant and controls the plant at the same time. Even if the environment changes, e.g. the mass of the equipment changes, the NC can adapt to the new situation by learning. In a usual control scheme such as state feedback control, if the environment changes, we have to recalculate the parameters used in the controller, such as the feedback gain. From these points of view, NCs using the simultaneous perturbation are significant in control problems.

2. Simultaneous perturbation learning rule
The idea of the simultaneous perturbation was proposed by J.C.Spall as an extension of the Kiefer-Wolfowitz stochastic approximation[1][2]. J.Alspector et al. and G.Cauwenberghs also proposed the same idea[3][4]. Independently, Y.Maeda introduced the same algorithm as a learning rule of neural networks[5][6]. J.C.Spall et al. and Y.Maeda reported some applications of the simultaneous perturbation method in control problems[7][8][9].

Now, we describe the simultaneous perturbation learning rule used in this paper. Define a weight vector and a sign vector as follows:

$$ w(t) = \left( w_1(t), w_2(t), \ldots, w_n(t) \right)^T, \qquad s(t) = \left( s_1(t), s_2(t), \ldots, s_n(t) \right)^T \tag{2} $$

where t denotes the iteration and the superscript T denotes the transpose of a vector. s(t) is a sign vector whose components are +1 or -1. The i-th component of the modifying vector $\Delta w(t)$ of the weights is defined as follows:

$$ \Delta w_i(t) = \frac{J(w(t) + c\,s(t)) - J(w(t))}{c}\, s_i(t) \tag{3} $$

where c is the magnitude of the perturbation.
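A short worked expansion may help to see why Eq.(3) behaves like a gradient estimate. It is only a sketch under assumptions that go beyond the text above: J is taken to be sufficiently smooth, and the components $s_i(t)$ are taken to be mutually independent, each being +1 or -1 with equal probability. The expectation symbol E is introduced here for illustration.

```latex
\begin{align*}
J(w + c\,s) &= J(w) + c \sum_{j=1}^{n} s_j \frac{\partial J(w)}{\partial w_j} + O(c^2), \\
\Delta w_i &= \frac{J(w + c\,s) - J(w)}{c}\, s_i
  = \frac{\partial J(w)}{\partial w_i}\, s_i^2
  + \sum_{j \neq i} s_i s_j \frac{\partial J(w)}{\partial w_j} + O(c), \\
E\left[\Delta w_i\right] &= \frac{\partial J(w)}{\partial w_i} + O(c),
\end{align*}
```

The last line uses $s_i^2 = 1$ and $E[s_i s_j] = 0$ for $i \neq j$. Under these assumptions, the modifying quantity equals the partial derivative on average, up to a term of order c, even though only two values of J are evaluated.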
The weights are updated as follows:

$$ w(t+1) = w(t) - \alpha\, \Delta w(t) \tag{4} $$

where $\alpha$ is a positive learning coefficient. Note that only two values of the error function, J(w(t)) and J(w(t)+cs(t)), are used to update the weights in the network. No information about the objective plant is included in the learning rule. In this paper, we adopt the simultaneous perturbation with the sign vector, which is equivalent to the random direction type of optimization method. This method is easy to implement. However, we have to pay attention to the difference between the simultaneous perturbation and the random direction method.

3. Simultaneous perturbation for flexible arm system
We consider a one-degree-of-freedom flexible beam, shown in Fig.2, as the objective plant. In a usual control scheme, we must have an exact model of the objective plant, since controllers are basically designed based on the identified model. Therefore, when some characteristics of the plant change, we must detect the change and compensate the controller. In contrast, the scheme used here works without this information, because only values of the error function are required. Without a model of the plant or other information about the plant, the NC via the simultaneous perturbation can generate a proper input for the plant.

[Fig.2 A flexible arm.]

[Fig.3 Picture of the flexible arm.]
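As a rough illustration of the learning rule in Eqs.(2)-(4) of Section 2, the following Python sketch applies one simultaneous-perturbation step repeatedly to a toy error function. The names `sp_update` and `toy_error`, the quadratic error, and the values of c and α are assumptions made for this example only; they are not the error function or the settings of the experiments.

```python
import numpy as np

def sp_update(J, w, c, alpha, rng):
    """One simultaneous-perturbation step following Eqs. (2)-(4)."""
    s = rng.choice([-1.0, 1.0], size=w.shape)   # sign vector s(t), components +1 or -1
    delta_w = (J(w + c * s) - J(w)) / c * s     # Eq. (3): all components from two values of J
    return w - alpha * delta_w                  # Eq. (4)

def toy_error(w):
    # Hypothetical error function with its minimum at w = (1, -2, 0.5)
    target = np.array([1.0, -2.0, 0.5])
    return float(np.sum((w - target) ** 2))

rng = np.random.default_rng(0)
w = np.zeros(3)
initial_error = toy_error(w)
for t in range(2000):
    w = sp_update(toy_error, w, c=0.01, alpha=0.01, rng=rng)
final_error = toy_error(w)
print(initial_error, final_error)
```

Each step costs two evaluations of the error function regardless of the number of weights, which is the property the text relies on when the error can only be measured by operating the plant.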

Moreover, proper modification of the controller is carried out during operation. That is, on-line learning is possible in this learning scheme.

3.1 The flexible beam
A picture of the actual flexible arm is shown in Fig.3. The arm is made of acrylic and a rigid body is attached to the top of the arm. There are some color markers on the beam to measure the states of the arm. The mass of the top of the arm is .145[kg], and the length of the arm is .472[m]. We selected six states to control the plant. x is the six-dimensional state variable as follows:

$$ x = \left( \theta, \dot{\theta}, x_1, \dot{x}_1, \theta_1, \dot{\theta}_1 \right)^T \tag{5} $$

$\tau$ denotes the torque input given as a command to the plant. The inputs of the NC are the six states of Eq.(5), which include the time derivatives, as shown in Fig.4. We know that these six states are necessary to produce a proper command input for the system. Using these states, the neural network outputs a torque $\tau$.

3.2 Neural network
The objective plant is a dynamical system. Therefore, a simple multi-layered neural network is not appropriate for controlling the plant, since such a network has no memory and cannot handle dynamic information processing. Thus we used a multi-layered neural network with feedback. The network used here is shown in Fig.4. The basic construction is a simple multi-layered network. However, the network has time-delayed feedback inputs from its own output. This feedback gives dynamics to the network.

[Fig.4 Neural network used here. Inputs: the desired signal, the states of Eq.(5), and time-delayed copies of the network output; output: torque $\tau$.]

3.3 Control system
Fig.5 is a schematic flow of signals based on the actual equipment. The system consists of the objective flexible arm, a CCD camera, an image processing device and a PC which controls the system and realizes the recurrent neural network. Color markers are attached to both ends of the arm to obtain the arm positions. The states of the flexible arm are monitored by the camera. The image processing device Quick-MAG converts the positions of the color markers on the arm into numerical values. The PC calculates the states from these data and outputs a torque by the NC realized in the PC. This cycle is repeated.

[Fig.5 Flexible arm system. Signal flow: camera, Quick-MAG (position of color markers), PC, torque, flexible arm.]

4. On-line learning by simultaneous perturbation
Fig.6 shows a configuration of the system.

[Fig.6 Overall configuration of the system. Blocks: recurrent neural network (weights), simultaneous perturbation, flexible arm; signals: states x, input torque $\tau$, desired output, output displacement $x_2$.]

The error function is defined as follows:

$$ J(u(w)) = \sum_i \left( x_{2,i} - d_i \right)^2 \tag{6} $$

where $d_i$ denotes the desired position of the top of the arm at the i-th sampling time, and $x_{2,i}$ is the corresponding displacement $x_2$ of the top (Fig.6). That is, the error is the sum of the squared errors of the position of the end over ten seconds. Every trial gives a value of the error function. First, we make a trial without perturbation and obtain a value of the error in Eq.(6). Next, we add perturbations to all weights simultaneously and make another trial. Then we have a value of the error, i.e. J(u(w+cs)). By using Eq.(3) and Eq.(4), we can update all weights in the NC. We repeat this procedure. This is the learning cycle of the NC. This cycle is carried out together with control of the flexible arm in every trial. The overall flowchart of this learning is shown in Fig.7.

[Fig.7 Flowchart. Pre-training process: modeling, training NC by simulation, download of weights. On-line process: operation of the objective plant, calculate the error, add perturbation to all weights, operation of the objective plant, calculate the error, update all weights.]

In some cases pre-training is needed, since operating the plant with an untrained NC is risky. After the pre-training of the neural network, the network is used as a controller of the plant. However, in many preliminary experiments, initial weights of zero for the NC yielded stable results without pre-training.

In the control cycle, based on the measurements, the position of the top of the arm, the angles $\theta_1$ and $\theta_2$ and their derivatives are calculated. These measured data are fed into the NC realized by the PC. The torque calculated by the PC is sent to the flexible arm through a driver. This process is carried out at every sampling time. The sampling time is mainly restricted by the capability of the image processing device.

4.1 Vibration reduction control
Starting from a certain initial state, we would like to reduce the vibration of the top of the flexible arm. Initial weights of the NC are all zero. The learning coefficient $\alpha$ and the perturbation c are both .2. Fig.8 is a result after 3 times of on-line learning by our learning rule. The vibration is reduced compared with the free vibration. The NC controls the actual flexible arm system.

[Fig.8 A vibration control of the flexible arm. Horizontal axis: Time [sec]. Curves: free, 3 times learning.]

4.2 Tracking control
The learning coefficient $\alpha$ and the perturbation c are .2 and .1, respectively. Initial weights of the NC are all zero.

[Fig.9 A result of a tracking control. Horizontal axis: Time [sec.]. Curves: desired locus, 1 times learning, 3 times learning.]
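The learning cycle of Section 4 (a trial without perturbation, a trial with all weights perturbed, then an update by Eq.(3) and Eq.(4)) can be sketched in Python as follows. This is a minimal sketch under several assumptions that are not taken from the experiments: the plant is a toy damped oscillator standing in for the flexible arm, only two plant states are fed to the network, the network has four hidden units and three delayed output feedbacks, the torque is saturated at a hypothetical limit `U_MAX`, and the values of c and α are arbitrary. All function and constant names are illustrative.

```python
import numpy as np

N_IN, N_HID, N_FB = 2, 4, 3      # plant states fed in, hidden units, delayed outputs (assumed sizes)
DT, N_STEPS = 0.05, 200          # 200 steps of 0.05 s give one 10 s trial
U_MAX = 0.5                      # hypothetical actuator limit

def unpack(w):
    """Split the flat weight vector into hidden-layer and output-layer weights."""
    n1 = N_HID * (N_IN + N_FB)
    return w[:n1].reshape(N_HID, N_IN + N_FB), w[n1:]

def run_trial(w, desired):
    """Run one closed-loop trial and return the summed squared error of Eq. (6)."""
    W1, W2 = unpack(w)
    pos, vel = 0.2, 0.0          # same initial state in every trial (assumed)
    fb = np.zeros(N_FB)          # time-delayed copies of the controller output
    err = 0.0
    for i in range(N_STEPS):
        z = np.concatenate(([pos, vel], fb))
        torque = U_MAX * np.tanh(W2 @ np.tanh(W1 @ z))
        fb = np.concatenate(([torque], fb[:-1]))      # shift the delay line
        # Toy plant: damped oscillator, semi-implicit Euler step
        vel += DT * (-4.0 * pos - 0.4 * vel + torque)
        pos += DT * vel
        err += (pos - desired[i]) ** 2
    return float(err)

def learn(n_trials, c, alpha, desired, seed=0):
    """Repeat the two-trial learning cycle and record the unperturbed error."""
    rng = np.random.default_rng(seed)
    w = np.zeros(N_HID * (N_IN + N_FB) + N_HID)       # zero initial weights, as in the text
    history = []
    for _ in range(n_trials):
        j_plain = run_trial(w, desired)               # trial without perturbation
        s = rng.choice([-1.0, 1.0], size=w.shape)     # sign vector
        j_pert = run_trial(w + c * s, desired)        # trial with all weights perturbed
        w = w - alpha * (j_pert - j_plain) / c * s    # Eqs. (3) and (4)
        history.append(j_plain)
    return w, history

desired = np.zeros(N_STEPS)      # vibration reduction: keep the tip at zero
weights, history = learn(n_trials=30, c=0.05, alpha=0.02, desired=desired)
print(history[0], history[-1])
```

The controller never sees a model of `run_trial`'s plant; it only receives the two error values per cycle. Whether and how fast the recorded error falls depends on the assumed plant and on c and α, so the printed values should be inspected rather than taken for granted.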

Fig.9 shows results after 1 times and 3 times of on-line learning by our learning rule. The desired locus is a sinusoidal wave whose amplitude is .2[m] and whose period is 1[sec]. As on-line learning proceeds, the locus approaches the desired one. Fig.10 shows the change of the error for the sinusoidal locus. We can see that the NC works better as the iteration proceeds.

[Fig.10 Change of error. Horizontal axis: Iteration; vertical axis: Error.]

Next, we assigned other loci to the same NC. The weight values learned in the previous example are used as the initial weight values of the NC in this example. The target locus changes randomly in every trial. Fig.11 and Fig.12 show results for different loci. Even for a different locus, the NC tries to adapt to the new target locus. The change of the error for different target loci is depicted in Fig.13. Since the desired locus changes in every trial, the error fluctuates strongly. However, on average, the error decreases as learning proceeds.

[Fig.11 A result of a tracking control for random locus. Horizontal axis: Time [sec.]. Curves: desired locus, before learning, 1 times learning.]

[Fig.12 A result of a tracking control for random locus 2. Horizontal axis: Time [sec.]. Curves: desired locus, before learning, 1 times learning.]

[Fig.13 Change of error for random locus. Horizontal axis: Iteration; vertical axis: Error.]

5. Conclusions
In this paper, a flexible arm system controlled by an NC using the simultaneous perturbation learning rule was described. Moreover, we could apply this scheme to a real-time system. The back-propagation learning rule cannot be applied to this scheme without information on the objective plant. However, by using the simultaneous perturbation learning rule, the NC can learn an inverse of the objective plant.

Acknowledgement
This research is financially supported by Kansai University Frontier Sciences Center and High Technology Research Center. The author would also like to thank Mr.Kubo for his assistance.
References
[1] J.C.Spall (1987), A stochastic approximation technique for generating maximum likelihood parameter estimates, Proceedings of the 1987 American Control Conference, pp.1161-1167
[2] J.C.Spall (1992), Multivariable stochastic approximation using a simultaneous perturbation gradient approximation, IEEE Trans. Automatic Control, vol.37, pp.332-341
[3] J.Alspector, R.Meir, B.Yuhas, A.Jayakumar and D.Lippe (1993), A parallel gradient descent method for learning in analog VLSI neural networks, in S.J.Hanson, J.D.Cowan and C.Lee (eds.), Advances in neural information processing systems 5 (pp.836-844), San Mateo, CA : Morgan Kaufmann Publisher
[4] G.Cauwenberghs (1993), A fast stochastic error-descent algorithm for supervised learning and optimization, in S.J.Hanson, J.D.Cowan and C.Lee (eds.), Advances in neural information processing systems 5 (pp.244-251), San Mateo, CA : Morgan Kaufmann Publisher
[5] Y.Maeda and Y.Kanata (1993), Learning rules for recurrent neural networks using perturbation and their application to neuro-control, Transactions of the Institute of Electrical Engineers of Japan, vol.113-C, pp.42-48 (in Japanese)
[6] Y.Maeda, H.Hirano and Y.Kanata (1995), A learning rule of neural networks via simultaneous perturbation and its hardware implementation, Neural Networks, vol.8, pp.251-259
[7] J.C.Spall and J.A.Cristion (1994), Nonlinear adaptive control using neural networks : Estimation with a smoothed form of simultaneous perturbation gradient approximation, Statistica Sinica, vol.4, pp.1-27
[8] J.C.Spall and D.C.Chin (1994), A model-free approach to optimal signal light timing for system-wide traffic control, Proceedings of the 1994 IEEE Conference on Decision and Control, pp.1868-1875
[9] Y.Maeda and R.J.P.deFigueiredo (1997), Learning rules for neuro-controller via simultaneous perturbation, IEEE Trans. on Neural Networks, vol.8, no.5, pp.1119-1130