Ensemble Evolution of Checkers Players with Knowledge of Opening, Middle and Endgame

Kyung-Joong Kim and Sung-Bae Cho
Department of Computer Science, Yonsei University
134 Shinchon-dong, Sudaemoon-ku, Seoul 120-749, South Korea
{kjkim, sbcho}@cs.yonsei.ac.kr

Abstract. In this paper, we argue that inserting domain knowledge into an ensemble of diverse evolutionary checkers players can produce improved strategies and reduce evolution time by restricting the search space. The evolutionary approach to games differs from the traditional one, which exploits knowledge of the opening, middle, and endgame stages. Because the evolutionary approach is based purely on bottom-up construction, it is sometimes inefficient at evolving simple heuristics that humans find easily. We propose the systematic insertion of opening knowledge and an endgame database into the framework of evolutionary checkers. In addition, the common knowledge that a combination of diverse strategies is better than the single best one is inserted into the middle stage and implemented using a crowding algorithm and a strategy combination scheme. Experimental results show that the proposed method is promising for generating better strategies.

1 Introduction

Incorporating a priori knowledge, such as expert knowledge, meta-heuristics, human preferences, and most importantly domain knowledge discovered during evolutionary search, into evolutionary algorithms has gained increasing interest in recent years [1]. In this paper, we propose a method for systematically inserting expert knowledge into an evolutionary checkers framework at the opening, middle, and endgame stages. In the opening stage, openings defined by the American Checkers Federation (ACF) are used. In previous work, we used speciation techniques to search for diverse strategies that embody different styles of game play and combined them using voting for higher performance [2]. This idea comes from the common knowledge that a combination of diverse well-playing strategies can defeat the best single one because the strategies complement each other.
Finally, we use an endgame database from Chinook, the first man-machine checkers champion. Figure 1 explains the conceptual framework of the proposed method. The most important idea is the systematic integration of three kinds of domain knowledge (opening DB, middle-stage knowledge, and endgame DB). The middle-stage knowledge comes from a Korean event in the game of Go. In 2003, the Internet site TYGEM (http://www.tygem.co.kr) held a many-to-one style game between Hoon Hyun Cho, one of the greatest Go players, and 3000 amateur players. The winner of the game was Cho. After the game, he said that it was a very difficult game because the amateur players made no obvious mistakes. A speciation algorithm for evolutionary checkers is adopted to implement this knowledge.

Q. Yang and G. Webb (Eds.): PRICAI 2006, LNAI 4099, pp. 950-954, 2006. Springer-Verlag Berlin Heidelberg 2006

[Fig. 1. Conceptual diagram of the proposed method. Panel titles: Speciated neural network evolution; Checkers game playing. Block labels: Opening DB, Generating Neural Network Population, Genetic Operation, Two Similar Neural Networks, Game Organizer, Generating Game Tree, Board Evaluation of Leaf Nodes Using NN, Next Generation, Applying Crowding Algorithm, Winner Neural Network, Endgame DB, Decision of Next Move, Min-Max Search.]

2 Incorporating Knowledge into Evolutionary Checkers

2.1 Opening Stage

The opening move is the most important opportunity to defeat an expert player because trivial mistakes in the opening can lead to an early loss. The first move in checkers is played by red, and there are seven choices (9-13, 9-14, 10-14, 10-15, 11-15, 11-16, and 12-16). Usually 11-15 is the best move for red, but there are many alternatives. The seven moves are known by specific names: Edinburgh, Double Corner, Denny, Kelso, Old Faithful, Bristol, and Dundee, respectively. For each choice, there are many well-established move sequences that range in length from 2 to 10. The longest sequence is the White Doctor: 11-16, 22-18, 10-14, 25-22, 8-11, 24-20, 16-19, 23-16, 14-23, 26-19. Careful analysis over decades of tournament play has proven the usefulness or fairness of the opening sequences. Initial moves are decided by the opening book until play leaves the book. Each player chooses its opening randomly, and the seven first moves have the same probability of being selected.

2.2 Evolutionary Speciated Checkers

Following Fogel [3], a checkers board is represented by a vector of length 32 whose components take a value in {-K, -1, 0, +1, +K}, where K is the value assigned to a king, 1 is the value of a regular checker, and 0 represents an empty square. To reflect spatial features of the board configuration, sub-boards of the board are used as input. One board has 36 3×3 sub-boards, 25 4×4 sub-boards, 16 5×5 sub-boards, 9 6×6 sub-boards, 4 7×7 sub-boards, and 1 8×8 sub-board, so 36 + 25 + 16 + 9 + 4 + 1 = 91 sub-boards are used as input to the feed-forward neural network. The sign of a value indicates whether the piece belongs to the player or to the opponent. The closer the output of the network is to 1.0, the better the position; the closer the output is to -1.0, the worse the position. The architecture of the network is fixed, and only the weights can be adjusted by evolution.

Each individual in the population represents a neural network (weights and biases) that is used to evaluate the quality of the board configuration. Additionally, each neural network carries its own value of K and self-adaptive parameters for its weights and biases. An offspring $P'_i$, $i = 1, \ldots, p$, is created from each parent $P_i$, $i = 1, \ldots, p$, by

$$\sigma'_i(j) = \sigma_i(j) \exp(\tau N_j(0,1)), \quad j = 1, \ldots, N_w$$
$$w'_i(j) = w_i(j) + \sigma'_i(j) N_j(0,1), \quad j = 1, \ldots, N_w$$

where $w_i(j)$ is the $j$-th weight or bias of parent $i$, $\sigma_i(j)$ is its self-adaptive parameter, $N_w$ is the number of weights and biases in the neural network (here 5046), $\tau = 1/\sqrt{2\sqrt{N_w}} = 0.0839$, and $N_j(0,1)$ is a standard Gaussian random variable resampled for every $j$.

In fitness evaluation, each individual chooses five opponents from the population pool and plays a game against each of them. Fitness increases by 1 for a win, while the fitness of an opponent decreases by 2 for a loss. In a draw, the fitness values of both players remain the same. After all the games are played, the fitness values of all players are determined. In this paper, we use a crowding algorithm [4], a popular form of speciation algorithm, to search for diverse neural networks.
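The self-adaptive mutation given by the two update equations in Section 2.2 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `mutate`, the flat-list representation of weights, and the choice to draw independent Gaussian samples for the step-size update and the weight update are all assumptions.

```python
import math
import random

N_W = 5046  # number of weights and biases stated in the text
TAU = 1.0 / math.sqrt(2.0 * math.sqrt(N_W))  # evaluates to about 0.0839


def mutate(weights, sigmas, rng=random):
    """Create one offspring from one parent by self-adaptive Gaussian mutation.

    weights and sigmas are equal-length lists of floats (hypothetical layout).
    Assumption: the two N_j(0,1) terms are drawn independently for each j.
    """
    new_weights = []
    new_sigmas = []
    for w, s in zip(weights, sigmas):
        # sigma'(j) = sigma(j) * exp(tau * N_j(0,1))
        s_new = s * math.exp(TAU * rng.gauss(0.0, 1.0))
        # w'(j) = w(j) + sigma'(j) * N_j(0,1)
        w_new = w + s_new * rng.gauss(0.0, 1.0)
        new_sigmas.append(s_new)
        new_weights.append(w_new)
    return new_weights, new_sigmas
```

Because the step size is multiplied by an exponential, it stays positive, so each weight keeps a valid standard deviation for its own perturbation from one generation to the next.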
In the crowding algorithm used here, one neural network is selected from two similar individuals based on the result of a game played between them. (A crowding algorithm usually compares their fitness, but fitness cannot be used in this case because the fitness landscape is dynamic.) A crowding algorithm is one of the representative speciation methods that attempt to discover diverse species in a search space. The distance between two neural networks is the Euclidean distance between their weights. To discover clusters of arbitrary shape among the individuals in the population at the last generation, density-based clustering methods are used; DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is one such algorithm [5]. A representative player for each cluster is chosen by a tournament among all players in that cluster. The move of the combined player is determined by simple voting among the representative players: the move with the greatest number of votes is picked. If there is no clear winner, one of the moves with the most votes is selected at random.

2.3 Endgame Stage

The estimated quality of the board is calculated by using the evolved neural networks to evaluate the leaf nodes of the game tree with the min-max algorithm. If the value of f (the estimated goodness of the next moves) is not reliable, we refer to the domain-specific knowledge and revise f. The decision rule for querying the domain knowledge must be defined in advance, as follows:

IF (0.25 < f < 0.75) or (-0.75 < f < -0.25) THEN query the domain knowledge.

3 Experimental Results

The non-speciated evolutionary algorithm uses a population size of 15 and limits the run to 60 generations. The speciated evolutionary algorithm likewise uses a population size of 15 and 60 generations. The mutation rate is 0.01 and the crossover rate is 1.0. The number of leagues, which is used to select the best player from each species, is 5: each player selects 5 players from the species randomly, and the competition results are used for the selection.

[Fig. 2. Comparison of running time (simple evolution). Axes: time (hours) versus generation. Series: with knowledge, without knowledge.]

The Chinook endgame DB (2~6 pieces) is used for revision when the estimated value from the neural network is between 0.25 and 0.75 or between -0.25 and -0.75. Time analysis indicates that evolution with knowledge takes much less time than evolution without knowledge in simple evolution (Figure 2). This means that inserting knowledge within a limited scope can accelerate the evolutionary algorithm, because the endgame DB reduces the computation required to find the optimal endgame sequence. Table 1 summarizes the competition results between the best individual in the evolution with knowledge and the best individual in the evolution without knowledge for each generation. The knowledge incorporation model can perform better than the one without knowledge. Table 2 shows the competition results in the speciated evolution. Table 3 shows the effect of the stored knowledge (opening and endgame DB) in speciation.

Table 1. Experimental results on opening and endgame knowledge incorporation (Win/Lose/Draw) for simple evolution. Evolution with the stored knowledge performs better than that without the knowledge. (Op = Opening knowledge, SGA = Simple GA, E = Endgame knowledge.) The first two columns give the color played by each side; the remaining columns are generation ranges.

Op+SGA+E   SGA     Gen. 1~14   Gen. 15~29   Gen. 30~44   Gen. 45~59   Total
Red        White   5/0/10      3/3/9        3/0/12       5/3/7        16/6/38
White      Red     4/3/8       4/2/9        5/4/6        4/2/9        17/11/32
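The endgame query rule of Section 2.3, restated above for the Chinook endgame DB, and the simple voting scheme of Section 2.2 are both small enough to sketch directly. The function names `needs_endgame_lookup` and `vote`, and the use of move strings such as "11-15" as ballots, are illustrative assumptions rather than anything taken from the authors' code.

```python
import random
from collections import Counter


def needs_endgame_lookup(f):
    """Return True when the network's estimate f falls in an uncertain band.

    Implements: IF (0.25 < f < 0.75) or (-0.75 < f < -0.25) THEN query.
    """
    return 0.25 < f < 0.75 or -0.75 < f < -0.25


def vote(moves, rng=random):
    """Pick the move proposed by the most representative players.

    moves holds one proposed move per representative (hypothetical format).
    If several moves share the highest vote count, one is chosen at random.
    """
    counts = Counter(moves)
    top = max(counts.values())
    tied = [move for move, count in counts.items() if count == top]
    if len(tied) == 1:
        return tied[0]
    return rng.choice(tied)
```

For example, `vote(["11-15", "11-15", "9-13"])` returns "11-15", and `needs_endgame_lookup(0.9)` is false because an estimate that close to 1.0 is treated as reliable.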

Table 2. Experimental results on opening and endgame knowledge incorporation (Win/Lose/Draw) for speciated evolution. Evolution with the stored knowledge performs better than that without the knowledge. (S = Speciation.) The first two columns give the color played by each side; the remaining columns are generation ranges.

Op+S+E   Speciated   Gen. 1~14   Gen. 15~29   Gen. 30~44   Gen. 45~59   Total
Red      White       5/1/9       4/3/8        6/0/9        8/2/5        23/6/31
White    Red         7/3/5       5/2/8        8/4/3        6/2/7        26/11/23

Table 3. Competition results between the speciated players using both the opening and endgame DB and the speciated players using only one kind of knowledge.

Op+S+E   Op+S    Total
Red      White   6/2/7
White    Red     8/4/3

Op+S+E   S+E     Total
Red      White   5/5/5
White    Red     4/5/6

Op+S     S+E     Total
Red      White   3/6/6
White    Red     2/7/6

4 Conclusion and Future Work

The final conclusion of the experiment is SGA < Speciated < Op+S < S+E Op+S+E (SGA < Speciated is from the results of [2]). The effect of opening knowledge is small because the opening book contains only a limited number of sequences. The limited opening knowledge can prevent the player from making a big mistake, but it is of little use when the opponent chooses a move that is not included in the opening sequences. Multiple diverse neural networks can perform better than the single best one, but combining them remains a problem, and averaging may not work. In future work, more sophisticated combination methods should be explored for better performance.

References

1. Jin, Y.: Knowledge Incorporation in Evolutionary Computation. Springer (2004)
2. Kim, K.-J. and Cho, S.-B.: Evolving speciated checkers players with crowding algorithm. Proc. of the 2002 Congress on Evolutionary Computation (2002) 407-412
3. Fogel, D.B.: Blondie24: Playing at the Edge of AI. Morgan Kaufmann (2001)
4. Mahfoud, S.W.: Niching methods. Handbook of Evolutionary Computation, C6.1, IOP Publishing and Oxford University Press (1997)
5. Ester, M., Kriegel, H.-P., Sander, J. and Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise. Knowledge Discovery and Data Mining (1996) 226-231