Machine Learning Othello Project


Tom Barry

The assignment. We have been provided with a genetic programming framework written in Java, an intelligent Othello player (EDGAR), and a random player. The initial framework has 13 primitives, comprising the simple operations of addition, subtraction, multiplication, and division together with numeric data about board positions. This provides the basic tools to evolve board evaluation functions with which to create Othello players. One of the goals of the assignment is to compare, contrast, and discover methods of approaching Othello with GP.

Motivation for my experiment. I was not familiar with Othello prior to this exercise, so I initially invested several hours being soundly defeated by EDGAR, the AI player provided with the assignment. A recurring theme in these games was my lack of alternatives toward the end of the game. This reminded me of the chess concept of zugzwang, the forced move. Although zugzwang can occur in the middlegame or the endgame, it is most often associated with king and pawn endgames. The main thrust is that while a player's position is acceptable as it stands, any move he makes significantly diminishes it. The other observation I had, one reinforced by the Evans/Schiffman paper, is that the endgame of Othello is more important than the opening. Evans/Schiffman used a player that moved randomly for several moves and then began to use EDGAR; they found that EDGAR was often strong enough to compensate for the weak start.

The experiment. In order to explore the issues above, I generated populations using 3 sets of primitives.

Base Case. These are the primitives provided with the assignment. They include the operators "+, -, *, /" for addition, subtraction, multiplication, and division, respectively. They also include the terminals white, black, white_edges, black_edges, white_corners, black_corners, white_near_corners, black_near_corners, and an integer constant.

Case 1. In addition to the Base Case primitives, two terminals, black_available_moves and white_available_moves, were added. These terminals indicate the number of legal moves available to each color. It was hoped that this would permit the evaluation function to take zugzwang into account.

Case 2. In addition to the primitives in Case 1, the move_number terminal was added, calculated as the number of pieces on the board minus 4 (the count in the initial position). It was created to provide a tool for measuring the passage of time.
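To make the added terminals concrete, the sketch below shows how they could be computed from a board state. It is illustrative only: OthelloBoard, Color, and the method names are hypothetical stand-ins for the framework's actual classes, not its real API.

    interface OthelloBoard {
        boolean isLegalMove(int row, int col, Color color); // can this color legally play here?
        int pieceCount(Color color);                        // pieces of this color on the board
    }

    enum Color { BLACK, WHITE }

    final class OthelloTerminals {
        // Case 1 terminal: mobility, the number of legal moves open to a color.
        // Having few or no moves late in the game is the Othello analogue of
        // zugzwang that these terminals were meant to expose.
        static int availableMoves(OthelloBoard board, Color color) {
            int count = 0;
            for (int row = 0; row < 8; row++) {
                for (int col = 0; col < 8; col++) {
                    if (board.isLegalMove(row, col, color)) {
                        count++;
                    }
                }
            }
            return count;
        }

        // Case 2 terminal: move number, i.e. pieces on the board minus the
        // 4 pieces present in Othello's starting position.
        static int moveNumber(OthelloBoard board) {
            return board.pieceCount(Color.BLACK) + board.pieceCount(Color.WHITE) - 4;
        }
    }

An evolved tree then combines such terminal values with +, -, *, and / into a single board score.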

A fixed-size population was used and run for 29 generations. The probability of breeding was 50%; this rate was selected so that half of the existing population would be carried into the next generation. The populations were trained against EDGAR. The lower the fitness score the better: the fitness score is equal to the number of the opponent's pieces remaining at the end of the game, so a fitness measure below 32 implies the new player won the game.

[Graph: Mean fitness score by generation (1-29) for Base Case, Case 1, and Case 2.]

It is worth commenting that the dramatic improvement in Base Case mean fitness was also accompanied by a dramatic decline in diversity; in other words, a few successful players became dominant.

In order to see whether the results were generalizable, seven of the best unique players were taken from each of generations 0, 5, 15, and 28 of each case, for 28 players per case. Unique, for this purpose, was defined as having at least one of the variables fitness, length, or depth differ between two players. Each of these players then played 25 games against a random player. The results were surprisingly bimodal: players either won or lost the large majority of their 25 games. The fitness results from these tests are on an equivalent basis with the EDGAR results.

                 EDGAR fitness   Random fitness   Players beating the random player (of 28)
    Base Case        16.1            27.8                         19
    Case 1           18.3            32.3                         13
    Case 2           20.4            33.7                          9

There are several interesting observations.
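For concreteness, here is a minimal sketch of that fitness measure, assuming for illustration only that the evolved player is black and EDGAR is white; Board and its method are hypothetical stand-ins for the framework's board class.

    interface Board {
        int pieceCount(Stone stone); // pieces of this color on the final board
    }

    enum Stone { BLACK, WHITE }

    final class Fitness {
        // Fitness = the opponent's (EDGAR's, playing white here) pieces
        // remaining at the end of the game; lower is better.
        static int score(Board finalBoard) {
            return finalBoard.pieceCount(Stone.WHITE);
        }

        // With 64 squares, fewer than 32 opponent pieces at the end implies
        // the evolved player finished with the majority and won.
        static boolean isWin(int fitness) {
            return fitness < 32;
        }
    }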

Beating EDGAR did not assure victory against a random player; many EDGAR savants were created. Better performance against EDGAR did, however, indicate a higher probability of defeating the random player: in the Base Case, 19 of the 28 players defeated the random player, while only 9 of the Case 2 players did as well. The additional primitives seemed to diminish performance against the random player.

In order to gain some insight into the broader population, I examined the final generation of each of the 3 cases. Each member of that generation played 5 games against the random player. Results were similar across the three cases, so I have selected the Base Case for these graphs.

[Graph: Base Case final generation, fitness against the random player (y-axis) vs. fitness against EDGAR (x-axis).]

The fitness measure against EDGAR is on the x-axis. Since this was the fitness measure used for training, you can see a reasonable amount of concentration. No effort was made to eliminate duplicates. As the vertical bands show, players with the same capability against EDGAR varied in their performance against the random player. Those players in the region bounded by 32 on both axes are the ones who beat both opponents.
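The per-player test against the random opponent amounts to the small loop sketched below. Matchup is a hypothetical hook for playing a single game; the below-32 win criterion is the same one used in training against EDGAR.

    interface Matchup {
        // Plays one game between the evolved player and the random mover and
        // returns the fitness: the random player's pieces remaining at the end.
        int play(String evolvedProgram, long randomSeed);
    }

    final class GeneralizationTest {
        // Counts how many of the games the evolved player wins outright.
        static int winsVersusRandom(Matchup matchup, String evolvedProgram, int games) {
            int wins = 0;
            for (int g = 0; g < games; g++) {
                if (matchup.play(evolvedProgram, g) < 32) {
                    wins++;
                }
            }
            return wins;
        }
    }

Under this criterion, a player winning more than half of its 25 games is one counted in the "beating the random player" column of the table above.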

[Graph: Base Case, program length (y-axis) vs. fitness against the random player (x-axis).]

The above graph compares fitness against the random player (x-axis) with the length of the player. Although there is not a substantial bias, it does appear that a longer player is more likely to defeat the random player; this is indicated by the higher density in the upper-left quadrant as compared to the upper-right. But there does not appear to be any strong preference for short programs over long ones.

[Graph: Base Case, program length (y-axis) vs. fitness against EDGAR (x-axis).]

As you would expect, EDGAR is a more difficult competitor, so there is a certain scarcity at the left-hand portion of the graph. There does not appear to be any relationship between the length of a program and the likelihood of success against EDGAR.

Conclusion. It was somewhat disappointing that the additional primitives added no apparent value. It was, however, striking that players which could defeat EDGAR would fare so badly against a random player. The cautionary lesson is that if you wish to achieve generalization, you must make sure that your training regime is varied. But there is good news as well: if faced with a complicated but very specific problem that does not require generalization, genetic algorithms may be a very effective approach even without extended, diverse training.