Intelligent Non-Player Character with Deep Learning
Meng Zhixiang, Zhang Haoze
Supervised by Prof. Michael Lyu
CUHK CSE FYP Term 1

Background
We all know the results

Agenda
o Background
o Motivation & Objective
o Methodologies
o Design & Implementation
o Results & Discussion
o Conclusion

Agenda
o Background
   o Development of AI in Go, Chess and Chinese Chess
   o Difference among Go, Chess and Chinese Chess
o Motivation & Objective
o Methodologies
o Design & Implementation
o Results & Discussion
o Conclusion

Development of AI in Go
o No good results: Minimax Searching, Pruning
o Zen beat Takemiya Masaki at five stones handicap (Mar 2012): Monte Carlo
o AlphaGo beat Lee Sedol (Mar 2016): Deep Learning

Difference between Go and Chess/Chinese Chess

Development of AI in Chess
o Deep Blue beat Garry Kasparov (May 1997): Minimax Searching, Evaluation Function, Hand-Coded Knowledge
o Stockfish won TCEC 2013, 2014, 2015: Minimax Searching, Evaluation Function, Hand-Coded Knowledge
o Giraffe plays at the level of a FIDE International Master on a PC (Sep 2015): Deep Reinforcement Learning
TCEC: Top Chess Engine Championship
FIDE: World Chess Federation

Difference between Chess and Chinese Chess

Development of AI in Chinese Chess
o Tiansuo Inspur System beat five Grandmaster players (Aug 2006): Minimax Searching, Alpha-Beta Pruning, Hand-Coded Knowledge
o Chess Nade beat three Master players (Nov 2009): Minimax Searching, Alpha-Beta Pruning, Hand-Coded Knowledge
o Now???: Deep Learning???

Motivation

Objective
[Diagram components: Server, Human Player, User Interface, Game AI]

Agenda
o Background
o Motivation & Objective
o Methodologies
   o Supervised Learning
   o Convolutional Neural Network
o Design & Implementation
o Results & Discussion
o Conclusion

Supervised Learning
o Supervised Learning
   o the right answer is given
   o Regression Problem & Classification Problem
o Unsupervised Learning
   o no right answer is given
   o Clustering Problem

Neural Network
o Non-linear Hypotheses
o Neurons and Brain
o Backpropagation

Convolutional Neural Network
o Feed-forward
o Organization of Animal Visual Cortex
o Image Recognition
Local Receptive Fields
Shared Weights and Biases

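A minimal sketch of local receptive fields and shared weights, in plain Python. The 3x3 kernel, stride 1, and absence of padding are assumptions chosen for illustration; the slides do not state the sizes used in the project. Each output value reads only a small patch of the input, and the same kernel and bias are reused at every position.

```python
def conv2d_valid(image, kernel, bias=0.0):
    """Slide one shared kernel over the image (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Local receptive field: only the kh x kw patch at (r, c) is read.
            total = bias
            for i in range(kh):
                for j in range(kw):
                    # Shared weights: the same kernel[i][j] is used at every (r, c).
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# A 10x9 input (the size of a Chinese Chess board) with one occupied point.
board = [[0.0] * 9 for _ in range(10)]
board[4][4] = 1.0
kernel = [[1.0] * 3 for _ in range(3)]  # hypothetical kernel values
feature_map = conv2d_valid(board, kernel)
```

With these assumed sizes, the 10x9 input yields an 8x7 feature map, and only the outputs whose patch covers the occupied point are non-zero.
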
Agenda
o Background
o Motivation & Objective
o Methodologies
o Design & Implementation
   o Project Workflow
o Results & Discussion
o Conclusion

Project Workflow
Model Design -> Model Building -> Model Training -> Model Testing
Model Testing: Accuracy Testing, Real Performance Testing

Design Overview
Game AI
o Policy Network: Predict probabilities of next moves
o Evaluation Network: Evaluate winning rate

Game AI Structure
o Piece Selector
o Message Receiver
o Format Converter
o Feature Extractor
o Decision Maker
o Message Sender
o Move Selector

Feature Channels
Feature Channel 1: Pieces belonging to different sides
Feature Channel 2: Pieces of Advisor type
Feature Channel 3: Pieces of Bishop type
Feature Channel 4: Pieces of Cannon type
Feature Channel 5: Pieces of King type
Feature Channel 6: Pieces of Knight type
Feature Channel 7: Pieces of Pawn type
Feature Channel 8: Pieces of Rook type
Feature Channel 9 (only for Move Selector): Valid moves for the selected piece

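The channel layout above can be sketched as follows. This is an illustration under stated assumptions, not the project's extractor: the board is taken to be 10 strings of 9 characters (uppercase for one side, lowercase for the other, "1" for an empty point), the side channel is encoded as +1/-1, and `extract_features` and `valid_moves` are hypothetical names.

```python
# Piece letters in the slide's channel order:
# advisor, bishop, cannon, king, knight, pawn, rook.
PIECE_TYPES = "abcknpr"


def extract_features(board, valid_moves=None):
    """Return 8 planes, or 9 when valid_moves (a list of (row, col)) is given.

    List index k holds Feature Channel k + 1.
    """
    rows, cols = len(board), len(board[0])
    n_channels = 8 if valid_moves is None else 9
    channels = [[[0] * cols for _ in range(rows)] for _ in range(n_channels)]
    for r, line in enumerate(board):
        for c, ch in enumerate(line):
            if not ch.isalpha():
                continue  # empty point
            # Channel 1: which side owns the piece (assumed +1 / -1 encoding).
            channels[0][r][c] = 1 if ch.isupper() else -1
            # Channels 2-8: one plane per piece type.
            channels[1 + PIECE_TYPES.index(ch.lower())][r][c] = 1
    if valid_moves is not None:
        # Channel 9: destinations the selected piece may move to.
        for r, c in valid_moves:
            channels[8][r][c] = 1
    return channels


start = ["rnbakabnr", "111111111", "1c11111c1", "p1p1p1p1p", "111111111",
         "111111111", "P1P1P1P1P", "1C11111C1", "111111111", "RNBAKABNR"]
features = extract_features(start)
move_features = extract_features(start, valid_moves=[(2, 4)])
```

Passing `valid_moves` yields the nine-plane input described for the Move Selector; omitting it yields the eight planes described for the Piece Selector.
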
Feature Channels
[Figure panels: Chessboard Status, 1st Feature Channel, 4th Feature Channel, 9th Feature Channel]

Piece Selector & Move Selector

Piece Selector & Move Selector
Extracted Features -> First Hidden Convolutional Layer -> Second Hidden Convolutional Layer -> Third Hidden Layer (Softmax Layer) -> Probability Distribution
Activation: Rectified Linear Unit (ReLU)

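For reference, the two functions named on this slide can be written in a few lines of plain Python. This is only a sketch of ReLU and softmax themselves; the input values are made up, and nothing here reproduces the project's layer sizes or weights.

```python
import math


def relu(x):
    """Rectified Linear Unit: pass positive values, clamp the rest to zero."""
    return x if x > 0.0 else 0.0


def softmax(scores):
    """Turn raw scores into a probability distribution that sums to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


hidden = [relu(v) for v in [-1.0, 0.5, 2.0]]  # hypothetical activations
probs = softmax(hidden)
```

The largest score receives the largest probability, which is what the selectors rely on when ranking candidate pieces or destinations.
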
Selection Strategy
o Strategy 1:
   o Select the piece with the highest probability given by Piece Selector
   o Select the destination of that piece with the highest probability given by Move Selector
o Strategy 2:
   o For each piece and each destination of that piece, calculate (probability of moving the piece) * (probability of the destination)
   o Select the combination with the highest probability

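The two strategies can be compared with a small sketch. The dictionaries below stand in for the outputs of Piece Selector and Move Selector; the names, positions, and probabilities are invented for illustration.

```python
def strategy_1(piece_probs, move_probs):
    """Greedy: best piece first, then the best destination for that piece."""
    piece = max(piece_probs, key=piece_probs.get)
    dest = max(move_probs[piece], key=move_probs[piece].get)
    return piece, dest


def strategy_2(piece_probs, move_probs):
    """Joint: maximise P(piece) * P(destination | piece) over all pairs."""
    best, best_score = None, -1.0
    for piece, p_piece in piece_probs.items():
        for dest, p_dest in move_probs[piece].items():
            score = p_piece * p_dest
            if score > best_score:
                best, best_score = (piece, dest), score
    return best


# Hypothetical outputs: keys are (row, col) board positions.
piece_probs = {(7, 1): 0.5, (9, 1): 0.4, (6, 0): 0.1}
move_probs = {
    (7, 1): {(7, 4): 0.40, (7, 2): 0.35, (7, 3): 0.25},
    (9, 1): {(7, 2): 0.9, (7, 0): 0.1},
    (6, 0): {(5, 0): 1.0},
}
choice_1 = strategy_1(piece_probs, move_probs)
choice_2 = strategy_2(piece_probs, move_probs)
```

In this made-up case the strategies disagree: Strategy 1 commits to the most likely piece (joint probability 0.5 * 0.40 = 0.20), while Strategy 2 prefers a slightly less likely piece with a much more confident destination (0.4 * 0.9 = 0.36).
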
TensorFlow
o an open source software library
o for numerical computation
o using data flow graphs
o flexibility and portability

Training Dataset
Collected Game Records -> Features and Targets -> Training Dataset for Different NN models

FEN Format
rnbakab1r/111111111/1c1111nc1/p1p1p1p1p/111111111/111111111/P1P1P1P1P/1C11C1111/111111111/RNBAKABNR, r

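A string in the form shown on this slide can be split into board rows with a few lines of Python. This sketch assumes the variant shown here, where every empty point is written as "1" so each row has exactly 9 characters, and it assumes the trailing ", r" marks the side to move; `parse_fen` is a hypothetical helper name.

```python
def parse_fen(fen):
    """Split the slide's FEN-style string into 10 board rows and a side flag."""
    placement, side = [part.strip() for part in fen.split(",")]
    rows = placement.split("/")
    if len(rows) != 10 or any(len(row) != 9 for row in rows):
        raise ValueError("expected 10 rows of 9 characters")
    return rows, side


fen = ("rnbakab1r/111111111/1c1111nc1/p1p1p1p1p/111111111/111111111/"
       "P1P1P1P1P/1C11C1111/111111111/RNBAKABNR, r")
board, side_to_move = parse_fen(fen)
```

Each row is then directly indexable as `board[row][col]`, which is a convenient starting point for building feature channels.
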
Format Conversion
炮二平五 马二进三 车一进一 车一平六 车六进七 车九进一 炮八进五 炮五进四 车九平六 前车进一 车六平四 车四进六 炮八平五
炮8平5 马8进7 车9平8 车8进6 马2进1 炮2进7 马7退8 士6进5 将5平6 士5退4 炮5平6 将6平5

Training Strategy
o Piece Selector and Move Selector are trained separately
o Shuffle the training dataset containing over 1,600,000 moves
o Train the models batch by batch
o Test the accuracy during training
   o On a separate testing dataset of over 80,000 moves that is not used for training

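The shuffle-then-batch step can be sketched in plain Python. The batch size, seed, and placeholder samples are assumptions for illustration; the slides give only the dataset sizes.

```python
import random


def iterate_batches(samples, batch_size, seed=0):
    """Shuffle sample indices once, then yield consecutive batches."""
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)  # fixed seed keeps runs repeatable
    for start in range(0, len(order), batch_size):
        yield [samples[i] for i in order[start:start + batch_size]]


# Placeholder (features, target) pairs standing in for real training moves.
dataset = [(f"features_{i}", f"target_{i}") for i in range(10)]
batches = list(iterate_batches(dataset, batch_size=4))
```

Every sample appears in exactly one batch per pass, and the last batch may be smaller than the others.
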
Results
Piece Selector Accuracy
accuracy = (number of correct predictions) / (total number of test cases)
prediction: the choice with the highest probability

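The accuracy definition above amounts to the following computation. The probability rows and targets are invented for illustration and are not results from the project.

```python
def top1_accuracy(prob_rows, targets):
    """Fraction of cases where the highest-probability choice is the target."""
    correct = 0
    for probs, target in zip(prob_rows, targets):
        # Prediction: index of the choice with the highest probability.
        prediction = max(range(len(probs)), key=probs.__getitem__)
        if prediction == target:
            correct += 1
    return correct / len(targets)


# Hypothetical model outputs for four test cases with three choices each.
prob_rows = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1], [0.2, 0.3, 0.5], [0.4, 0.5, 0.1]]
targets = [1, 0, 1, 1]
accuracy = top1_accuracy(prob_rows, targets)
```

Here three of the four predictions match their targets, so the accuracy is 3 / 4.
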
Results
Move Selector Accuracy

Results
Move Selector Accuracy
Advisor   89.8%
Bishop    91.2%
Cannon    54.1%
King      79.8%
Knight    70.1%
Pawn      90.4%
Rook      53.6%

Results
Selection Strategy 1
Selection Strategy 2

Discussion
o Possible Reasons:
   o CNN not deep enough
   o Training dataset not large enough
   o Records in training dataset may not be the optimal choices
   o For one chessboard status, there may be different move choices in training dataset
   o It's hard to judge which choice is better in the current phase

Conclusion
o Achieved overall high accuracy
o Performed badly in some cases
o Need further improvement
   o Reinforcement Learning
      o Not limited by training dataset
   o Evaluation Network
      o To judge which move is better

Q&A