CSE 473 Midterm Exam Feb 8, 2018


CSE 473 Midterm Exam Feb 8, 2018

Name:

This exam is take home and is due on Wed Feb 14 at 1:30 pm. You can submit it online (see the message board for instructions) or hand it in at the beginning of class. This exam should not take significantly longer than 3 hours to complete if you have already carefully studied all of the course material. Studying while taking the exam may take longer. :)

This exam is open book and open notes, but you must complete all of the work yourself with no help from others. Please feel free to post clarification questions to the class message board, but please do not discuss solutions there.

Partial Credit: If you show your work and *briefly* describe your approach to the longer questions, we will happily give partial credit, where possible. We reserve the right to take off points for overly long answers. Please do not just write everything you can think of for each problem.

Please do not forget to write your name in the space above!

Question 1 True/False 30 points

Circle the correct answer for each True/False question.

1. True / False Reflex agents cannot act optimally (in terms of maximizing total expected reward over time). (3 pts)
2. True / False Minimax is optimal against perfect opponents. (3 pts)
3. True / False Greedy search can take longer to terminate than uniform cost search. (3 pts)
4. True / False Uniform cost search with costs of 1 for all transitions is the same as depth first search. (3 pts)
5. True / False Alpha-beta pruning can introduce errors during minimax search. (3 pts)
6. True / False Each state can only appear once in a state graph. (3 pts)
7. True / False Policy iteration always finds the optimal policy when run to convergence. (3 pts)
8. True / False Higher values for the discount (γ) will, in general, cause value iteration to converge more slowly. (3 pts)
9. True / False For MDPs, adapting the policy to depend on the previous state, in addition to the current state, can lead to higher expected reward. (3 pts)
10. True / False Graph search can sometimes expand more nodes than tree search. (3 pts)

Question 2 Short Answer 30 points

These short answer questions can be answered with a few sentences each.

1. Short Answer Briefly describe the relationship between admissible and consistent heuristics. When would you use each, and why? (5 pts)

2. Short Answer Briefly describe when you would use alpha-beta pruning in minimax search. (5 pts)

3. Short Answer For Q-learning, when would you prefer to use linear function approximation and when would you just use the tabular version? Is there ever any drawback to using the linear version? (5 pts)

4. Short Answer Briefly describe the difference between UCS and A* search. When would you prefer to use each, and why? (5 pts)

5. Short Answer For Q-learning, briefly describe the conditions needed to ensure convergence. Is it guaranteed for any exploration policy? (5 pts)

6. Short Answer Briefly describe the difference between value iteration and policy iteration. Describe conditions under which one algorithm might be preferred to the other, in practice. (5 pts)

Question 3 Ordered Pacman Search 25 points

Consider a new Pacman game where there are two kinds of food pellets, each with a different color (red and blue). Pacman has peculiar eating habits; he strongly prefers to eat all of the red dots before eating any of the blue ones. If Pacman eats a blue pellet while a red one remains, he will incur a cost of 100. Otherwise, as before, there is a cost of 1 for each step and the goal is to eat all the dots. There are K red pellets and K blue pellets, and the dimensions of the board are N by M.

(Example board figure: K = 3, N = 4, M = 4)

1. Give a non-trivial upper bound on the size of the state space required to model this problem. Briefly describe your reasoning. [10 pts]

2. Give a non-trivial upper bound on the branching factor of the state space. Briefly describe your reasoning. [5 pts]

3. Name a search algorithm Pacman could execute to get the optimal path. Briefly justify your choice (describe in one or two sentences). [5 pts]

4. Give an admissible heuristic for this problem. [5 pts]
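For concreteness, the sketch below illustrates (in Python) one possible way a search state and step cost for a problem like this could be encoded. The class, field names, and cost rule are illustrative assumptions drawn from the problem statement above, not a prescribed answer to the question.

# A minimal sketch (illustrative only) of one way a search state for this
# problem could be encoded. Field names and the cost rule are assumptions
# based on the problem statement, not a prescribed answer.
from typing import FrozenSet, Tuple

Pos = Tuple[int, int]

class PacmanState:
    def __init__(self, pos: Pos, red_left: FrozenSet[Pos], blue_left: FrozenSet[Pos]):
        self.pos = pos              # Pacman's (x, y) location on the N-by-M board
        self.red_left = red_left    # positions of uneaten red pellets
        self.blue_left = blue_left  # positions of uneaten blue pellets

    def is_goal(self) -> bool:
        # Goal: every pellet has been eaten.
        return not self.red_left and not self.blue_left

def step_cost(state: PacmanState, next_pos: Pos) -> int:
    # Eating a blue pellet while any red pellet remains costs 100; otherwise 1.
    if next_pos in state.blue_left and state.red_left:
        return 100
    return 1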

Question 4 Game Trees 30 points

Consider the following game tree, which has min (down triangle), max (up triangle), and expectation (circle) nodes:

(Figure: game tree. Chance-node edge probabilities: 0.5, 0.5, 0.5, 0.5. Leaf values, left to right: 2, 2, 1, 2, 0, 2, -1, 0.)

1. In the figure above, label each tree node with its value (a real number). [7 pts]

2. In the figure above, circle the edge associated with the optimal action at each choice point. [7 pts]

3. If we knew the values of the first six leaves (from left), would we need to evaluate the seventh and eighth leaves? Why or why not? [5 pts]

4. Suppose the values of leaf nodes are known to be in the range [-2, 2], inclusive. Assume that we evaluate the nodes from left to right in a depth first manner. Can we now avoid expanding the whole tree? If so, why? Circle all of the nodes that would need to be evaluated (include them all if necessary). [11 pts]
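Since the tree mixes max, min, and expectation nodes, the following minimal Python sketch shows how such a tree is typically evaluated. The Node structure is a hypothetical illustration; the exam's actual tree is given only in the figure.

# A minimal sketch of evaluating a game tree with max, min, and expectation
# (chance) nodes. The Node structure is a hypothetical illustration.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    kind: str                            # "max", "min", "chance", or "leaf"
    value: Optional[float] = None        # set only for leaves
    children: List["Node"] = field(default_factory=list)
    probs: Optional[List[float]] = None  # probabilities for chance-node children

def evaluate(node: Node) -> float:
    if node.kind == "leaf":
        return node.value
    child_values = [evaluate(c) for c in node.children]
    if node.kind == "max":
        return max(child_values)
    if node.kind == "min":
        return min(child_values)
    # Chance node: probability-weighted average of child values.
    return sum(p * v for p, v in zip(node.probs, child_values))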

Question 5 Tree Search 30 points

Given the state graph below, run each of the following algorithms and list the order that the nodes are expanded (a node is considered expanded when it is dequeued from the fringe). The values next to each edge denote the cost of traveling between states. Use alphabetical ordering to break ties (i.e. A should be before B in the fringe, all other things being equal). It is also possible that a state may be expanded more than once. However, you should use cycle checking to ensure you do not go into an infinite loop (e.g. never expand the same state twice in a single plan from the root to a leaf node). Every ordering should always start with the start node and end with the goal node.

(Figure: state graph with edge costs.)

1. Breadth first search [5 pts]

2. Depth first search [5 pts]

3. Iterative deepening [5 pts]

4. Uniform cost search [5 pts]
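The following minimal Python sketch illustrates uniform cost search with the path-based cycle check described above (never revisit a state already on the current path) and alphabetical tie-breaking. The example graph in the usage comment is hypothetical; the exam's actual graph is given only in the figure.

# A minimal sketch of uniform cost search with a path-based cycle check.
# The example graph and its edge costs are hypothetical.
import heapq
from typing import Dict, List, Tuple

def uniform_cost_search(graph: Dict[str, List[Tuple[str, float]]],
                        start: str, goal: str) -> List[str]:
    # Fringe entries: (path cost so far, state for alphabetical tie-breaking, path).
    fringe = [(0.0, start, [start])]
    expansion_order = []
    while fringe:
        cost, state, path = heapq.heappop(fringe)
        expansion_order.append(state)          # a state is "expanded" when dequeued
        if state == goal:
            return expansion_order
        for succ, step_cost in sorted(graph.get(state, [])):
            if succ not in path:               # cycle check along the current path
                heapq.heappush(fringe, (cost + step_cost, succ, path + [succ]))
    return expansion_order

# Hypothetical usage:
# graph = {"A": [("B", 1), ("C", 4)], "B": [("G", 5)], "C": [("G", 1)]}
# print(uniform_cost_search(graph, "A", "G"))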

Now, consider the following two heuristics:

State s     H1(s)   H2(s)
A (start)   10      12
B           8       11
C           7       8
D           4       4
E           3       4
F           2       3
G (goal)    0       0

5. Provide the expansion ordering for A* search with heuristic H2 (again breaking ties alphabetically). [5 pts]

6. List which, if any, of the two heuristics are admissible. [2.5 pts]

7. List which, if any, of the two heuristics are consistent. [2.5 pts]

Question 6 Stutter Step MDP and Bellman Equations 25 points

Consider the following special case of the general MDP formulation we studied in class. Instead of specifying an arbitrary transition distribution T(s, a, s′), the stutter step MDP has a function T(s, a) that returns a next state s′ deterministically. However, when the agent actually acts in the world, it often stutters. It only actually reaches s′ half of the time, and it otherwise stays in s. The reward R(s, a, s′) remains as in the general case.

1. Write down a set of Bellman equations for the stutter step MDP in terms of T(s, a), by defining V*(s), Q*(s, a) and π*(s). Be sure to include the discount γ. [25 pts]
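To make the dynamics concrete, the short Python sketch below simulates the stutter step transition described above: the deterministic successor T(s, a) is reached only half of the time, and otherwise the agent stays in s. The function names T and R here are hypothetical stand-ins for the exam's functions; this is an illustration of the dynamics, not an answer to part 1.

# A minimal sketch (illustrative assumption) of simulating the stutter step
# dynamics: the intended successor T(s, a) is reached with probability 1/2;
# otherwise the agent stays in s. T and R are hypothetical stand-ins.
import random
from typing import Callable, Tuple, TypeVar

S = TypeVar("S")  # state type
A = TypeVar("A")  # action type

def stutter_step(s: S, a: A,
                 T: Callable[[S, A], S],
                 R: Callable[[S, A, S], float]) -> Tuple[S, float]:
    intended = T(s, a)                     # deterministic intended successor
    if random.random() < 0.5:
        s_next = intended                  # the step succeeds
    else:
        s_next = s                         # the agent stutters and stays put
    return s_next, R(s, a, s_next)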

2. Consider the special case of the stutter step MDP where R(s, a, s′) is zero for all states except for a single good terminal state, which has reward 1, and a single bad terminal state, with reward -100. Furthermore, assume all states s are connected to both terminal states (there exists some sequence of actions that will go from s to the terminal state with non-zero probability). If γ = 1, briefly describe what the optimal values V*(s) for all states would look like. [5 pts]

3. Again, set the rewards as in the previous question, but now consider γ = 0.1 and describe V*(s). Would the optimal policy π*(s) change? [5 pts]