ISudoku. Jonathon Makepeace Matthew Harris Jamie Sparrow Julian Hillebrand

Abstract

In this paper, we analyze the Sudoku puzzle and implement several algorithms to solve it. After analyzing each algorithm, we compare their efficiency and determine which implementation is best. We also describe the methods used to test each algorithm and the constraints that lead to our conclusion.

1. Introduction

Sudoku is a puzzle game often found in newspapers and magazines. The object of the game is to place numbers in a grid such that certain conditions are met. Given enough time, many people can solve a Sudoku puzzle on paper without much difficulty. However, some puzzles are hard enough that solving them by hand is impractical. In this paper, we analyze four algorithms for computing solutions to Sudoku puzzles.

2. Informal Statement of Problem

Traditionally, the object of Sudoku is as follows: given a minimum of 17 clue numbers already placed in the cells of a 9×9 grid subdivided into 3×3 regions, place the integers 1-9 in the empty cells such that no row, column, or region contains a duplicate digit. This definition is not strict, however: other symbols may be used in place of integers, the number of unique symbols may vary, and a puzzle does not need to be 9×9. For our implementation, we focused on the traditional 9×9 grid.

3. Formal Statement of Problem

Solving a Sudoku puzzle can be expressed formally as a constraint satisfaction problem with three constraints. The first constraint requires that all numbers 1 through n be placed in each row of n cells; the second and third constraints require the same for the puzzle's columns

and regions, respectively. Given an n×n Sudoku grid, the Sudoku is successfully solved if:

$\forall x \in \mathbb{N}$ and $\forall i \in \mathbb{N}$, with $x \le n$ and $i \le n$: $\exists j \in \mathbb{N}$ (with $j \le n$) such that numberat(i, j) = x. ("For any given number x [between 1 and n] and row number i [between 1 and n], there is a column number j [between 1 and n] such that x can be found in row i, column j.")

$\forall x \in \mathbb{N}$ and $\forall j \in \mathbb{N}$, with $x \le n$ and $j \le n$: $\exists i \in \mathbb{N}$ (with $i \le n$) such that numberat(i, j) = x. ("For any given number x and column number j, there is a row number i...")

$\forall x \in \mathbb{N}$ and $\forall r \in \mathbb{N}$, with $x \le n$ and $r \le n$: $\exists i \in \mathbb{N}$ (with $i \le n$) and $\exists j \in \mathbb{N}$ (with $j \le n$) such that regionof(i, j) = r AND numberat(i, j) = x.

4. Backtracking

A common naive algorithm for solving a Sudoku puzzle is brute-force backtracking. For an n×n puzzle board, backtracking generates all possible configurations of n symbols (most commonly represented by the integers in the range [1, n]) to fill each empty cell. Each of these configurations is tested until a solution to the puzzle is found. The idea of backtracking is to work through a large number of candidates and pick the first answer that is valid. Using recursion, the algorithm generates a depth-first tree of possibilities that grows exponentially with each blank cell to be filled. Our recursive implementation of backtracking in Python is shown in Figure 4.1.

    def backtrackingsolve(self):
        c = None
        for cell in self.cells:  # Iterate over each cell in the puzzle
            if cell.num is None:
                c = cell  # Work with the first blank cell found.
                break
        if c is None:  # If no empty cell is found, the puzzle is solved
            return True
        for candidate in range(1, 10):
            if c.isvalidcandidate(candidate):
                c.num = candidate
                if self.backtrackingsolve():  # Recursively branch down possibilities for cell c
                    return True
                c.num = None  # Undo the placement before trying the next candidate
        return False

Figure 4.1

By default, backtracking is not an efficient algorithm due to its naive implementation. At each empty cell on an n×

n board, backtracking takes at worst O(n) attempts to find a valid symbol for that cell that does not violate any of the three constraints (it iterates through every symbol). In the best case, the first symbol tested in any empty cell (in our case, the integer 1) happens to be a valid candidate, and the cell is filled in O(1) constant time. After a valid candidate is found, it is placed in the cell. The algorithm then branches down the decision tree at that cell and recursively attempts to find a valid symbol for the next empty cell, backtracking to the previous cell if none is found. Therefore, in the worst case, each subsequent empty cell must work through n possibilities for each of the n possibilities of the previous cell (a total of n^2 possibilities for two empty cells). The complexity thus increases exponentially with every empty cell in the puzzle. The worst-case complexity of the backtracking algorithm is then O(n^m), where m is the initial number of empty cells (that is, n^2 minus the number of clue numbers).

Figure 4.2

Figure 4.2 shows the increase in average run times for the backtracking algorithm when it is given 10 puzzles at each of 3 difficulties. The easy, medium, and hard puzzles tested initially contain 40, 33, and 26 clue numbers, respectively.

5. Dancing Links

Dancing Links is not the easiest method to implement for solving a Sudoku puzzle. However, this ingenious technique, developed by Donald Knuth, applies well to this type of problem. Knuth developed it to implement his Algorithm X, which solves Exact Cover problems. An Exact Cover problem asks for a selection that satisfies every given constraint exactly once, as in our Sudoku puzzle. Applying Algorithm X requires a doubly linked list of all

possibilities and an efficient use of backtracking. The doubly linked list is set up as a sparse matrix, with a node for each non-zero entry linked to its neighbors in the same row and column. When running the algorithm, we begin by covering a node and then covering all other nodes in its set, or web of linked nodes. If there are nodes that cannot be covered by the set, it is not the solution, and the algorithm backtracks by uncovering the nodes in the reverse order of how they were covered. If no nodes remain, however, then our set must be a solution.

Figure 4.3

By our analysis, Dancing Links has a complexity of O(n^3), or O(n*n*n), as it is essentially sorting through a grid or list (n) of doubly linked lists (n*n). As puzzle difficulty increased, the overall complexity of the algorithm did not. Of course, the specific set that forms the solution became more difficult to discover, which led to an increase in run time.

6. Crook's Algorithm

One of the simplest and easiest-to-understand algorithms for solving a Sudoku puzzle was developed by J. F. Crook. Crook's algorithm was designed as a pen-and-paper method and can be carried out by anyone given enough time. It is important to look at the definition when solving a Sudoku puzzle, as it provides insight into how to find the answer: "The solution of a Sudoku puzzle requires that every row, column, and box contain all the numbers in the set [1, 2, ..., 9] and that every cell be occupied by one and only one number" (Crook, 2009). This means that each digit can appear only once in every set, whether a row, a column, or a box.

Using Crook's algorithm, we must keep a preemptive set of possible values that could be inside each cell. For each cell, a markup is created to record the possible values; on pen and paper it is used for visual

reference to keep track of the values in the game. After determining the markup, the user picks a cell to start with. The next step requires the user to pick a number out of the markup values. Doing so repeats the previously mentioned step: go through each cell in the row and column and remove the entered value from their markups. Repeat the process until all cells have been filled in a way that satisfies the Sudoku puzzle definition. These steps can be simplified to the following:

1. Mark up the cells, i.e. list the numbers that each cell may contain.
2. Look at each column, row, and 3×3 box and break it down into preemptive sets. Use the Occupancy Theorem [1] whenever possible.
3. Determine whether the puzzle is finished, not possible, or needs another step.
4. Choose an empty cell and mark a number and color. Repeat step 2 until no more preemptive sets remain.
5. Go through the markup until you solve the puzzle or determine that it is not possible.

[1] Occupancy Theorem: Let X be a preemptive set in a Sudoku puzzle markup. Then every number in X that appears in the markup of cells not in X over the range of X cannot be a part of the puzzle solution.

Figure 4.4

Crook's algorithm is basically a trial-and-error algorithm. After initializing the markup, it goes through each possibility until it finds a solution. It can be considered an exhaustive search, but it has additional elements associated with it. We determined that the big O of this algorithm is N^2 for the average case, because the algorithm must sort through two different sets to reach the markup in each cell. The complexity increases at a rate near that of backtracking. There are some slight deviations in the data, where it takes more time to solve easy puzzles compared to the other algorithms. This could be caused by the design of the algorithm, being that it

has to go through the possibilities until it finds the first solution. As mentioned in the backtracking section, the run time increased exponentially as the difficulty increased. The main cause is the increase in blank cells as the difficulty goes up. Other factors also affect the number of possible solutions to a puzzle. Some instances of the puzzle had only a single solution; these puzzles are called diabolical. Each number has to be in a specific spot for the solution to be found. On these puzzles, Crook's algorithm took an exceedingly long time compared to other puzzles (as seen in Figure 4.4), because of the number of sets each iteration has to go through and validate for correctness.

Genetic Algorithm

A genetic algorithm (GA) is a general way to solve optimization problems. The basic algorithm is very simple:

1. Create a population (vector) of random solutions (represented in a problem-specific way, but often a vector of floats or ints).
2. Pick a few solutions and sort them according to fitness.
3. Replace the worst solution with a new solution, which is either a copy of the best solution, a mutation (perturbation) of the best solution, an entirely new randomized solution, or a cross between the two best solutions. These are the most common evolutionary operators, but you could dream up others that use information from existing solutions to create new, potentially good solutions.
4. Check whether you have a new global best fitness; if so, store the solution.
5. If too many iterations go by without improvement, the entire population might be stuck in a local minimum (at the bottom of a local valley, with a possible chasm somewhere else, so to speak). If so, kill everyone and start over at 1.
6. Go to 2.

Fitness is a measure of how good a solution is, lower meaning better. This measure is performed by a fitness function that you supply.
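The six steps above can be sketched as a short loop. This is only an illustration under stated assumptions, not the implementation used for the experiments: the function name `evolve`, its parameters, the iteration cap `max_iters`, and the toy bit-string problem are all hypothetical, and candidates are assumed to be plain Python lists.

```python
import random

def evolve(fitness, new_random, mutate, crossover, pop_size=30,
           tournament=4, max_iters=2000, patience=500, seed=0):
    """Steady-state GA with tournament selection; lower fitness is better."""
    rng = random.Random(seed)
    population = [new_random(rng) for _ in range(pop_size)]      # step 1
    best, best_fit, stale = None, float("inf"), 0
    for _ in range(max_iters):                                   # step 6 (bounded here)
        # Step 2: pick a few solutions and sort them by fitness.
        picks = sorted(rng.sample(range(pop_size), tournament),
                       key=lambda i: fitness(population[i]))
        first, second, worst = picks[0], picks[1], picks[-1]
        # Step 3: replace the worst pick using one of four operators.
        op = rng.choice(("copy", "mutate", "random", "cross"))
        if op == "copy":
            child = list(population[first])
        elif op == "mutate":
            child = mutate(population[first], rng)
        elif op == "random":
            child = new_random(rng)
        else:
            child = crossover(population[first], population[second], rng)
        population[worst] = child
        # Step 4: remember a new global best.
        child_fit = fitness(child)
        if child_fit < best_fit:
            best, best_fit, stale = child, child_fit, 0
        else:
            stale += 1
        # Step 5: restart the population if progress has stalled.
        if stale > patience:
            population = [new_random(rng) for _ in range(pop_size)]
            stale = 0
    return best, best_fit

# Toy usage (not Sudoku): minimise the number of zeros in a 20-bit list.
def count_zeros(bits):
    return bits.count(0)

def random_bits(rng):
    return [rng.randint(0, 1) for _ in range(20)]

def flip_one(bits, rng):
    out = list(bits)
    out[rng.randrange(len(out))] ^= 1
    return out

def one_point(a, b, rng):
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

best, best_fit = evolve(count_zeros, random_bits, flip_one, one_point)
print(best_fit, best)
```

Only the relative order of fitness values is used when sorting the picks, so any fitness function that ranks candidates consistently can be plugged in.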
Writing a fitness function is how you describe the problem to the GA.
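As a rough illustration of what such a fitness function could look like for Sudoku, the sketch below counts the digits missing from each row, column, and box, so a valid solution scores 0 and lower is better. It is not the function used in the experiments described here; the name `sudoku_fitness` and the flat, row-major list of 81 integers are assumptions made for the example.

```python
def sudoku_fitness(grid):
    """Sum of missing digits over all rows, columns, and 3x3 boxes.

    `grid` is a flat, row-major list of 81 ints in 1..9 (an assumed layout).
    A valid solution returns 0.
    """
    def missing(cells):
        # Nine cells holding k distinct digits are missing 9 - k digits.
        return 9 - len(set(cells))

    score = 0
    for k in range(9):
        row = grid[9 * k: 9 * k + 9]
        col = grid[k::9]
        r0, c0 = 3 * (k // 3), 3 * (k % 3)  # top-left corner of box k
        box = [grid[9 * (r0 + dr) + (c0 + dc)]
               for dr in range(3) for dc in range(3)]
        score += missing(row) + missing(col) + missing(box)
    return score

# A known valid grid built from a shifting pattern, used only as test data.
solved = [(3 * (r % 3) + r // 3 + c) % 9 + 1
          for r in range(9) for c in range(9)]
print(sudoku_fitness(solved))  # 0
```

Because each unit is scored separately, a single wrong cell is penalized in its row, its column, and its box, which gives the search a gradient toward fewer conflicts.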

The magnitude of the fitness values returned does not matter (in sane implementations), only how they compare to each other. There are other, subtly different, ways to perform the evolutionary process. Some are good, and some are popular but bad. The one described above is called tournament selection, and it is one of the good ways. Much more can be said about the intricacies of GAs, but it will have to be said elsewhere, lest I digress completely.

Within a few minutes (about 2.6 million iterations when I tried), the correct answer pops out. The nice thing about this method is that you do not have to know anything about how to solve a Sudoku puzzle, or even think very hard at all. Note that I did not even bother to let it search only for the unknown values: it also has to find the digits that we already know (which should not be too hard with a decent fitness function). The only bit of thinking we did was to understand that a Sudoku solution has to be a permutation of the 81-element list made of nine copies of [1, 2, 3, 4, 5, 6, 7, 8, 9], but this merely made the evolution part faster. If we wanted to make it faster still, we could use a genome type that says there are actually nine separate vectors, each guaranteed to be a permutation of 1 to 9.

We could have thought even less and represented the solution by 81 ints, each in the range 1 to 9, by using another genome type:

    >> genome = stdgenomes.EnumGenome(81, range(1,10))

The range argument to EnumGenome does not have to be a vector of integers; it could be a vector of any objects, since they are never treated like numbers. In my experiment this took maybe 15-30 minutes to solve.
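To make the difference between the two representations concrete, the sketch below builds both kinds of candidate using only the Python standard library and compares the sizes of the two search spaces. It does not use the stdgenomes module, and the helper names are hypothetical. A permutation candidate always contains exactly nine copies of each digit, and a swap mutation preserves that property, while an enumerated candidate draws each of the 81 cells independently.

```python
import math
import random

DIGITS = list(range(1, 10))

def random_permutation_candidate(rng):
    """Shuffle nine copies of 1..9, so the digit counts are always correct."""
    cells = DIGITS * 9
    rng.shuffle(cells)
    return cells

def swap_mutation(cells, rng):
    """Swap two positions; the multiset of digits is unchanged."""
    mutated = list(cells)
    i, j = rng.sample(range(len(mutated)), 2)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated

def random_enum_candidate(rng):
    """Draw each of the 81 cells independently from 1..9."""
    return [rng.choice(DIGITS) for _ in range(81)]

# Search-space sizes: 9^81 for independent cells, versus the number of
# distinct arrangements of nine copies of each digit, 81! / (9!)^9.
enum_space = 9 ** 81
perm_space = math.factorial(81) // math.factorial(9) ** 9

rng = random.Random(0)
candidate = swap_mutation(random_permutation_candidate(rng), rng)
print(sorted(candidate) == sorted(DIGITS * 9))  # True
print(perm_space < enum_space)                  # True
```

The permutation space is strictly smaller because every arrangement of nine copies of each digit is also one of the 9^81 independent assignments, but not the other way around.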
For more difficult Sudoku puzzles, I would definitely go with the permutation genome, since using EnumGenome increases the search space to 9^81 possible solutions.

7. Conclusion

Our initial estimates suggested that Dancing Links would be the most efficient

method of the ones listed. But with its complexity found to be larger than that of the other algorithms, we grew unsure. Looking at the data, it actually seems to be the slowest of the algorithms. Backtracking and Crook's algorithm, per the data, perform quite similarly. Whereas those algorithms require a thorough check of all empty cells and possibilities, Dancing Links starts with all possibilities linked together in a large matrix and runs through sets, cutting down those possibilities.

8. References

Crook, J. (2009). A Pencil-and-Paper Algorithm for Solving Sudoku Puzzles. [PDF file] Retrieved from https://www.ams.org/notices/200904/tx090400460p.pdf

Knuth, D. (2000). Dancing Links. [PDF file] Retrieved from https://arxiv.org/pdf/cs/0011047v1.pdf

Zibbu, S. (2018). Sudoku and Backtracking - Hacker Noon. [online] Available at: https://hackernoon.com/sudoku-and-backtracking-6613d33229af [Accessed 7 Dec. 2018].