Solving the hardest Sudoku - part 1Do you know the hardest Sudoku problem? Do you know the best way to solve it? Before answering these questions, let me remind you of what the Sudoku puzzle is about in case you haven't read a newspaper in the last decade (adapted from wikipedia): The objective is to fill a 9×9 grid with digits so that the digits in each column, each row, and each of the nine 3×3 sub-grids that compose the grid (also called ""blocks") are pairwise different. The puzzle setter provides a partially completed grid, which for a well-posed puzzle has a unique solution.
Arto Inkala, a finish mathematician, claimed to have invented the hardest Sudoku last year. His finding made it to Yahoo's front page and numerous newspaper articles like this one. Looking at several publications, including spec
How was the difficulty assessed? I won't pretend I fully understand Inkala's way to asses it, but it seems to be an estimate of how large the search space is for a given Sudoku problem: the higher, the more difficult. And the above problem ranked best on his hardness scale. See this comment for another interpretation. Now, let us look at how to solve this problem. I will use a methodology which is common knowledge to Operations Research practitioners. This methodology is quite simple: apply a generic algorithm to a model of the problem to solve. Let me explain each step in detail using this Sudoku problem. Algorithm
We read more and more often that algorithms are ruling the world, see for instance How Model
For the sake of this article, a model is an abstraction of the problem to be solved which is amenable to computation by a generic algorithm. Some will say that a model is an abstraction amenable to mathematical treatment. I won't argue here, as I find the two definitions to be close enough. People interested in mathematical modeling in general can read Math In our case, we need to express the essence of Sudoku problems in a way amenable to resolution by our generic algorithm. We do it in two steps. First, we define the quantities to be computed. These are called decision variables in our jargon. For the Sudoku, the following defines the 9x9 decision variables, one per cell of the Sudoku board. dvar int sudoku[1..9, 1..9] in 1..9 ; The decision variable sudoku[i,j] will represent the digit in the cell in row i and column j of the board. The above statement is written in OPL. OPL is the mathematical language in which we express the model. It is part of CPLEX. Next, we need to specify the properties of the solution of the Sudoku. We simply translate the definition of the puzzle we gave above. The digits in each column are pairwise different. forall (j in 1..9) allDifferent (all(i in 1..9) sudoku [i,j]); The digits in each row are pairwise different. forall (i in 1..9) allDifferent (all (j in 1..9) sudoku[i,j]); The digits in each of the nine 3×3 sub-grids that compose the grid (also called ""blocks") are pairwise different.
forall ( block_row , block_column in 0..2) The puzzle setter provides a partially completed grid forall (i, j in 1..9 : input [i, j] != 0) sudoku[i, j] == input[i, j]; We simply set the values given as input for the corresponding variables (an input of 0 correspond to an unknown cell). Solving Once we have a model for all Sudoku problem, we simply need the input corresponding to Inkala's instance:
int input[1..9, 1..9] = [[8, We can then run our generic algorithm, which solves the problem in about 0.03 second on my laptop. So much for the hardest Sodoku! Conclusion Let me recap: it took me few minutes to write a generic model that can solve all Sudoku problems, whichever they are. Moreover, this model is really efficient to solve. All in all the total problem takes 20 lines, only 8 lines if we exclude input data:
using CP; Although I could claim that our OPL model is particularly elegant, truth is that any decent constraint programming system can express and solve Sudoku in a similar way. Let me also admit that one could probably develop a specialized algorithm that solves Sudoku problem faster than constraint programming (see for instance this paper by Leandro Coelho and Gilbert Laporte). Fine. The point I tried to make here is that writing a human readable model, then using a generic, black box, algorithm, was the most effective way of solving this problem. It took me much less time to write the model and have it solved than to solve the problem directly. I'm not even sure I could solve it by hand in a reasonable time. I'm also certain that developing a specialized algorithm would have taken much longer.
This way of solving hard problems is by no mean limited to Sudoku. It has led to dramatic successes, with companies saving billions of dollars, yet it is n What? What do you say? I see, you're asking about the solution to the above Sudoku. Well, why not download CPLEX for free and run the above model to find out? I know, I'm teasing you. I'll show how to do it in a subsequent post. Update on Feb 19. Added a reference to a paper by Leandro Coelho et al. Added a link to the first comment. Update on April 10, 2015. If you do not have OPL installed on your machine, you can solve the above model via DOcloud. I explain how in this post. |
I take your point, but it does seem to take the fun out of puzzles like this to solve them by machine (at least once one has created the formulation--that in itself is fun, but just the first time). With a suitable CP or IP formulation, any properly constructed Sudoku is trivial for a machine. Once all the variables corresponding to cells with printed values are fixed and the formulation reduced accordingly, the cardinality of the solution set is precisely one.
I fully agree. I think I could instrument our CP solver to report the number of constraint propagation steps required to solve the problem. Would that be consistent with perceived difficulty of problems? I don't know. But that's an interesting way of looking at it.
One way of measuring the difficulty of a sudoku puzzle is to see if branching is required to solve the puzzle in either the CP and/or MIP formulation. For many puzzles, the CP optimizer can solve the problem just via constraint propagation and domain reduction, without branching, and the MIP optimizer can solve the problem in presolve, without requiring the solution of any LP relaxations. For the given puzzle above, branching is required in both cases in order to solve the problem, which in some sense indicates that a human may need to do some form of "branching" in order to find a solution.
Irv, I agree. It is how I interpreted Inkala's way of assessing hardness in the first place: search space size.