## Solving The GCHQ Christmas Puzzle As A MIP With Python## IntroductionYou may have seen that Britain security and intelligence organisation has proposed a grid-shading puzzle also known as nonogram. Let me cite their puzzle definition.
Let's try to solve this as a mixed integer programming (MIP). I will solve the MIP using the CPLEX cloud service (DOcloud) from IBM, but any decent MIP solver should do the job. DOcloud provides a free one month trial you can use to solve this puzzle. All you need is to register on this page. I will furthermore use Python to represent and solve the problem. I'll be using the Pyth Enough discussion, let's get started. The complete code for solving the puzzle is available in a notebook available on github and nbviewer. ## DataThe data is stored in Python lists. Let me show the start of the data code
## VariablesA MIP is defined by three components: - decision variables
- constraints
- objective function
The latter defines what we try to optimize. Here, we are only looking for a solution, and there is nothing to optimize. We'll therefore ignore the objective. CPLEX will use a constant objective if we don't provide one. Defining decision variables is often the trickiest part when modeling a problem as a MIP. Here is a rule of thumb I am always using. We must have decision variables that represent the main decisions we have to make. We must also have decision variables that represent the solution. Let's start with the latter first. We must represent the grid solution. The simplest way is to create one binary variable for each cell of the grid. A binary variable can only be set to 0 or to 1. We'll say that the binary variable grid[i][j] is set to 1 if the cell i,j in the grid is black, and 0 otherwise. The goal of solving the problem will be to set these decision variables to 0 or 1 so that the resulting grid is a solution of the GCHQ puzzle. Decision variables are easy to set in our Python modeling api. We first initialize our environment, then we create a grid object containing our variables. model = Model('GCHQ', docl Note that I pass a string to the decision variable constructor. This string will become the name of the decision variable. Passing a name string isn't mandatory, but I find it useful for debugging. Names are used when printing the variables. They are also used when dumping the model as a lp file for instance. Here, the variable name for grid[i][j] will be x_i_j. Are these decision variables enough to represent all the decisions we have to make? Not really. We also need to represent where we put the blocs of black cells corresponding to the clues provided for each row and for each column of the puzzle. Let's represent these decisions with binary variables again: - row_vars[i][c][j] is set to:
1 if the bloc corresponding to the c-th clue in the i-th row starts at the j-th column, 0 otherwise - col_vars[j][c][i] is set to:
1 if the bloc corresponding to the c-th clue in the j-th column starts at the i-th row, 0 otherwise
The code for creating row_vars is row_vars = [[ The name for row_vars[i][c][j] is r_i_c_j_length where length is the length of the corresponding bloc. We end up with 8550 binary decision variables. ## ConstraintsLet's look at constraints. These must restrict the value sthat the decision variables can take in order to satisfy the puzzle specification.
First, the grid variables corresponding to the black cells should be set to 1: for (i,j) in grid_clues: mode Then we have a bunch of constraints for each row clues: For each row clue: - There must be a bloc of consecutive black cells in the corresponding row.
- That bloc must start at least clue value cells before the right edge of the grid.
- That bloc must be of length clue value.
For each row: - The blocs must appear in the same order as the clues.
For each grid cell - The cell is set to 1 if, and only if there is a row bloc covering that cell
A corresponding set of constraints must be set for column clues. We are using MIP, which means that we can only state constraints that are linear. Constraints must be of the form expr R rh where exp is a weighted sum of decision variables, R is one of <=, ==, >= and rhs is a constant. Modeling with linear constraints can be tricky, but one gets used to it rapidly. A good textbook on this is Mode Let's see how we can model our problem with linear constraints only. The code for setting the constraints is the following. We start by looping on each row and set useful local variables. for i,row_clue in enum For each clue ... for c,length in enum A bloc must be set ... mode The bloc must start at least length cells before the right edge ... for j in rang The corresponding cell grid must be set to 1 ... for j in rang If a bloc in row i of length length starts at column j, then the next bloc cannot start before column j+length+1. for i,row_clue in enum A grid cell is set to 1 if and only if there is a bloc covering it for i,row_clue in enum We used usual MIP modeling tricks. The first equality (a sum of binary variables equal to 1) means that one of these variables must be set to 1. The constraint of the form a <= b means that a implies b. It is a linear constraint as it can be written as a - b <= 0. The last constraint is linear as it is equivalent to mode
We must set similar constraints for the column clues to be complete. We end up with 116,631 constraints. ## SolutionOnce all constraints are set we solve the problem by calling our solver, here CPLEX on DOcloud. As said before, any decent MIP solver should do the job. In our case we first print some statistics about the problem, solve it, then print a summary of the result. mode It yields model: GCHQ
We can now display the result. We first store the decision variables values in a 2D Numpy array. import numpy as np
We then display this array as an image using matplotlib imshow() function. Note that we set a low dpi value, so that each cell in our grid covers more than one pixel in the image. We also need to set the interpolation method to be sure that our image is displayed as we intend to. from matplotlib import pyplot as plt %matplotlib inlinedpi = 4 width = nrows/dpi height = ncols/dpi fig, ax = pl
It yields a QRcode image: This is the solution of the nonogram, but it is not the end of the story. As GCHQ puts it:
## MIP Or SAT?I used MIP but other techniques could be used to solve this nonogram. Constraint programming is probably a good bet as well, but unfortunately, our Python modeling api does not support it yet. Matthew Earl has used a sat Does it mean that MIP is better than SAT? It depends on how you define 'better'. Regarding performance, both approaches solve the problem quickly. If you look at ease of modeling, then I would prefer MIP, because we can state constraints more compactly. But SAT experts will probably find it easier to state it as SAT than MIP. If you define it by size of the code, then my code is shorter than Matt Earl's code. And I find mine easier to understand, but I may be biased here... I think it is fair to say that both approaches are good enough for this nonogram. ## A Better ModelLet me conclude with a brief discussion of the MIP model. We could make it more efficient by not creating variables that correspond to invalid bloc starting points. In the above model, we create them, then set them to 0. I did it because it makes the code simpler to write, and less error prone. We can further improve the model as follows. The c-th row bloc in row i must be after the blocs that correspond to the first c-1 clues for row i. If we pack these blocs to the left, they use c-1 white space plus sum of the c-1 first clues black space. Our c-th bloc must start after that. A similar reasoning provides a limit on how fa to the right the bloc can be. By setting the corresponding variables to zero we can make the problem much easier to solve. I'll let readers ponder this.
Alex Fleischer has also built a CP model in OPL that can be accessed here. |