2.6. Effective Modelling Practices in MiniZinc

There are almost always multiple ways to model the same problem, some of which generate models that are efficient to solve and some of which do not. In general it is very hard to tell a priori which model is the most efficient for solving a particular problem; indeed, this may critically depend on the underlying solver and search strategy used. In this section we concentrate on modelling practices that avoid inefficiency both in generating models and in the generated models themselves. For solver-specific recommendations see User Manual/Solver Backends.

2.6.1. Variable Bounds

Finite domain propagation engines, which are the principal type of solver targeted by MiniZinc, are more effective the tighter the bounds on the variables involved. They can also behave badly on problems that have sub-expressions taking large integer values, since they may implicitly limit the size of integer variables.

Listing 2.6.1 A model with unbounded variables (grocery.mzn).
var int: item1; 
var int: item2;
var int: item3;
var int: item4;

constraint item1 + item2 + item3 + item4 == 711;
constraint item1 * item2 * item3 * item4 == 711 * 100 * 100 * 100;
  
constraint         0 < item1 /\ item1 <= item2
           /\ item2 <= item3 /\ item3 <= item4;
  
solve satisfy;

output ["{", show(item1), ",", show(item2), ",", show(item3), ",",
        show(item4),"}\n"];

The grocery problem shown in Listing 2.6.1 finds 4 items whose prices in dollars add up to 7.11 and multiply up to 7.11. Prices are modelled in cents, hence the constants 711 and 711 * 100 * 100 * 100 in the constraints. The variables are declared unbounded. Running

$ minizinc --solver chuffed grocery.mzn

yields

=====UNSATISFIABLE=====

This is because the intermediate expressions in the multiplication are also var int and are given default bounds in the solver \(-1,000,000 \dots 1,000,000\), and these ranges are too small to hold the values that the intermediate expressions may need to take.

Modifying the model so that the items are declared with tight bounds

var 1..711: item1;
var 1..711: item2;
var 1..711: item3;
var 1..711: item4;

results in a better model, since now MiniZinc can infer bounds on the intermediate expressions and use these rather than the default bounds. With this modification, executing the model gives

{120,125,150,316}
----------

Note however that even the improved model may be too difficult for some solvers. Running

$ minizinc --solver g12lazy grocery.mzn

does not return an answer, since the solver builds a huge representation for the intermediate product variables.

Bounding variables

Always try to use bounded variables in models. When using let declarations to introduce new variables, always try to define them with correct and tight bounds. This will make your model more efficient, and avoid the possibility of unexpected overflows. One exception is when you introduce a new variable which is immediately defined as equal to an expression. Usually MiniZinc will be able to infer effective bounds from the expression.
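As an illustration, the following sketch shows a let-introduced variable given correct, tight bounds; the predicate and its parameters are hypothetical and only serve to illustrate the guideline.

% Hypothetical predicate: the local variable rest is not defined by an
% equation, so we give it explicit tight bounds rather than declaring
% it as an unbounded var int.
predicate split_evenly(var 0..711: total, var 0..711: part) =
  let { var 0..711: rest } in
    part + rest = total /\ rest >= part;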

2.6.2. Effective Generators

Imagine we want to count the number of triangles (\(K_3\) subgraphs) appearing in a graph. Suppose the graph is defined by an adjacency matrix: adj[i,j] is true if nodes i and j are adjacent. We might write

int: count = sum ([ 1 | i,j,k in NODES where i < j  /\ j < k
                       /\ adj[i,j] /\ adj[i,k] /\ adj[j,k]]);

which is certainly correct, but it examines all triples of nodes. If the graph is sparse we can do better by realising that some tests can be applied as soon as we select i and j.

int: count = sum ([ 1 | i,j in NODES where i < j  /\ adj[i,j],
                        k in NODES where j < k /\ adj[i,k] /\ adj[j,k]]);

You can use the built-in trace function to help determine what is happening inside generators.

Tracing

The function trace(s,e) prints the string s before evaluating the expression e and returning its value. It can be used in any context.

For example, we can see how many times the test is performed in the inner loop for both versions of the calculation.

int: count = sum([ 1 | i,j,k in NODES where
          trace("+", i<j /\ j<k /\ adj[i,j] /\ adj[i,k] /\ adj[j,k]) ]);
adj = [| false, true,  true,  false
       | true,  false, true,  false
       | true,  true,  false, true
       | false, false, true,  false |];
constraint trace("\n",true);
solve satisfy;

Produces the output:

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
----------

indicating the inner loop is evaluated 64 times while

  int: count = sum ([ 1 | i,j in NODES where i < j  /\ adj[i,j],
                          k in NODES where trace("+", j < k /\ adj[i,k] /\ adj[j,k])]);

Produces the output:

++++++++++++++++
----------

indicating the inner loop is evaluated 16 times.

Note that the string passed to trace can depend on the surrounding generator variables, which lets you see exactly what is happening during model creation.

int: count = sum( i,j in NODES where i < j /\ adj[i,j])(
       sum([trace("("++show(i)++","++show(j)++","++show(k)++")",1) | 
             k in NODES where  j < k /\ adj[i,k] /\ adj[j,k]]));

will print out each of the triangles found during the calculation. It produces the output

(1,2,3)
----------

We have to admit that we cheated a bit here: In certain circumstances, the MiniZinc compiler is in fact able to re-order the arguments in a where clause automatically, so that they are evaluated as early as possible. In this case, adding the trace function in fact prevented this optimisation. In general, it is however a good idea to help the compiler get it right, by splitting the where clauses and placing them as close to the generators as possible.

2.6.3. Redundant Constraints

The form of a model will affect how well the constraint solver can solve it. In many cases adding constraints which are redundant, i.e. are logically implied by the existing model, may improve the search for solutions by making more information available to the solver earlier.

Consider the magic series problem from Complex Constraints. Running this for n = 16 as follows:

$ minizinc --solver gecode --all-solutions --statistics magic-series.mzn -D "n=16;"

might result in output

s = [12, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0];
----------
==========

with the statistics showing that 89 failures were required.

We can add redundant constraints to the model. Since each s[i] counts the number of occurrences of the value i in the sequence, the counts must sum up to n, the length of the sequence. Similarly, the sum of s[i] * i must also equal n: this sum totals the values of all entries in the sequence, and since the entries are the s[i] themselves, it equals the sum of the s[i], which is n. Adding these constraints gives the model in Listing 2.6.2.

Listing 2.6.2 Model solving the magic series problem with redundant constraints (magic-series2.mzn).
int: n;
array[0..n-1] of var 0..n: s;

constraint forall(i in 0..n-1) (
   s[i] = (sum(j in 0..n-1)(bool2int(s[j]=i))));
% redundant
constraint sum(i in 0..n-1)(s[i]) = n;
constraint sum(i in 0..n-1)(s[i] * i) = n;
solve satisfy;   

output [ "s = ", show(s), ";\n" ] ;

Running the same problem as before

$ minizinc --solver gecode --all-solutions --statistics magic-series2.mzn -D "n=16;"

results in the same output, but with statistics showing just 14 failures explored. The redundant constraints have allowed the solver to prune the search much earlier.

2.6.4. Modelling Choices

There are many ways to model the same problem in MiniZinc, although some may be more natural than others. Different models may differ greatly in how efficiently they solve, and, worse yet, a model that works well for one solving backend may work poorly for another. There are however some guidelines that usually produce better models:

Choosing between models

The better model is likely to have some of the following features

  • a smaller number of variables, or at least a smaller number of variables that are not functionally defined by other variables
  • smaller domain sizes of variables
  • a more succinct, or direct, definition of the constraints of the model
  • use of global constraints as much as possible

In reality all this has to be tempered by how effective the search is for the model. Usually the effectiveness of search is hard to judge except by experimentation.

Consider the problem of finding permutations of \(n\) numbers from 1 to \(n\) such that the differences between adjacent numbers also form a permutation of the numbers 1 to \(n-1\). The obvious way to model this problem is shown in Listing 2.6.3. Note that the u variables are functionally defined by the x variables, so the raw search space is \(n^n\).

Listing 2.6.3 A natural model for the all interval series problem prob007 in CSPlib (allinterval.mzn).
include "alldifferent.mzn";

int: n;

array[1..n] of var 1..n: x;      % sequence of numbers
array[1..n-1] of var 1..n-1: u;  % sequence of differences

constraint alldifferent(x);
constraint alldifferent(u);
constraint forall(i in 1..n-1)(u[i] = abs(x[i+1] - x[i]));

solve :: int_search(x, first_fail, indomain_min)
      satisfy;
output ["x = \(x);\n"];

In this model the array x represents the permutation of the n numbers and the constraints are naturally represented using alldifferent.

Running the model

$ minizinc --solver gecode --all-solutions --statistics allinterval.mzn -D "n=10;"

finds all solutions in 16077 nodes and 71 ms.

An alternate model uses an array y where y[i] gives the position of the number i in the sequence. We also model the positions of the differences using variables v: v[i] is the position in the sequence where the absolute difference i occurs. If the values of y[i] and y[j] differ by one, where j > i, meaning the positions are adjacent, then v[j-i] is constrained to be the earliest of these positions. We can add two redundant constraints to this model: since we know that a difference of n-1 must result, we know that the positions of 1 and n must be adjacent (abs(y[1] - y[n]) = 1), which also tells us that the position of difference n-1 is the earlier of y[1] and y[n], i.e. v[n-1] = min(y[1], y[n]). With this we can model the problem as shown in Listing 2.6.4. The output statement recreates the original sequence x from the array of positions y.

Listing 2.6.4 An inverse model for the all interval series problem prob007 in CSPlib (allinterval2.mzn).
include "alldifferent.mzn";

int: n;

array[1..n] of var 1..n: y;  % position of each number
array[1..n-1] of var 1..n-1: v; % position of difference i

constraint alldifferent(y);
constraint alldifferent(v);
constraint forall(i,j in 1..n where i < j)(
             (y[i] - y[j] = 1 -> v[j-i] = y[j]) /\
             (y[j] - y[i] = 1 -> v[j-i] = y[i])
           );

constraint abs(y[1] - y[n]) = 1 /\ v[n-1] = min(y[1], y[n]);

solve :: int_search(y, first_fail, indomain_min)
      satisfy;

array[1..n] of 1..n: x :: output_only
  = [ sum(i in 1..n)(i*(fix(y[i]) = j)) | j in 1..n ];
output [ "x = \(x);\n" ];

The inverse model has the same size as the original model in terms of the number of variables and their domain sizes. But it models the relationship between the y and v variables much more indirectly than the original model expresses the relationship between the x and u variables. Hence we might expect the original model to be better.

The command

$ minizinc --solver gecode --all-solutions --statistics allinterval2.mzn -D "n=10;"

finds all the solutions in 98343 nodes and 640 ms. So the more direct modelling of the constraints is clearly paying off.

Note that the inverse model prints out the answers using the same x view of the solution. This is managed using the output_only annotation. The array x is defined as a fixed array and annotated as output_only. This means it is only evaluated when output is produced, and can only be used in output statements. Once a solution for y is discovered, the value of x is calculated during output processing, and hence can be displayed in the output.

Output_only annotation

The output_only annotation can be applied to variable definitions. The variable defined must not be a var type; it can only be par. The variable must also have a right hand side definition giving its value. This right hand side definition can make use of the fix function to access the values of decision variables, since it is evaluated at solution processing time.
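A minimal self-contained sketch of this pattern, using hypothetical variable names, is:

var 1..10: x;
constraint x mod 3 = 0;

% doubled is par, has a right hand side, and uses fix to read the
% decision variable x; it is only evaluated during output processing.
int: doubled :: output_only = 2 * fix(x);

solve satisfy;
output ["x = \(x), doubled = \(doubled)\n"];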

2.6.5. Multiple Modelling and Channels

When we have two models for the same problem it may be useful to use both models together by tying the variables in the two models together, since each can give different information to the solver.

Listing 2.6.5 A dual model for the all interval series problem prob007 in CSPlib (allinterval3.mzn).
include "inverse.mzn";

int: n;

array[1..n] of var 1..n: x;  % sequence of numbers
array[1..n-1] of var 1..n-1: u;  % sequence of differences

constraint forall(i in 1..n-1)(u[i] = abs(x[i+1] - x[i])); 

array[1..n] of var 1..n: y;  % position of each number
array[1..n-1] of var 1..n-1: v; % position of difference i

constraint inverse(x,y);
constraint inverse(u,v);

constraint abs(y[1] - y[n]) = 1 /\ v[n-1] = min(y[1], y[n]);

solve :: int_search(y, first_fail, indomain_min)
      satisfy;

output ["x = ",show(x),"\n"];

Listing 2.6.5 gives a dual model combining features of allinterval.mzn and allinterval2.mzn. The beginning of the model is taken from allinterval.mzn. We then introduce the y and v variables from allinterval2.mzn. We tie the variables together using the global inverse constraint: inverse(x,y) holds if y is the inverse function of x (and vice versa), that is x[i] = j <-> y[j] = i. A definition is shown in Listing 2.6.6. The model does not include the constraints relating the y and v variables: they are redundant (and indeed propagation redundant), so they do not add information for a propagation solver. The alldifferent constraints are also omitted, since they are made redundant (and propagation redundant) by the inverse constraints. The only constraints are the relationships of the x and u variables and the redundant constraints on y and v.

Listing 2.6.6 A definition of the inverse global constraint (inverse.mzn).
predicate inverse(array[int] of var int: f,
                  array[int] of var int: invf) =
    forall(j in index_set(invf))(invf[j] in index_set(f)) /\
    forall(i in index_set(f))(
        f[i] in index_set(invf) /\
        forall(j in index_set(invf))(j == f[i] <-> i == invf[j])
    );

One of the benefits of the dual model is that there is more scope for defining different search strategies. Running the dual model,

$ minizinc --solver g12fd --all-solutions --statistics allinterval3.mzn -D "n=10;"

which uses the search strategy of the inverse model, labelling the y variables, finds all solutions in 1714 choice points and 0.5s. Note that running the same model with labelling on the x variables requires 13142 choice points and 1.5s.
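The alternative labelling mentioned above is obtained by simply swapping the array in the search annotation of Listing 2.6.5:

% label the original x variables instead of the inverse y variables
solve :: int_search(x, first_fail, indomain_min)
      satisfy;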

2.6.6. Symmetry

Symmetry is very common in constraint satisfaction and optimisation problems. To illustrate this, let us look again at the n-queens problem from Listing 2.5.1. The top left chess board in Fig. 2.6.1 shows a solution to the 8-queens problem (labelled “original”). The remaining chess boards show seven symmetric variants of the same solution: the board rotated by 90, 180 and 270 degrees, and vertical flips of the original and of each rotation.

images/queens_symm.svg

Fig. 2.6.1 Symmetric variants of an 8-queens solution

If we wanted to enumerate all solutions to the 8-queens problem, we could obviously save the solver some work by only enumerating non-symmetric solutions, and then generating the symmetric variants ourselves. This is one reason why we want to get rid of symmetry in constraint models. The other, much more important reason, is that the solver may also explore symmetric variants of non-solution states!

For example, a typical constraint solver may try to place the queen in column 1 into row 1 (which is fine), and then try to put the column 2 queen into row 3, which, at first sight, does not violate any of the constraints. However, this configuration cannot be completed to a full solution (which the solver finds out after a little search). Fig. 2.6.2 shows this configuration on the top left chess board. Now nothing prevents the solver from trying, e.g., the second configuration from the left in the bottom row of Fig. 2.6.2, where the queen in column 1 is still in row 1, and the queen in column 3 is placed in row 2. Therefore, even when only searching for a single solution, the solver may explore many symmetric states that it has already seen and proven unsatisfiable before!

images/queens_symm_unsat.svg

Fig. 2.6.2 Symmetric variants of an 8-queens unsatisfiable partial assignment

2.6.6.1. Static Symmetry Breaking

The modelling technique for dealing with symmetry is called symmetry breaking, and in its simplest form, involves adding constraints to the model that rule out all symmetric variants of a (partial) assignment to the variables except one. These constraints are called static symmetry breaking constraints.

The basic idea behind symmetry breaking is to impose an order. For example, to rule out any vertical flips of the chess board, we could simply add the constraint that the queen in the first column must be in the top half of the board:

constraint q[1] <= n div 2;

Convince yourself that this would remove exactly half of the symmetric variants in Fig. 2.6.1. In order to remove all symmetry, we need to work a bit harder.

Whenever we can express all symmetries as permutations of the array of variables, a set of lexicographic ordering constraints can be used to break all symmetry. For example, if the array of variables is called x, and reversing the array is a symmetry of the problem, then the following constraint will break that symmetry:

constraint lex_lesseq(x, reverse(x));

How about two-dimensional arrays? Lexicographic ordering works just the same, we only have to coerce the arrays into one dimension. For example, the following breaks the symmetry of flipping the array along one of the diagonals (note the swapped indices in the second comprehension):

array[1..n,1..n] of var int: x;
constraint lex_lesseq([ x[i,j] | i,j in 1..n ], [ x[j,i] | i,j in 1..n ]);

The great thing about using lexicographic ordering constraints is that we can add multiple ones (to break several symmetries simultaneously), without them interfering with each other, as long as we keep the order in the first argument the same.
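For example, if both reversal and (say) rotation by one position were symmetries of some model over an array x, both could be broken simultaneously while keeping x itself as the first argument of each constraint; the declarations below are hypothetical and only illustrate the pattern:

include "lex_lesseq.mzn";

int: n = 8;                          % hypothetical size
array[1..n] of var 1..n: x;          % hypothetical decision variables

% Both lex constraints keep x as their first argument, so they can be
% combined without interfering with each other.
constraint lex_lesseq(x, reverse(x));                        % reversal
constraint lex_lesseq(x, [ x[1 + (i mod n)] | i in 1..n ]);  % rotation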

For the n-queens problem, unfortunately this technique does not immediately apply, because some of its symmetries cannot be described as permutations of the q array. The trick to overcome this is to express the n-queens problem in terms of Boolean variables that model, for each field of the board, whether it contains a queen or not. Now all the symmetries can be modeled as permutations of this array. Since the main constraints of the n-queens problem are much easier to express with the integer q array, we simply use both models together and add channeling constraints between them, as explained in Multiple Modelling and Channels.

The full model, with added Boolean variables, channeling constraints and symmetry breaking constraints is shown in Listing 2.6.7. We can conduct a little experiment to check whether it successfully breaks all the symmetry. Try running the model with increasing values for n, e.g. from 1 to 10, counting the number of solutions (e.g., by using the -s flag with the Gecode solver, or selecting “Print all solutions” as well as “Statistics for solving” in the IDE). You should get the following sequence of numbers of solutions: 1, 0, 0, 1, 2, 1, 6, 12, 46, 92. To verify the sequence, you can search for it in the On-Line Encyclopedia of Integer Sequences (http://oeis.org).

Listing 2.6.7 Partial model for n-queens with symmetry breaking (full model: nqueens_sym.mzn).
% Map each position i,j to a Boolean telling us whether there is a queen at i,j
array[1..n,1..n] of var bool: qb;

% Channeling constraint
constraint forall (i,j in 1..n) ( qb[i,j] <-> (q[i]=j) );

% Lexicographic symmetry breaking constraints
constraint
    lex_lesseq(array1d(qb), [ qb[j,i] | i,j in 1..n ])
/\  lex_lesseq(array1d(qb), [ qb[i,j] | i in reverse(1..n), j in 1..n ])
/\  lex_lesseq(array1d(qb), [ qb[j,i] | i in 1..n, j in reverse(1..n) ])
/\  lex_lesseq(array1d(qb), [ qb[i,j] | i in 1..n, j in reverse(1..n) ])
/\  lex_lesseq(array1d(qb), [ qb[j,i] | i in reverse(1..n), j in 1..n ])
/\  lex_lesseq(array1d(qb), [ qb[i,j] | i,j in reverse(1..n) ])
/\  lex_lesseq(array1d(qb), [ qb[j,i] | i,j in reverse(1..n) ])
;

2.6.6.2. Other Examples of Symmetry

Many other problems have inherent symmetries, and breaking these can often make a significant difference in solving performance. Here is a list of some common cases:

  • Bin packing: when trying to pack items into bins, any two bins that have the same capacity are symmetric.
  • Graph colouring: When trying to assign colours to nodes in a graph such that adjacent nodes must have different colours, we typically model colours as integer numbers. However, any permutation of colours is again a valid graph colouring (a symmetry-breaking sketch for this case is given after this list).
  • Vehicle routing: if the task is to assign customers to certain vehicles, any two vehicles with the same capacity may be symmetric (this is similar to the bin packing example).
  • Rostering/time tabling: two staff members with the same skill set may be interchangeable, just like two rooms with the same capacity or technical equipment.
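
To sketch how the colour symmetry in the graph colouring example above might be broken, the fragment below uses the value_precede_chain global constraint from the MiniZinc library, which only allows colour c+1 to be used once colour c has appeared earlier in the array. The instance data is hypothetical.

include "value_precede_chain.mzn";

int: n_colours = 3;
set of int: NODE = 1..5;                            % hypothetical nodes
array[1..4, 1..2] of NODE: edge = [| 1,2 | 2,3 | 3,4 | 4,5 |];

array[NODE] of var 1..n_colours: colour;

% adjacent nodes take different colours
constraint forall(e in index_set_1of2(edge))
                 (colour[edge[e,1]] != colour[edge[e,2]]);

% symmetry breaking: colours are interchangeable, so require that
% colour c appears before colour c+1 is first used
constraint value_precede_chain([ c | c in 1..n_colours ], colour);

solve satisfy;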