5 Linear Arithmetic

5.1 Introduction

This chapter introduces decision procedures for conjunctions of linear constraints. An extension of these decision procedures for solving a general linear arithmetic formula, i.e., with an arbitrary Boolean structure, is given in Chap. 11.

Definition 5.1 (linear arithmetic). The syntax of a formula in linear arithmetic is defined by the following rules:

formula : formula ∧ formula | (formula) | atom
atom    : sum op sum
op      : = | ≤ | <
sum     : term | sum + term
term    : identifier | constant | constant identifier

The binary minus operator a − b can be read as "syntactic sugar" for a + −1b. The operators ≥ and > can be replaced by ≤ and < if the coefficients are negated. We consider the rational numbers and the integers as domains. For the former domain the problem is polynomial, and for the latter the problem is NP-complete. As an example, the following is a formula in linear arithmetic:

3x1 + 2x2 ≤ 5x3 ∧ 2x1 − 2x2 = 0 . (5.1)

Note that equality logic, as discussed in Chap. 4, is a fragment of linear arithmetic.

Many problems arising in the code optimization performed by compilers are expressible with linear arithmetic over the integers. As an example, consider the following C code fragment:

for(i=1; i<=10; i++)
    a[j+i]=a[j];

This fragment is intended to replicate the value of a[j] into the locations a[j+1] to a[j+10]. In a DLX-like assembly language,1 a compiler might generate the code for the body of the loop as follows. Suppose variable i is stored in register R1, and variable j is stored in register R2:

R4 ← mem[a+R2]     /* set R4 to a[j]      */
R5 ← R2+R1         /* set R5 to j+i       */
mem[a+R5] ← R4     /* set a[j+i] to a[j]  */
R1 ← R1+1          /* i++                 */

Code that requires memory access is typically very slow compared with code that operates only on the internal registers of the CPU. Thus, it is highly desirable to avoid load and store instructions.
A potential optimization for the code above is to move the load instruction for a[j], i.e., the first statement above, out of the loop body. After this transformation, the load instruction is executed only once at the beginning of the loop, instead of 10 times. However, the correctness of this transformation relies on the fact that the value of a[j] does not change within the loop body. We can check this condition by comparing the index of a[j+i] with the index of a[j], together with the constraint that i is between 1 and 10:

i ≥ 1 ∧ i ≤ 10 ∧ j + i = j . (5.2)

This formula has no satisfying assignment, and thus the memory accesses cannot overlap. The compiler can safely perform the read access to a[j] only once.

5.1.1 Solvers for Linear Arithmetic

The simplex method is one of the oldest algorithms for numerical optimization. It is used to find an optimal value for an objective function given a conjunction of linear constraints over real variables. The objective function and the constraints together are called a linear program (LP). However, since we are interested in the decision problem rather than the optimization problem, we cover in this chapter a variant of the simplex method called general simplex that takes as input a conjunction of linear constraints over the reals without an objective function, and decides whether this set is satisfiable.

Integer linear programming, or ILP, is the same problem for constraints over the integers. Section 5.3 covers Branch and Bound, an algorithm for deciding such problems.

1 The DLX architecture is a RISC-like computer architecture, which is similar to the MIPS architecture [149].

These two algorithms can solve conjunctions of a large number of constraints efficiently. We shall also describe two other methods that are considered less efficient, but can still be competitive for solving small problems.
We describe them because they are still used in practice, they are relatively easy to implement in their basic form, and they will be mentioned again later in Chap. 11, owing to the fact that they are based on variable elimination. The first of these methods is called Fourier–Motzkin variable elimination, and decides the satisfiability of a conjunction of linear constraints over the reals. The second method is called the Omega test, and decides the satisfiability of a conjunction of linear constraints over the integers.

5.2 The Simplex Algorithm

5.2.1 Decision Problems and Linear Programs

The simplex algorithm, originally developed by Dantzig in 1947, decides the satisfiability of a conjunction of weak linear inequalities. The set of constraints is normally accompanied by a linear objective function in terms of the variables of the formula. If the set of constraints is satisfiable, the simplex algorithm provides a satisfying assignment that maximizes the value of the objective function. Simplex is worst-case exponential. Although there are polynomial-time algorithms for solving this problem (the first known polynomial-time algorithm, introduced by Khachiyan in 1979, is called the ellipsoid method), simplex is still considered a very efficient method in practice and the most widely used, apparently because the need for an exponential number of steps is rare in real problems.

As we are concerned with the decision problem rather than the optimization problem, we are going to cover a variant of the simplex algorithm called general simplex that does not require an objective function. The general simplex algorithm accepts two types of constraints as input:

1. Equalities of the form

a1x1 + . . . + anxn = 0 . (5.3)

2. Lower and upper bounds on the variables:2

li ≤ xi ≤ ui , (5.4)

where li and ui are constants representing the lower and upper bounds on xi, respectively. The bounds are optional, as the algorithm supports unbounded variables.
2 This is in contrast to the classical simplex algorithm, in which all variables are constrained to be nonnegative.

This representation of the input formula is called the general form. This statement of the problem does not restrict the modeling power of weak linear constraints, as we can transform an arbitrary weak linear constraint L ⋈ R with ⋈ ∈ {=, ≤, ≥} into the form above as follows. Let m be the number of constraints. For the i-th constraint, 1 ≤ i ≤ m:

1. Move all addends in R to the left-hand side to obtain L′ ⋈ b, where b is a constant.
2. Introduce a new variable si. Add the constraints

L′ − si = 0 and si ⋈ b . (5.5)

If ⋈ is the equality operator, rewrite si = b to si ≥ b and si ≤ b.

The original and the transformed conjunctions of constraints are obviously equisatisfiable.

Example 5.2. Consider the following conjunction of constraints:

x + y ≥ 2 ∧ 2x − y ≥ 0 ∧ −x + 2y ≥ 1 . (5.6)

The problem is rewritten into the general form as follows:

x + y − s1 = 0 ∧ 2x − y − s2 = 0 ∧ −x + 2y − s3 = 0 ∧ s1 ≥ 2 ∧ s2 ≥ 0 ∧ s3 ≥ 1 . (5.7)

The new variables s1, . . . , sm are called the additional variables. The variables x1, . . . , xn in the original constraints are called problem variables. Thus, we have n problem variables and m additional variables. As an optimization of the procedure above, an additional variable is only introduced if L′ is not already a problem variable or has been assigned an additional variable previously.

5.2.2 Basics of the Simplex Algorithm

It is common and convenient to view linear constraint satisfaction problems as geometrical problems. In geometrical terms, each variable corresponds to a dimension, and each constraint defines a convex subspace: in particular, inequalities define half-spaces and equalities define hyperplanes.3

3 A hyperplane in a d-dimensional space is a subspace with d − 1 dimensions.
For example, in two dimensions, a hyperplane is a straight line, and in one dimension it is a point.

The (closed) subspace of satisfying assignments is defined by an intersection of half-spaces and hyperplanes, and forms a convex polytope. This is implied by the fact that an intersection between convex subspaces is convex as well. A geometrical representation of the original problem in Example 5.2 appears in Fig. 5.1.

Fig. 5.1. A graphical representation of the problem in Example 5.2, projected on x and y. The shaded region corresponds to the set of satisfying assignments. The marked points (A), (B), and (C) illustrate the progress that the simplex algorithm makes, as will be explained in the rest of this section.

It is common to represent the coefficients in the input problem using an m-by-(n + m) matrix A. The variables x1, . . . , xn, s1, . . . , sm are written as a vector x. Following this notation, our problem is equivalent to the existence of a vector x such that

Ax = 0 and ⋀_{i=1}^{m} li ≤ si ≤ ui , (5.8)

where li ∈ {−∞} ∪ Q is the lower bound of xi and ui ∈ {+∞} ∪ Q is the upper bound of xi. The infinity values are for the case in which a bound is not set.

Example 5.3. We continue Example 5.2. Using the variable ordering x, y, s1, s2, s3, a matrix representation for the equality constraints in (5.7) is

⎛  1   1  −1   0   0 ⎞
⎜  2  −1   0  −1   0 ⎟ .   (5.9)
⎝ −1   2   0   0  −1 ⎠

Note that a large portion of the matrix in Example 5.3 is very regular: the columns that are added for the additional variables s1, . . . , sm correspond to an m-by-m diagonal matrix, where the diagonal coefficients are −1. This is a direct consequence of using the general form. While the matrix A changes as the algorithm progresses, the number of columns of this kind is never reduced. The set of m variables corresponding to these columns are called the basic variables and denoted by B.
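As an illustration, the rewriting into the general form described above can be sketched in a few lines of Python. This is an illustrative encoding, not code from this chapter: constraints are given as coefficient maps, and the optimization of reusing additional variables is omitted. Note that the rows produced are exactly the rows of the tableau discussed next, each labeled by its additional variable si.

```python
from fractions import Fraction

def to_general_form(constraints):
    """Rewrite constraints  L <op> b  (op in '=', '<=', '>=') into the
    general form: rows  L - s_i = 0  plus bounds on the s_i.
    Each input constraint is (coeffs, op, b), with coeffs mapping a
    problem variable to its coefficient."""
    rows, lower, upper = [], {}, {}
    for i, (coeffs, op, b) in enumerate(constraints):
        s = f"s{i + 1}"                 # fresh additional variable
        row = dict(coeffs)
        row[s] = Fraction(-1)           # encodes L - s_i = 0
        rows.append(row)
        b = Fraction(b)
        if op in ("=", ">="):
            lower[s] = b                # s_i >= b
        if op in ("=", "<="):
            upper[s] = b                # s_i <= b
    return rows, lower, upper

# Example 5.2: x + y >= 2, 2x - y >= 0, -x + 2y >= 1
rows, lo, hi = to_general_form([
    ({"x": 1, "y": 1}, ">=", 2),
    ({"x": 2, "y": -1}, ">=", 0),
    ({"x": -1, "y": 2}, ">=", 1),
])
print(lo)  # {'s1': Fraction(2, 1), 's2': Fraction(0, 1), 's3': Fraction(1, 1)}
```

Applied to Example 5.2, this reproduces the constraints of (5.7): three rows with a −1 entry for the additional variable, plus the bounds s1 ≥ 2, s2 ≥ 0, s3 ≥ 1.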
They are also called the dependent variables, as their values are determined by those of the nonbasic variables. The nonbasic variables are denoted by N. It is convenient to store and manipulate a representation of A called the tableau, which is simply A without the diagonal submatrix. The tableau is thus an m-by-n matrix, where the columns correspond to the nonbasic variables, and each row is associated with a basic variable – the same basic variable that has a "−1" entry at that row in the diagonal submatrix in A. Thus, the information originally stored in the diagonal matrix is now represented by the variables labeling the rows.

Example 5.4. We continue our running example. The tableau and the bounds for Example 5.2 are:

      x    y
s1    1    1        2 ≤ s1
s2    2   −1        0 ≤ s2
s3   −1    2        1 ≤ s3

The tableau is simply a different representation of A, since Ax = 0 can be rewritten into

∀xi ∈ B.  xi = Σ_{xj∈N} aij xj . (5.10)

When written in the form of a matrix, the sums on the right-hand side of (5.10) correspond exactly to the tableau.

5.2.3 Simplex with Upper and Lower Bounds

The general simplex algorithm maintains, in addition to the tableau, an assignment α : B ∪ N → Q. The algorithm initializes its data structures as follows:

• The set of basic variables B is the set of additional variables.
• The set of nonbasic variables N is the set of problem variables.
• For any xi with i ∈ {1, . . . , n + m}, α(xi) = 0.

If the initial assignment of zero to all variables (i.e., the origin) satisfies all upper and lower bounds of the basic variables, then the formula can be declared satisfiable (recall that initially the nonbasic variables do not have

Algorithm 5.2.1: General-Simplex

Input: A linear system of constraints S
Output: "Satisfiable" if the system is satisfiable, "Unsatisfiable" otherwise

1. Transform the system into the general form

   Ax = 0 and ⋀_{i=1}^{m} li ≤ si ≤ ui .

2. Set B to be the set of additional variables s1, . . . , sm.
3. Construct the tableau for A.
4. Determine a fixed order on the variables.
5. If there is no basic variable that violates its bounds, return "Satisfiable". Otherwise, let xi be the first basic variable in the order that violates its bounds.
6. Search for the first suitable nonbasic variable xj in the order for pivoting with xi. If there is no such variable, return "Unsatisfiable".
7. Perform the pivot operation on xi and xj.
8. Go to step 5.

explicit bounds). Otherwise, the algorithm begins a process of changing this assignment. Algorithm 5.2.1 summarizes the steps of the general simplex procedure. The algorithm maintains two invariants:

• In-1. Ax = 0
• In-2. The values of the nonbasic variables are within their bounds:

∀xj ∈ N . lj ≤ α(xj) ≤ uj . (5.11)

Clearly, these invariants hold initially, since all the variables in x are set to 0, and the nonbasic variables have no bounds. The main loop of the algorithm checks whether there exists a basic variable that violates its bounds. If there is no such variable, then both the basic and the nonbasic variables satisfy their bounds. Owing to invariant In-1, this means that the current assignment α satisfies (5.8), and the algorithm returns "Satisfiable". Otherwise, let xi be a basic variable that violates its bounds, and assume, without loss of generality, that α(xi) > ui, i.e., the upper bound of xi is violated. How do we change the assignment to xi so that it satisfies its bounds? We need to find a way to reduce the value of xi. Recall how this value is specified:

xi = Σ_{xj∈N} aij xj . (5.12)

The value of xi can be reduced by decreasing the value of a nonbasic variable xj such that aij > 0 and its current assignment is higher than its lower bound lj, or by increasing the value of a variable xj such that aij < 0 and its current assignment is lower than its upper bound uj. A variable xj fulfilling one of these conditions is said to be suitable. If there are no suitable variables, then
the problem is unsatisfiable and the algorithm terminates.

Let θ denote by how much we have to increase (or decrease) α(xj) in order to meet xi's upper bound:

θ := (ui − α(xi)) / aij . (5.13)

Increasing (or decreasing) xj by θ puts xi within its bounds. On the other hand, xj does not necessarily satisfy its bounds anymore, and hence may violate the invariant In-2. We therefore swap xi and xj in the tableau, i.e., make xi nonbasic and xj basic. This requires a transformation of the tableau, which is called the pivot operation. The pivot operation is repeated until either a satisfying assignment is found, or the system is determined to be unsatisfiable.

The Pivot Operation

Suppose we want to swap xi with xj. We will need the following definition:

Definition 5.5 (pivot element, column, and row). Given two variables xi and xj, the coefficient aij is called the pivot element. The column of xj is called the pivot column. The row i is called the pivot row.

A precondition for swapping two variables xi and xj is that their pivot element is nonzero, i.e., aij ≠ 0. The pivot operation (or pivoting) is performed as follows:

1. Solve row i for xj.
2. For all rows l ≠ i, eliminate xj by using the equality for xj obtained from row i.

The reader may observe that the pivot operation is also the basic operation in the well-known Gaussian variable elimination procedure.

Example 5.6. We continue our running example. As described above, we initialize α(xi) = 0. This corresponds to point (A) in Fig. 5.1. Recall the tableau and the bounds:

      x    y
s1    1    1        2 ≤ s1
s2    2   −1        0 ≤ s2
s3   −1    2        1 ≤ s3

The lower bound of s1 is 2, which is violated. The nonbasic variable that is the lowest in the ordering is x. The variable x has a positive coefficient, but no upper bound, and is therefore suitable for the pivot operation. We need to increase s1 by 2 in order to meet the lower bound, which means that x has to increase by 2 as well (θ = 2).
The first step of the pivot operation is to solve the row of s1 for x:

s1 = x + y  ⟺  x = s1 − y . (5.14)

This equality is now used to replace x in the other two rows:

s2 = 2(s1 − y) − y  ⟺  s2 = 2s1 − 3y (5.15)
s3 = −(s1 − y) + 2y  ⟺  s3 = −s1 + 3y (5.16)

Written as a tableau, the result of the pivot operation is:

      s1   y
x      1  −1        α(x) = 2
s2     2  −3        α(y) = 0
s3    −1   3        α(s1) = 2
                    α(s2) = 4
                    α(s3) = −2

This new state corresponds to point (B) in Fig. 5.1. The lower bound of s3 is violated; this is the next basic variable that is selected. The only suitable variable for pivoting is y. We need to add 3 to s3 in order to meet the lower bound. This translates into

θ = (1 − (−2)) / 3 = 1 . (5.17)

After performing the pivot operation with s3 and y, the final tableau is:

      s1    s3
x     2/3  −1/3     α(x) = 1
s2     1   −1       α(y) = 1
y     1/3   1/3     α(s1) = 2
                    α(s2) = 1
                    α(s3) = 1

This assignment α satisfies the bounds, and thus {x ↦ 1, y ↦ 1} is a satisfying assignment. It corresponds to point (C) in Fig. 5.1.

Selecting the pivot element according to a fixed ordering of the basic and nonbasic variables ensures that no set of basic variables is ever repeated, and hence guarantees termination (no cycling can occur). For a detailed proof see [71]. This way of selecting a pivot element is called Bland's rule.

5.2.4 Incremental Problems

Decision problems are often constructed in an incremental manner, that is, the formula is strengthened with additional conjuncts. This can make a once satisfiable formula unsatisfiable. One scenario in which an incremental decision procedure is useful is the DPLL(T) framework, which we study in Chap. 11. The general simplex algorithm is well suited for incremental problems. First, notice that any constraint can be disabled by removing its corresponding upper and lower bounds. The equality in the tableau is afterwards redundant, but will not render a satisfiable formula unsatisfiable.
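To make Algorithm 5.2.1 and the pivoting of Example 5.6 concrete, here is a minimal Python sketch of general simplex with Bland's rule. This is an illustrative implementation, not the one referenced in the text: it uses exact rational arithmetic, a dictionary-based tableau, and none of the incremental machinery discussed in this section.

```python
from fractions import Fraction

def general_simplex(tableau, lower, upper, order):
    """Algorithm 5.2.1 with Bland's rule. `tableau` maps each basic
    variable to its row {nonbasic var: coefficient}; `lower`/`upper`
    hold optional bounds; `order` is the fixed variable order.
    Returns a satisfying assignment, or None if unsatisfiable."""
    alpha = {v: Fraction(0) for v in order}      # start at the origin
    lo, hi = lower.get, upper.get
    while True:
        # Step 5: first basic variable that violates one of its bounds.
        xi = next((v for v in order if v in tableau and
                   ((lo(v) is not None and alpha[v] < lo(v)) or
                    (hi(v) is not None and alpha[v] > hi(v)))), None)
        if xi is None:
            return alpha                         # "Satisfiable"
        row = tableau[xi]
        below = lo(xi) is not None and alpha[xi] < lo(xi)
        # Step 6: first suitable nonbasic variable in the order.
        def suitable(v):
            a = row.get(v, 0)
            if a == 0:
                return False
            if below:   # xi must grow
                return (a > 0 and (hi(v) is None or alpha[v] < hi(v)) or
                        a < 0 and (lo(v) is None or alpha[v] > lo(v)))
            else:       # xi must shrink
                return (a > 0 and (lo(v) is None or alpha[v] > lo(v)) or
                        a < 0 and (hi(v) is None or alpha[v] < hi(v)))
        xj = next((v for v in order if v not in tableau and suitable(v)), None)
        if xj is None:
            return None                          # "Unsatisfiable"
        # Step 7: pivot xi with xj; theta follows (5.13).
        target = lo(xi) if below else hi(xi)
        alpha[xj] += (target - alpha[xi]) / row[xj]
        alpha[xi] = target                       # xi moves onto its bound
        new_row = {xi: Fraction(1) / row[xj]}    # solve xi's row for xj
        for v, a in row.items():
            if v != xj:
                new_row[v] = -a / row[xj]
        del tableau[xi]
        for r in tableau.values():               # eliminate xj elsewhere
            c = r.pop(xj, 0)
            if c:
                for v, a in new_row.items():
                    r[v] = r.get(v, Fraction(0)) + c * a
        tableau[xj] = new_row
        for v, r in tableau.items():             # recompute basic values
            alpha[v] = sum(a * alpha[w] for w, a in r.items())

# Example 5.2 in general form (the tableau of Example 5.4):
t = {"s1": {"x": Fraction(1), "y": Fraction(1)},
     "s2": {"x": Fraction(2), "y": Fraction(-1)},
     "s3": {"x": Fraction(-1), "y": Fraction(2)}}
a = general_simplex(t, {"s1": 2, "s2": 0, "s3": 1}, {},
                    ["x", "y", "s1", "s2", "s3"])
print(a["x"], a["y"])  # 1 1
```

Run on Example 5.2, this reproduces the two pivots of Example 5.6 (points (A), (B), (C) in Fig. 5.1) and ends with the satisfying assignment x = 1, y = 1.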
Second, the pivot operation performed on the tableau is an equivalence transformation, i.e., it preserves the set of solutions. We can therefore start the procedure with the tableau we have obtained from the previous set of bounds. The addition of upper and lower bounds is implemented as follows:

• If a bound for a nonbasic variable was added, update the values of the nonbasic variables according to the tableau to restore In-2.
• Call Algorithm 5.2.1 to determine whether the new problem is satisfiable. Start with step 5.

Furthermore, it is often desirable to remove constraints after they have been added. This is also relevant in the context of DPLL(T), because this algorithm activates and deactivates constraints. Normally, constraints (or rather bounds) are removed when the current set of constraints is unsatisfiable. After removing a constraint, the assignment has to be restored to a point at which it satisfied the two invariants of the general simplex algorithm. This can be done by simply restoring the assignment α to the last known satisfying assignment. There is no need to modify the tableau.

5.3 The Branch and Bound Method

Branch and Bound is a widely used method for solving integer linear programs. As in the case of the simplex algorithm, Branch and Bound was developed for solving the optimization problem, but the description here focuses on an adaptation of this algorithm to the decision problem. The integer linear systems considered here have the same form as described in Sect. 5.2, with the additional requirement that the value of any variable in a satisfying assignment must be drawn from the set of integers. Observe that it is easy to support strict inequalities simply by adding 1 to or subtracting 1 from the constant on the right-hand side.

Definition 5.7 (relaxed problem). Given an integer linear system S, its relaxation is S without the integrality requirement (i.e., the variables are not required to be integers).
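The relaxation of Definition 5.7 is the engine of the Branch and Bound method developed in the rest of this section. The following self-contained Python sketch illustrates the idea under assumed encodings: for simplicity, the relaxation is decided exactly over the rationals by Fourier–Motzkin elimination with back-substitution (described in Sect. 5.4) in place of general simplex, and, as discussed below, termination is not guaranteed in general.

```python
from fractions import Fraction
from math import ceil, floor

def fm_sample(rows, n):
    """Fourier-Motzkin elimination over the rationals. `rows` are pairs
    (c, b) encoding c[0]*x0 + ... + c[n-1]*x_{n-1} <= b. Returns a
    satisfying assignment (list of Fractions), or None."""
    rows = [([Fraction(v) for v in c], Fraction(b)) for c, b in rows]
    bounds = []                                 # per variable: (lowers, uppers)
    for k in range(n - 1, -1, -1):              # eliminate x_{n-1}, ..., x0
        lowers, uppers, rest = [], [], []
        for c, b in rows:
            if c[k] > 0:                        # x_k <= sum(u*x) + ub
                uppers.append(([-v / c[k] for v in c[:k]], b / c[k]))
            elif c[k] < 0:                      # x_k >= sum(l*x) + lb
                lowers.append(([-v / c[k] for v in c[:k]], b / c[k]))
            else:
                rest.append((c[:k], b))
        for l, lb in lowers:                    # each lower <= each upper
            for u, ub in uppers:
                rest.append(([l[j] - u[j] for j in range(k)], ub - lb))
        bounds.append((lowers, uppers))
        rows = rest
    if any(b < 0 for _, b in rows):             # leftover rows read 0 <= b
        return None
    x = []
    for k in range(n):                          # back-substitute x0, x1, ...
        lowers, uppers = bounds[n - 1 - k]
        lo = max((sum(l[j] * x[j] for j in range(k)) + lb
                  for l, lb in lowers), default=None)
        hi = min((sum(u[j] * x[j] for j in range(k)) + ub
                  for u, ub in uppers), default=None)
        if lo is None and hi is None: x.append(Fraction(0))
        elif hi is None:              x.append(lo)
        elif lo is None:              x.append(hi)
        else:                         x.append((lo + hi) / 2)
    return x

def branch_and_bound(rows, n):
    """Integer feasibility by branching on floor/ceiling of a fractional
    component of the relaxation's solution."""
    x = fm_sample(rows, n)
    if x is None:
        return False                            # prune this branch
    for k, v in enumerate(x):
        if v.denominator != 1:                  # fractional: branch on x_k
            low = [0] * n;  low[k] = 1          # adds  x_k <= floor(v)
            high = [0] * n; high[k] = -1        # adds  x_k >= ceil(v)
            return (branch_and_bound(rows + [(low, floor(v))], n) or
                    branch_and_bound(rows + [(high, -ceil(v))], n))
    return True                                 # integral solution found

# 2 <= 2x <= 3: the relaxation is satisfiable, and branching finds x = 1.
print(branch_and_bound([([-2], -2), ([2], 3)], 1))               # True
# Formula (5.2) over (i, j): i >= 1, i <= 10, and j + i = j
# (encoded as i <= 0 and i >= 0): no integer solution.
print(branch_and_bound([([-1, 0], -1), ([1, 0], 10),
                        ([1, 0], 0), ([-1, 0], 0)], 2))          # False
```

The second call decides the compiler-optimization query (5.2) from the introduction: the relaxation itself is already unsatisfiable, so the search is pruned immediately.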
We denote the relaxed problem of S by relaxed(S). Assume the existence of a procedure LPfeasible, which receives a linear system S as input, and returns "Unsatisfiable" if S is unsatisfiable and a satisfying assignment otherwise. LPfeasible can be implemented with, for example, a variation of General-Simplex (Algorithm 5.2.1) that outputs a satisfying assignment if S is satisfiable. Using these notions, Algorithm 5.3.1 decides an integer linear system of constraints (recall that only conjunctions of constraints are considered here).

Algorithm 5.3.1: Feasibility-Branch-and-Bound

Input: An integer linear system S
Output: "Satisfiable" if S is satisfiable, "Unsatisfiable" otherwise

1.  procedure Search-integral-solution(S)
2.      res = LPfeasible(relaxed(S));
3.      if res = "Unsatisfiable" then return;        ▷ prune branch
4.      else
5.          if res is integral then                  ▷ integer solution found
6.              abort("Satisfiable");
7.          else
8.              Select a variable v that is assigned a nonintegral value r;
9.              Search-integral-solution(S ∪ (v ≤ ⌊r⌋));
10.             Search-integral-solution(S ∪ (v ≥ ⌈r⌉));
                                        ▷ no integer solution in this branch
11. procedure Feasibility-Branch-and-Bound(S)
12.     Search-integral-solution(S);
13.     return ("Unsatisfiable");

The idea of the algorithm is simple: it solves the relaxed problem with LPfeasible; if the relaxed problem is unsatisfiable, it backtracks, because there is also no integer solution in this branch. If, on the other hand, the relaxed problem is satisfiable and the solution returned by LPfeasible happens to be integral, it terminates – a satisfying integral solution has been found. Otherwise, the problem is split into two subproblems, which are then processed with a recursive call. The nature of this split is best illustrated by an example.

Example 5.8. Let x1, . . . , x4 be the variables of S. Assume that LPfeasible returns the solution

(1, 0.7, 2.5, 3) (5.18)

in line 2.
In line 7, Search-integral-solution chooses between x2 and x3, which are the variables that were assigned a nonintegral value. Suppose that x2 is chosen. In line 8, S (the linear system solved at the current recursion level) is then augmented with the constraint

x2 ≤ 0 (5.19)

and sent for solving at a deeper recursion level. If no solution is found in this branch, S is augmented instead with

x2 ≥ 1 (5.20)

and, once again, is sent to a deeper recursion level. If both these calls return, this implies that S has no satisfying solution, and hence the procedure returns (backtracks). Note that returning from the initial recursion level causes the calling function Feasibility-Branch-and-Bound to return "Unsatisfiable".

Algorithm 5.3.1 is not complete: there are cases for which it will branch forever. As noted in [71], the system 1 ≤ 3x − 3y ≤ 2, for example, has no integer solutions but unbounded real solutions, and causes the basic Branch and Bound algorithm to loop forever.

In order to make the algorithm complete, it is necessary to rely on the small-model property that such formulas have (we used this property earlier in Sect. 4.5). Recall that this means that if there is a satisfying solution, then there is also such a solution within a finite bound, which, for this theory, is also computable. This means that once we have computed this bound on the domain of each variable, we can stop searching for a solution once we have passed it. A detailed study of this bound in the context of optimization problems can be found in [139]. The same bounds are applicable to the feasibility problem as well. Briefly, it was shown in [139] that given an integer linear system S with an M × N coefficient matrix A, if there is a solution to S, then one of the extreme points of the convex hull of S is also a solution, and any such solution x0 is bounded as follows:

x0_j ≤ ((M + N) · N · θ)^N for j = 1, . . .
, N , (5.21)

where θ is the maximal element in the coefficient matrix A or in the vector b. Thus, (5.21) gives us a bound on each of the N variables, which, by adding it as an explicit constraint, forces termination.

Finally, let us mention that Branch and Bound can be extended in a straightforward way to handle the case in which some of the variables are integers while the others are real. In the context of optimization problems, this problem is known by the name mixed integer programming.

5.3.1 Cutting-Planes

Cutting-planes are constraints that are added to a linear system that remove only noninteger solutions; that is, all satisfying integer solutions, if they exist, remain satisfying, as demonstrated in Fig. 5.2. These new constraints improve the tightness of the relaxation in the process of solving integer linear systems.

Aside: Branch and Bound for Integer Linear Programs
When Branch and Bound is used for solving an optimization problem, it becomes somewhat more complicated. In particular, there are various pruning rules based on the value of the current objective function (a branch is pruned if it is identified that it cannot contain a solution better than what is already at hand from another branch). There are also various heuristics for choosing the variable on which to split and the first branch to be explored.

Fig. 5.2. The dots represent integer solutions. The thin dotted line represents a cutting-plane – a constraint that does not remove any integral solution.

Here, we describe a family of cutting-planes called Gomory cuts. We first illustrate this technique with an example, and then generalize it. Suppose that our problem includes the integer variables x1, . . . , x3, and the lower bounds 1 ≤ x1 and 0.5 ≤ x2.
Further, suppose that the final tableau of the general simplex algorithm includes the constraint

x3 = 0.5x1 + 2.5x2 , (5.22)

and that the solution α is {x3 ↦ 1.75, x1 ↦ 1, x2 ↦ 0.5}, which, of course, satisfies (5.22). Subtracting these values from (5.22) gives us

x3 − 1.75 = 0.5(x1 − 1) + 2.5(x2 − 0.5) . (5.23)

We now wish to rewrite this equation so that the left-hand side is an integer:

x3 − 1 = 0.75 + 0.5(x1 − 1) + 2.5(x2 − 0.5) . (5.24)

The two right-most terms must be nonnegative, because 1 and 0.5 are the lower bounds of x1 and x2, respectively. Since the right-hand side must add up to an integer as well, this implies that

0.75 + 0.5(x1 − 1) + 2.5(x2 − 0.5) ≥ 1 . (5.25)

Note, however, that this constraint is unsatisfied by α, since by construction all the elements on the left other than the fraction 0.75 are equal to zero under α. This means that adding this constraint to the relaxed system will rule out this solution. On the other hand, since it is implied by the integer system of constraints, it cannot remove any integer solution.

Let us generalize this example into a recipe for generating such cutting-planes. The generalization refers also to the case of having variables assigned their upper bounds, and both negative and positive coefficients. In order to derive a Gomory cut from a constraint, the constraint has to satisfy two conditions: first, the assignment to the basic variable has to be fractional; second, the assignments to all the nonbasic variables have to correspond to one of their bounds. The following recipe, which relies on these conditions, is based on a report by Dutertre and de Moura [71].

Consider the i-th constraint

xi = Σ_{xj∈N} aij xj , (5.26)

where xi ∈ B. Let α be the assignment returned by the general simplex algorithm. Thus,

α(xi) = Σ_{xj∈N} aij α(xj) .
(5.27)

We now partition the nonbasic variables into those that are currently assigned their lower bound and those that are currently assigned their upper bound:

J = {j | xj ∈ N ∧ α(xj) = lj}
K = {j | xj ∈ N ∧ α(xj) = uj} . (5.28)

Subtracting (5.27) from (5.26), taking the partition into account, yields

xi − α(xi) = Σ_{j∈J} aij (xj − lj) − Σ_{j∈K} aij (uj − xj) . (5.29)

Let f0 = α(xi) − ⌊α(xi)⌋. Since we assumed that α(xi) is not an integer, we have 0 < f0 < 1. We can now rewrite (5.29) as

xi − ⌊α(xi)⌋ = f0 + Σ_{j∈J} aij (xj − lj) − Σ_{j∈K} aij (uj − xj) . (5.30)

Note that the left-hand side is an integer. We now consider two cases.

• If Σ_{j∈J} aij (xj − lj) − Σ_{j∈K} aij (uj − xj) > 0 then, since the right-hand side must be an integer,

f0 + Σ_{j∈J} aij (xj − lj) − Σ_{j∈K} aij (uj − xj) ≥ 1 . (5.31)

We now split J and K as follows:

J+ = {j | j ∈ J ∧ aij > 0}
J− = {j | j ∈ J ∧ aij < 0}
K+ = {j | j ∈ K ∧ aij > 0}
K− = {j | j ∈ K ∧ aij < 0} (5.32)

Gathering only the positive elements in the left-hand side of (5.31) gives us

Σ_{j∈J+} aij (xj − lj) − Σ_{j∈K−} aij (uj − xj) ≥ 1 − f0 , (5.33)

or, equivalently,

Σ_{j∈J+} (aij / (1 − f0)) (xj − lj) − Σ_{j∈K−} (aij / (1 − f0)) (uj − xj) ≥ 1 . (5.34)

• If Σ_{j∈J} aij (xj − lj) − Σ_{j∈K} aij (uj − xj) ≤ 0 then again, since the right-hand side must be an integer,

f0 + Σ_{j∈J} aij (xj − lj) − Σ_{j∈K} aij (uj − xj) ≤ 0 . (5.35)

Equation (5.35) implies that

Σ_{j∈J−} aij (xj − lj) − Σ_{j∈K+} aij (uj − xj) ≤ −f0 . (5.36)

Dividing by −f0 gives us

−Σ_{j∈J−} (aij / f0) (xj − lj) + Σ_{j∈K+} (aij / f0) (uj − xj) ≥ 1 . (5.37)

Note that the left-hand side of both (5.34) and (5.37) is greater than zero. Therefore these two equations imply

Σ_{j∈J+} (aij / (1 − f0)) (xj − lj) − Σ_{j∈J−} (aij / f0) (xj − lj) + Σ_{j∈K+} (aij / f0) (uj − xj) − Σ_{j∈K−} (aij / (1 − f0)) (uj − xj) ≥ 1 . (5.38)

Since each of the elements on the left-hand side is equal to zero under the current assignment α, this assignment α is ruled out by the new constraint.
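The recipe above can be packed into a small Python function that computes the cut (5.38) directly from a tableau row. This is an illustrative sketch under an assumed encoding; the input is required to satisfy the two preconditions stated above (fractional basic variable, all nonbasic variables on one of their bounds).

```python
from fractions import Fraction
from math import floor

def gomory_cut(alpha_i, row, alpha, lower, upper):
    """Compute the Gomory cut (5.38) for a basic variable x_i whose row
    reads x_i = sum(row[x_j] * x_j). `alpha_i` is the fractional value
    of x_i; each nonbasic x_j must sit on one of its bounds under
    `alpha`. Returns (coeff, rhs) encoding sum(coeff[x_j]*x_j) >= rhs."""
    f0 = alpha_i - floor(alpha_i)       # 0 < f0 < 1 by assumption
    coeff, rhs = {}, Fraction(1)
    for xj, a in row.items():
        if alpha[xj] == lower.get(xj):  # x_j on its lower bound (set J)
            c = a / (1 - f0) if a > 0 else -a / f0
            bound = lower[xj]
        else:                           # x_j on its upper bound (set K)
            c = -a / f0 if a > 0 else a / (1 - f0)
            bound = upper[xj]
        coeff[xj] = c
        rhs += c * bound                # move the constant to the right
    return coeff, rhs

# The introductory example: x3 = 0.5*x1 + 2.5*x2, alpha(x3) = 1.75,
# and x1, x2 on their lower bounds 1 and 0.5.
cut, rhs = gomory_cut(Fraction(7, 4),
                      {"x1": Fraction(1, 2), "x2": Fraction(5, 2)},
                      {"x1": Fraction(1), "x2": Fraction(1, 2)},
                      {"x1": Fraction(1), "x2": Fraction(1, 2)}, {})
print(cut, rhs)  # {'x1': Fraction(2, 1), 'x2': Fraction(10, 1)} 8
```

On the introductory example, the function yields 2x1 + 10x2 ≥ 8, which is exactly (5.25) scaled by 1/(1 − f0) = 4, and is violated by α (2 · 1 + 10 · 0.5 = 7 < 8).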
In other words, the solution to the linear problem augmented with the constraint is guaranteed to be different from the previous one.

5.4 Fourier–Motzkin Variable Elimination

5.4.1 Equality Constraints

Similarly to the simplex method, the Fourier–Motzkin variable elimination algorithm takes a conjunction of linear constraints over real variables. Let m denote the number of such constraints, and let x1, . . . , xn denote the variables used by these constraints. As a first step, equality constraints of the following form are eliminated:

Σ_{j=1}^{n} ai,j · xj = bi . (5.39)

We choose a variable xj that has a nonzero coefficient ai,j in an equality constraint i. Without loss of generality, we assume that xn is the variable that is to be eliminated. The constraint (5.39) can be rewritten as

xn = bi / ai,n − Σ_{j=1}^{n−1} (ai,j / ai,n) · xj . (5.40)

Now we substitute the right-hand side of (5.40) for xn into all the other constraints, and remove constraint i. This is iterated until all equalities are removed. We are left with a system of inequalities of the form

⋀_{i=1}^{m} Σ_{j=1}^{n} ai,j xj ≤ bi . (5.41)

5.4.2 Variable Elimination

The basic idea of the variable elimination algorithm is to heuristically choose a variable and then to eliminate it by projecting its constraints onto the rest of the system, resulting in new constraints.

Example 5.9. Consider Fig. 5.3(a): the constraints

0 ≤ x ≤ 1, 0 ≤ y ≤ 1, 3/4 ≤ z ≤ 1 (5.42)

form a cuboid. Projecting these constraints onto the x and y axes, and thereby eliminating z, results in a square which is given by the constraints

0 ≤ x ≤ 1, 0 ≤ y ≤ 1 . (5.43)

Fig. 5.3. Projection of constraints: (a) a cuboid is projected onto the x and y axes; (b) a triangle is projected onto the x axis.

Figure 5.3(b) shows a triangle formed by the constraints

x ≤ y + 10, y ≤ 15, y ≥ −x + 20 .
(5.44)

The projection of the triangle onto the x axis is a line given by the constraints

5 ≤ x ≤ 25 . (5.45)

Thus, the projection forms a new problem with one variable fewer, but possibly more constraints. This is done iteratively until all variables but one have been eliminated. The problem with one variable is trivially decidable.

The order in which the variables are eliminated may be predetermined, or adjusted dynamically to the current set of constraints. There are various heuristics for choosing the elimination order. A standard greedy heuristic gives priority to variables that produce fewer new constraints when eliminated.

Once again, assume that xn is the variable chosen to be eliminated. The constraints are partitioned according to the coefficient of xn. Consider the constraint with index i:

Σ_{j=1}^{n} ai,j · xj ≤ bi . (5.46)

By splitting the sum, (5.46) can be rewritten into

ai,n · xn ≤ bi − Σ_{j=1}^{n−1} ai,j · xj . (5.47)

If ai,n is zero, the constraint can be disregarded when we are eliminating xn. Otherwise, we divide by ai,n. If ai,n is positive, we obtain

xn ≤ bi / ai,n − Σ_{j=1}^{n−1} (ai,j / ai,n) · xj . (5.48)

Thus, if ai,n > 0, the constraint is an upper bound on xn. If ai,n < 0, the constraint is a lower bound. We denote the right-hand side of (5.48) by βi.

Unbounded Variables

It is possible that a variable is not bounded both ways, i.e., it has either only upper bounds or only lower bounds. Such variables are called unbounded variables. Unbounded variables can be simply removed from the system together with all constraints that use them. Removing these constraints can make other variables unbounded. Thus, this simplification stage iterates until no such variables remain.

Bounded Variables

If xn has both an upper and a lower bound, the algorithm enumerates all pairs of lower and upper bounds. Let u ∈ {1, . . . , m} denote the index of an upper-bound constraint, and l ∈ {1, . . .
, m} denote the index of a lower-bound constraint for xn , where l += u. For each such pair, we have βl ≤ xn ≤ βu . (5.49) The following new constraint is added: βl ≤ βu . (5.50) The Formula (5.50) may simplify to 0 ≤ bk , where bk is some constant smaller than 0. In this case, the algorithm has found a conflicting pair of constraints and concludes that the problem is unsatisfiable. Otherwise, all constraints that involve xn are removed. The new problem is solved recursively as before. Example 5.10. Consider the following set of constraints: x1 −x2 x1 −x3 −x1 +x2 +2x3 −x3 ≤ 0 ≤ 0 ≤ 0 ≤ −1 . (5.51) Suppose we decide to eliminate the variable x1 first. There are two upper bounds on x1 , namely x1 ≤ x2 and x1 ≤ x3 , and one lower bound, which is x2 + 2x3 ≤ x1 . Using x1 ≤ x2 as the upper bound, we obtain a new constraint 2x3 ≤ 0, and using x1 ≤ x3 as the upper bound, we obtain a new constraint x2 +x3 ≤ 0. Constraints involving x1 are removed from the problem, which results in the following new set: 2x3 ≤ 0 x2 +x3 ≤ 0 (5.52) −x3 ≤ −1 . 5.5 The Omega Test 129 Next, observe that x2 is unbounded (as it has no lower bound), and hence the second constraint can be eliminated, which simplifies the formula. We therefore progress by eliminating x2 and all the constraints that contain it: 2x3 ≤ 0 −x3 ≤ −1 . (5.53) Only the variable x3 remains, with a lower and an upper bound. Combining the two into a new constraint results in 1 ≤ 0, which is a contradiction. Thus, the system is unsatisfiable. The simplex method in its basic form, as described in Sect. 5.2, allows only nonstrict (≤) inequalities.4 The Fourier–Motzkin method, on the other hand, can easily be extended to handle a combination of strict (<) and nonstrict inequalities: if either the lower or the upper bound is a strict inequality, then so is the resulting constraint. 
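The elimination loop described above is short enough to sketch directly. The following Python fragment is our own illustration (the constraint representation and the function name are assumptions, not from the text): each constraint Σj ai,j · xj ≤ bi is a pair (a, b), and each variable is eliminated in turn by combining every lower bound with every upper bound, cross-multiplied so that the arithmetic stays integral.

```python
def fourier_motzkin(constraints, n):
    """Decide a conjunction of constraints sum_j a[j]*x_j <= b over the
    rationals. Each constraint is a pair (a, b); a is a list of n ints."""
    cs = constraints
    for var in range(n):
        lowers = [c for c in cs if c[0][var] < 0]   # lower bounds on x_var
        uppers = [c for c in cs if c[0][var] > 0]   # upper bounds on x_var
        new = [c for c in cs if c[0][var] == 0]
        # If lowers or uppers is empty, x_var is unbounded and all
        # constraints mentioning it are simply dropped.
        for al, bl in lowers:
            for au, bu in uppers:
                # beta_l <= beta_u as in (5.50), cross-multiplied to avoid
                # fractions; the coefficient of x_var cancels out.
                a = [au[var] * al[j] - al[var] * au[j] for j in range(n)]
                b = au[var] * bl - al[var] * bu
                new.append((a, b))
        cs = new
    # All variables eliminated: every remaining constraint reads 0 <= b.
    return all(b >= 0 for _, b in cs)

# Example 5.10: the four constraints of (5.51) are unsatisfiable.
print(fourier_motzkin(
    [([1, -1, 0], 0), ([1, 0, -1], 0), ([-1, 1, 2], 0), ([0, 0, -1], -1)],
    3))  # False
```

Note that each elimination step can square the number of constraints, which is exactly the blowup discussed in Sect. 5.4.3.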
5.4.3 Complexity

In each iteration, the number of constraints can increase in the worst case from m to m²/4. Over n eliminated variables this growth compounds, so the number of constraints can become doubly exponential in n. Thus, Fourier–Motzkin variable elimination is only suitable for a relatively small set of constraints and a small number of variables.

⁴ There are extensions of Simplex to strict inequalities. See, for example, [70].

5.5 The Omega Test

5.5.1 Problem Description

The Omega test is an algorithm that decides the satisfiability of a conjunction of linear constraints over integer variables. Each conjunct is assumed to be either an equality of the form

  Σ_{i=1}^{n} ai xi = b   (5.54)

or a nonstrict inequality of the form

  Σ_{i=1}^{n} ai xi ≤ b .   (5.55)

The coefficients ai are assumed to be integers; if they are not, by making use of the assumption that the coefficients are rational, the problem can be transformed into one with integer coefficients by multiplying the constraints by the least common multiple of the denominators. In Sect. 5.6, we show how strict inequalities can be transformed into nonstrict inequalities.

The runtime of the Omega test depends on the size of the coefficients ai. It is therefore desirable to transform the constraints such that small coefficients are obtained. This can be done by dividing the coefficients a1, . . . , an of each constraint by their greatest common divisor g. The resulting constraint is called normalized. If the constraint is an equality constraint, this results in

  Σ_{i=1}^{n} (ai/g) xi = b/g .   (5.56)

If g does not divide b exactly, the system is unsatisfiable. If the constraint is an inequality, one can tighten the constraint by rounding down the constant:

  Σ_{i=1}^{n} (ai/g) xi ≤ ⌊b/g⌋ .   (5.57)

More simplifications of this kind are described in Sect. 5.6.

Example 5.11. The equality 3x + 3y = 2 can be normalized to x + y = 2/3, which is unsatisfiable. The constraint 8x + 6y ≤ 0 can be normalized to obtain 4x + 3y ≤ 0. The constraint 1 ≤ 4y can be tightened to obtain 1 ≤ y.
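Normalization and tightening, (5.56)–(5.57), are mechanical enough that a short sketch may help. This is our own illustrative code (the function name and representation are assumptions, not from the text); inequalities are taken in the form Σ ai · xi ≤ b, so the constraint 1 ≤ 4y from Example 5.11 is written as −4y ≤ −1, and at least one coefficient is assumed to be nonzero.

```python
from functools import reduce
from math import gcd

def normalize(coeffs, b, is_equality):
    """Normalize sum_i a_i*x_i (= or <=) b as in (5.56)/(5.57).
    Returns the normalized (coeffs, b), or None if an equality is
    unsatisfiable because g does not divide b."""
    g = reduce(gcd, (abs(a) for a in coeffs))  # assumes some a_i != 0
    if is_equality:
        if b % g != 0:
            return None                       # e.g., 3x + 3y = 2
        return [a // g for a in coeffs], b // g
    # Python's integer floor division rounds toward -infinity, which is
    # exactly the tightening of (5.57).
    return [a // g for a in coeffs], b // g

print(normalize([3, 3], 2, True))    # None: no integer solution
print(normalize([8, 6], 0, False))   # ([4, 3], 0)
print(normalize([-4], -1, False))    # ([-1], -1), i.e., 1 <= y
```

The last call shows the tightening of Example 5.11: dividing −4y ≤ −1 by g = 4 and rounding the constant down yields −y ≤ −1.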
The Omega test is a variant of the Fourier–Motzkin variable elimination algorithm (Sect. 5.4). As in the case of that algorithm, equality and inequality constraints are treated separately; all equality constraints are removed before inequalities are considered.

5.5.2 Equality Constraints

In order to eliminate an equality of the form of (5.54), we first check if there is a variable xj with a coefficient 1 or −1, i.e., |aj| = 1. If yes, we transform the constraint as follows. Without loss of generality, assume j = n. We isolate xn:

  xn = b/an − Σ_{i=1}^{n−1} (ai/an) xi .   (5.58)

The variable xn can now be substituted by the right-hand side of (5.58) in all constraints.

If there is no variable with a coefficient 1 or −1, we cannot simply divide by the coefficient, as this would result in nonintegral coefficients. Instead, the algorithm proceeds as follows: it determines the variable that has the nonzero coefficient with the smallest absolute value. Assume again that xn is chosen, and that an > 0. The Omega test transforms the constraints iteratively until some coefficient becomes 1 or −1. The variable with that coefficient can then be eliminated as above.

For this transformation, a new binary operator mod̂, called symmetric modulo, is defined as follows:

  a mod̂ b = a − b · ⌊a/b + 1/2⌋ .   (5.59)

The symmetric modulo operator is very similar to the usual modular arithmetic operator. If a mod b < b/2, then a mod̂ b = a mod b. If a mod b is greater than or equal to b/2, b is deducted, and thus

  a mod̂ b = a mod b         if a mod b < b/2
  a mod̂ b = (a mod b) − b   otherwise .   (5.60)

We leave the proof of this equivalence as an exercise (see Problem 5.12).

Our goal is to derive a term that can replace xn. For this purpose, we define m = an + 1, introduce a new variable σ, and add the following new constraint:

  Σ_{i=1}^{n} (ai mod̂ m) xi = mσ + (b mod̂ m) .   (5.61)

We split the sum on the left-hand side to obtain

  (an mod̂ m) xn = mσ + (b mod̂ m) − Σ_{i=1}^{n−1} (ai mod̂ m) xi .   (5.62)

Since an mod̂ m = −1 (see Problem 5.14), this simplifies to

  xn = −mσ − (b mod̂ m) + Σ_{i=1}^{n−1} (ai mod̂ m) xi .   (5.63)

The right-hand side of (5.63) is used to replace xn in all constraints. Any equality from the original problem (5.54) is changed as follows:

  Σ_{i=1}^{n−1} ai xi + an (−mσ − (b mod̂ m) + Σ_{i=1}^{n−1} (ai mod̂ m) xi) = b ,   (5.64)

which can be rewritten as

  −an mσ + Σ_{i=1}^{n−1} (ai + an (ai mod̂ m)) xi = b + an (b mod̂ m) .   (5.65)

Since an = m − 1, this simplifies to

  −an mσ + Σ_{i=1}^{n−1} ((ai − (ai mod̂ m)) + m (ai mod̂ m)) xi = (b − (b mod̂ m)) + m (b mod̂ m) .   (5.66)

Note that ai − (ai mod̂ m) is equal to m⌊ai/m + 1/2⌋, and thus all terms are divisible by m. Dividing (5.66) by m results in

  −an σ + Σ_{i=1}^{n−1} (⌊ai/m + 1/2⌋ + (ai mod̂ m)) xi = ⌊b/m + 1/2⌋ + (b mod̂ m) .   (5.67)

The absolute value of the coefficient of σ is the same as the absolute value of the original coefficient an, and it seems that nothing has been gained by this substitution. However, observe that the coefficient of xi can be bounded as follows (see Problem 5.13):

  |⌊ai/m + 1/2⌋ + (ai mod̂ m)| ≤ (5/6) |ai| .   (5.68)

Thus, the absolute values of the coefficients in the equality are strictly smaller than their previous values. As the coefficients are always integral, repeated application of equality elimination eventually generates a coefficient of 1 or −1 on some variable. This variable can then be eliminated directly, as described earlier (see (5.58)).

Example 5.12. Consider the following formula:

  −3x1 + 2x2 = 0
   3x1 + 4x2 = 3 .   (5.69)

The variable x2 has the coefficient with the smallest absolute value (a2 = 2). Thus, m = a2 + 1 = 3, and we add the following constraint (see (5.61)):

  (−3 mod̂ 3) x1 + (2 mod̂ 3) x2 = 3σ .   (5.70)

This simplifies to x2 = −3σ. Substituting −3σ for x2 results in the following problem:

  −3x1 − 6σ = 0
   3x1 − 12σ = 3 .   (5.71)

Division by m results in

  −x1 − 2σ = 0
   x1 − 4σ = 1 .
(5.72)

As expected, the coefficient of x1 has decreased. We can now substitute x1 by 4σ + 1, and obtain −6σ = 1, which is unsatisfiable.

5.5.3 Inequality Constraints

Once all equalities have been eliminated, the algorithm attempts to find a solution for the remaining inequalities. The control flow of Algorithm 5.5.1 is illustrated in Fig. 5.4. As in the Fourier–Motzkin procedure, the first step is to choose a variable to be eliminated. Subsequently, the three subprocedures Real-Shadow, Dark-Shadow, and Gray-Shadow produce new constraint sets, which are solved recursively. Note that many of the subproblems generated by the recursion are actually identical. An efficient implementation uses a hash table that stores the solutions of previously solved problems.

Algorithm 5.5.1: Omega-Test
Input: A conjunction of constraints C
Output: "Satisfiable" if C is satisfiable, and "Unsatisfiable" otherwise

1.  if C only contains one variable then
2.    Solve and return result;   ▹ (solving this problem is trivial)
3.  Otherwise, choose a variable v that occurs in C;
4.  CR := Real-Shadow(C, v);
5.  if Omega-Test(CR) = "Unsatisfiable" then   ▹ Recursive call
6.    return "Unsatisfiable";
7.  CD := Dark-Shadow(C, v);
8.  if Omega-Test(CD) = "Satisfiable" then   ▹ Recursive call
9.    return "Satisfiable";
10. if CR = CD then   ▹ Exact projection?
11.   return "Unsatisfiable";
12. CG¹, . . . , CGⁿ := Gray-Shadow(C, v);
13. for all i ∈ {1, . . . , n} do
14.   if Omega-Test(CGⁱ) = "Satisfiable" then   ▹ Recursive call
15.     return "Satisfiable";
16. return "Unsatisfiable";

Checking the Real Shadow

Even though the Omega test is concerned with constraints over integers, the first step is to check if there are integer solutions in the relaxed problem, which is called the real shadow. The real shadow is the same projection that the Fourier–Motzkin procedure uses. The Omega test is then called recursively to check if the projection contains an integer.
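The symmetric modulo operator defined in (5.59), which drives the equality elimination of the previous subsection, is easy to get wrong. The following Python sketch is our own (the name smod is an assumption); it also reconfirms the fact that an mod̂ (an + 1) = −1, which justifies the simplification leading to (5.63), and reproduces the reduction of Example 5.12.

```python
def smod(a, b):
    """Symmetric modulo (5.59): a - b * floor(a/b + 1/2).
    For integers, floor(a/b + 1/2) equals (2*a + b) // (2*b)."""
    return a - b * ((2 * a + b) // (2 * b))

# Equivalence (5.60): smod keeps a mod b when it is below b/2
# and subtracts b otherwise.
for a in range(-20, 21):
    r = a % 5
    assert smod(a, 5) == (r if r < 5 / 2 else r - 5)

# For any a_n > 0 and m = a_n + 1: a_n smod m = -1.
assert all(smod(an, an + 1) == -1 for an in range(1, 100))

# Example 5.12: m = 3, so constraint (5.70) has coefficients
# smod(-3, 3) = 0 and smod(2, 3) = -1, i.e., -x2 = 3*sigma.
print(smod(-3, 3), smod(2, 3))  # 0 -1
```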
If there is no such integer, then there is no integer solution to the original system either, and the algorithm concludes that the system is unsatisfiable.

Fig. 5.4. Overview of the Omega test

Assume that the variable to be eliminated is denoted by z. As in the case of the Fourier–Motzkin procedure, all pairs of lower and upper bounds have to be considered. Variables that are not bounded from both sides can be removed, together with all constraints that contain them. Let β ≤ bz and cz ≤ γ be constraints, where c and b are positive integer constants and γ and β denote the remaining linear expressions. Consequently, β/b is a lower bound on z, and γ/c is an upper bound on z. The new constraint is obtained by multiplying the lower bound by c and the upper bound by b:

  Lower bound: β ≤ bz   becomes   cβ ≤ cbz
  Upper bound: cz ≤ γ   becomes   cbz ≤ bγ   (5.73)

The existence of such a variable z implies

  cβ ≤ bγ .   (5.74)

Example 5.13. Consider the following set of constraints:

  2y ≤ x
  8y ≥ 2 + x
  2y ≤ 3 − x .   (5.75)

The triangle spanned by these constraints is depicted in Fig. 5.5. Assume that we decide to eliminate x. In this case, the combination of the two constraints 2y ≤ x and 8y ≥ 2 + x results in 8y − 2 ≥ 2y, which simplifies to y ≥ 1/3. The two constraints 2y ≤ x and 2y ≤ 3 − x combine into 2y ≤ 3 − 2y, which simplifies to y ≤ 3/4. Thus, 1/3 ≤ y ≤ 3/4 must hold, which has no integer solution. The set of constraints is therefore unsatisfiable.

Fig. 5.5. Computing the real shadow: eliminating x

The converse of this observation does not hold, i.e., if we find an integer solution within the real shadow, this does not guarantee that the original set of constraints has an integer solution.
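The pairwise combination of (5.73)–(5.74) can be sketched as follows. This Python fragment is our own illustration (the dictionary representation of linear terms is an assumption, not from the text); it reproduces the elimination of x in Example 5.13.

```python
def real_shadow(lowers, uppers):
    """Combine each lower bound (beta, b), meaning beta <= b*z, with each
    upper bound (c, gamma), meaning c*z <= gamma, into c*beta <= b*gamma
    as in (5.74). The linear terms beta and gamma are dicts mapping a
    variable name to its coefficient; the key '1' holds the constant."""
    out = []
    for beta, b in lowers:
        for c, gamma in uppers:
            # Record c*beta <= b*gamma as c*beta - b*gamma <= 0.
            diff = {v: c * beta.get(v, 0) - b * gamma.get(v, 0)
                    for v in set(beta) | set(gamma)}
            out.append(diff)
    return out

# Example 5.13, eliminating x: 2y <= x is a lower bound on x;
# x <= 8y - 2 and x <= 3 - 2y are upper bounds.
projected = real_shadow([({'y': 2}, 1)],
                        [(1, {'y': 8, '1': -2}), (1, {'y': -2, '1': 3})])
# projected: -6y + 2 <= 0 and 4y - 3 <= 0, i.e., 1/3 <= y <= 3/4
print(projected)
```

Since 1/3 ≤ y ≤ 3/4 contains no integer, the recursive call on this projection returns "Unsatisfiable", as in the example.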
This is illustrated by the following example.

Fig. 5.6. Computing the real shadow: eliminating y

Example 5.14. Consider the same set of constraints as in Example 5.13. This time, eliminate y instead of x. This projection is depicted in Fig. 5.6. We obtain 2/3 ≤ x ≤ 2, which has two integer solutions. The triangle, on the other hand, contains no integer solution.

The real shadow is an overapproximating projection, as it contains more solutions than does the original problem. The next step in the Omega test is to compute an underapproximating projection, i.e., if that projection contains an integer solution, so does the original problem. This projection is called the dark shadow.

Checking the Dark Shadow

The name dark shadow is motivated by optics. Assume that the object we are projecting is partially translucent. Places that are "thicker" will project a darker shadow. In particular, a dark area in the shadow where the object is thicker than 1 must have at least one integer above it.

After the first phase of the algorithm, we know that there is a solution to the real shadow, i.e., cβ ≤ bγ. We now aim at determining if there is an integer z such that cβ ≤ cbz ≤ bγ, which is equivalent to

  ∃z ∈ Z. β/b ≤ z ≤ γ/c .   (5.76)

Assume that (5.76) does not hold. Let i denote ⌊β/b⌋, i.e., the largest integer that is smaller than β/b. Since we have assumed that there is no integer between β/b and γ/c,

  i < β/b ≤ γ/c < i + 1 .   (5.77)

  Σ_{j} aj xj ≤ b   (5.93)

  Σ_{j|aj>0} aj uj + Σ_{j|aj<0} aj lj ≤ b .   (5.94)

To put this in words, a "≤" constraint in the above form is redundant if assigning values equal to their upper bounds (uj) to all of its variables that have a positive coefficient, and assigning values equal to their lower bounds (lj) to all of its variables that have a negative coefficient, results in a value less than or equal to b, the constant on the right-hand side of the inequality.
2.
Consider the following set of constraints:

  2x1 + x2 ≤ 2, x2 ≥ 4, x1 ≤ 3 .   (5.95)

From the first and second constraints, x1 ≤ −1 can be derived, which means that the bound x1 ≤ 3 can be tightened. In general, if a0 > 0, then

  x0 ≤ (b − Σ_{j>0, aj>0} aj lj − Σ_{j>0, aj<0} aj uj) / a0 ,   (5.96)

and if a0 < 0, then

  x0 ≥ (b − Σ_{j>0, aj>0} aj lj − Σ_{j>0, aj<0} aj uj) / a0 .   (5.97)

5.6.2 Preprocessing of Integer Linear Systems

The following preprocessing steps are applicable to integer linear systems:

1. Multiply every constraint by the least common multiple of the denominators of the coefficients and the constant in this constraint, in order to obtain a system with integer coefficients.⁵
2. After the previous preprocessing has been applied, strict inequalities can be transformed into nonstrict inequalities as follows:

     Σ_{1≤i≤n} ai xi < b   (5.98)

   is replaced with

     Σ_{1≤i≤n} ai xi ≤ b − 1 .   (5.99)

   The case in which b is fractional is handled by the previous preprocessing step.

⁵ This assumes that the coefficients and constants in the system are rational. The case in which the coefficients can be nonrational is of little value and is rarely considered in the literature.

For the special case of 0–1 linear systems (integer linear systems in which all the variables are constrained to be either 0 or 1), some preprocessing steps are illustrated by the following examples:

1. Consider the constraint

     5x1 − 3x2 ≤ 4 ,   (5.100)

   from which we can conclude that

     x1 = 1 =⇒ x2 = 1 .   (5.101)

   Hence, the constraint

     x1 ≤ x2   (5.102)

   can be added.
2. From

     x1 + x2 ≤ 1, x2 ≥ 1 ,   (5.103)

   we can conclude x1 = 0.

Generalization of these examples is left for Problem 5.8.

5.7 Difference Logic

5.7.1 Introduction

A popular fragment of linear arithmetic is called difference logic.

Definition 5.15 (difference logic).
The syntax of a formula in difference logic is defined by the following rules:

  formula : formula ∧ formula | atom
  atom : identifier − identifier op constant
  op : ≤ | <

Here, we consider the case in which the variables are defined over Q, the rationals. A similar definition exists for the case in which the variables are defined over Z (see Problem 5.18). Solving both variants is polynomial, whereas, recall, linear arithmetic over Z is NP-complete.

Some other convenient operators can be modeled with the grammar above:

• x − y = c is the same as x − y ≤ c ∧ y − x ≤ −c.
• x − y ≥ c is the same as y − x ≤ −c.
• x − y > c is the same as y − x < −c.
• A constraint with one variable such as x < 5 can be rewritten as x − x0 < 5, where x0 is a special variable not used so far in the formula, called the "zero variable". In any satisfying assignment, its value must be 0.

As an example, x