llvm/mlir/include/mlir/Analysis/Presburger/Simplex.h

//===- Simplex.h - MLIR Simplex Class ---------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// Functionality to perform analysis on an IntegerRelation. In particular,
// support for performing emptiness checks, redundancy checks and obtaining the
// lexicographically minimum rational element in a set.
//
//===----------------------------------------------------------------------===//

#ifndef MLIR_ANALYSIS_PRESBURGER_SIMPLEX_H
#define MLIR_ANALYSIS_PRESBURGER_SIMPLEX_H

#include "mlir/Analysis/Presburger/Fraction.h"
#include "mlir/Analysis/Presburger/IntegerRelation.h"
#include "mlir/Analysis/Presburger/Matrix.h"
#include "mlir/Analysis/Presburger/PWMAFunction.h"
#include "mlir/Analysis/Presburger/Utils.h"
#include "llvm/ADT/SmallBitVector.h"
#include <optional>

namespace mlir {
namespace presburger {

class GBRSimplex;

/// The Simplex class implements a version of the Simplex and Generalized Basis
/// Reduction algorithms, which can perform analysis of integer sets with affine
/// inequalities and equalities. A Simplex can be constructed
/// by specifying the dimensionality of the set. It supports adding affine
/// inequalities and equalities, and can perform emptiness checks, i.e., it can
/// find a solution to the set of constraints if one exists, or say that the
/// set is empty if no solution exists. Furthermore, it can find a subset of
/// these constraints that are redundant, i.e. a subset of constraints that
/// doesn't constrain the affine set further after adding the non-redundant
/// constraints. The LexSimplex class provides support for computing the
/// lexicographic minimum of an IntegerRelation. The SymbolicLexOpt class
/// provides support for computing symbolic lexicographic minimums. All of these
/// classes can be constructed from an IntegerRelation, and all inherit common
/// functionality from SimplexBase.
///
/// The implementations of the Simplex and SimplexBase classes, other than the
/// functionality for obtaining an integer sample, are based on the paper
/// "Simplify: A Theorem Prover for Program Checking"
/// by D. Detlefs, G. Nelson, J. B. Saxe.
///
/// We define variables, constraints, and unknowns. Consider the example of a
/// two-dimensional set defined by 1 + 2x + 3y >= 0 and 2x - 3y >= 0. Here,
/// x, y, are variables while 1 + 2x + 3y >= 0, 2x - 3y >= 0 are constraints.
/// Unknowns are either variables or constraints, i.e., x, y, 1 + 2x + 3y >= 0,
/// 2x - 3y >= 0 are all unknowns.
///
/// The implementation involves a matrix called a tableau, which can be thought
/// of as a 2D matrix of rational numbers having number of rows equal to the
/// number of constraints and number of columns equal to one plus the number of
/// variables. In our implementation, instead of storing rational numbers, we
/// store a common denominator for each row, so it is in fact a matrix of
/// integers with number of rows equal to number of constraints and number of
/// columns equal to _two_ plus the number of variables. For example, instead of
/// storing a row of three rationals [1/2, 2/3, 3], we would store [6, 3, 4, 18]
/// since 3/6 = 1/2, 4/6 = 2/3, and 18/6 = 3.
///
/// Every row and column except the first and second columns is associated with
/// an unknown and every unknown is associated with a row or column. An unknown
/// associated with a row or column is said to be in row or column orientation
/// respectively. As described above, the first column is the common
/// denominator. The second column represents the constant term, explained in
/// more detail below. These two are _fixed columns_; they always retain their
/// position as the first and second columns. Additionally, LexSimplexBase
/// stores a so-call big M parameter (explained below) in the third column, so
/// LexSimplexBase has three fixed columns. Finally, SymbolicLexSimplex has
/// `nSymbol` variables designated as symbols. These occupy the next `nSymbol`
/// columns, viz. the columns [3, 3 + nSymbol). For more information on symbols,
/// see LexSimplexBase and SymbolicLexSimplex.
///
/// LexSimplexBase does not directly support variables which can be negative, so
/// we introduce the so-called big M parameter, an artificial variable that is
/// considered to have an arbitrarily large value. We then transform the
/// variables, say x, y, z, ... to M, M + x, M + y, M + z. Since M has been
/// added to these variables, they are now known to have non-negative values.
/// For more details, see the documentation for LexSimplexBase. The big M
/// parameter is not considered a real unknown and is not stored in the `var`
/// data structure; rather the tableau just has an extra fixed column for it
/// just like the constant term.
///
/// The vectors var and con store information about the variables and
/// constraints respectively, namely, whether they are in row or column
/// position, which row or column they are associated with, and whether they
/// correspond to a variable or a constraint.
///
/// An unknown is addressed by its index. If the index i is non-negative, then
/// the variable var[i] is being addressed. If the index i is negative, then
/// the constraint con[~i] is being addressed. Effectively this maps
/// 0 -> var[0], 1 -> var[1], -1 -> con[0], -2 -> con[1], etc. rowUnknown[r] and
/// colUnknown[c] are the indexes of the unknowns associated with row r and
/// column c, respectively.
///
/// The unknowns in column position are together called the basis. Initially the
/// basis is the set of variables -- in our example above, the initial basis is
/// x, y.
///
/// The unknowns in row position are represented in terms of the basis unknowns.
/// If the basis unknowns are u_1, u_2, ... u_m, and a row in the tableau is
/// d, c, a_1, a_2, ... a_m, this represents the unknown for that row as
/// (c + a_1*u_1 + a_2*u_2 + ... + a_m*u_m)/d. In our running example, if the
/// basis is the initial basis of x, y, then the constraint 1 + 2x + 3y >= 0
/// would be represented by the row [1, 1, 2, 3].
///
/// The association of unknowns to rows and columns can be changed by a process
/// called pivoting, where a row unknown and a column unknown exchange places
/// and the remaining row variables' representation is changed accordingly
/// by eliminating the old column unknown in favour of the new column unknown.
/// If we had pivoted the column for x with the row for 2x - 3y >= 0,
/// the new row for x would be [2, 1, 3] since x = (1*(2x - 3y) + 3*y)/2.
/// See the documentation for the pivot member function for details.
///
/// The association of unknowns to rows and columns is called the _tableau
/// configuration_. The _sample value_ of an unknown in a particular tableau
/// configuration is its value if all the column unknowns were set to zero.
/// Concretely, for unknowns in column position the sample value is zero; when
/// the big M parameter is not used, for unknowns in row position the sample
/// value is the constant term divided by the common denominator. When the big M
/// parameter is used, if d is the denominator, p is the big M coefficient, and
/// c is the constant term, then the sample value is (p*M + c)/d. Since M is
/// considered to be positive infinity, this is positive (negative) infinity
/// when p is positive or negative, and c/d when p is zero.
///
/// The tableau configuration is called _consistent_ if the sample value of all
/// restricted unknowns is non-negative. Initially there are no constraints, and
/// the tableau is consistent. When a new constraint is added, its sample value
/// in the current tableau configuration may be negative. In that case, we try
/// to find a series of pivots to bring us to a consistent tableau
/// configuration, i.e. we try to make the new constraint's sample value
/// non-negative without making that of any other constraints negative. (See
/// findPivot and findPivotRow for details.) If this is not possible, then the
/// set of constraints is mutually contradictory and the tableau is marked
/// _empty_, which means the set of constraints has no solution.
///
/// This SimplexBase class also supports taking snapshots of the current state
/// and rolling back to prior snapshots. This works by maintaining an undo log
/// of operations. Snapshots are just pointers to a particular location in the
/// log, and rolling back to a snapshot is done by reverting each log entry's
/// operation from the end until we reach the snapshot's location. SimplexBase
/// also supports taking a snapshot including the exact set of basis unknowns;
/// if this functionality is used, then on rolling back the exact basis will
/// also be restored. This is used by LexSimplexBase because the lex algorithm,
/// unlike `Simplex`, is sensitive to the exact basis used at a point.
class SimplexBase {};

/// Simplex class using the lexicographic pivot rule. Used for lexicographic
/// optimization. The implementation of this class is based on the paper
/// "Parametric Integer Programming" by Paul Feautrier.
///
/// This does not directly support negative-valued variables, so it uses the big
/// M parameter trick to make all the variables non-negative. Basically we
/// introduce an artifical variable M that is considered to have a value of
/// +infinity and instead of the variables x, y, z, we internally use variables
/// M + x, M + y, M + z, which are now guaranteed to be non-negative. See the
/// documentation for SimplexBase for more details. M is also considered to be
/// an integer that is divisible by everything.
///
/// The whole algorithm is performed with M treated as a symbol;
/// it is just considered to be infinite throughout and it never appears in the
/// final outputs. We will deal with sample values throughout that may in
/// general be some affine expression involving M, like pM + q or aM + b. We can
/// compare these with each other. They have a total order:
///
/// aM + b < pM + q iff  a < p or (a == p and b < q).
/// In particular, aM + b < 0 iff a < 0 or (a == 0 and b < 0).
///
/// When performing symbolic optimization, sample values will be affine
/// expressions in M and the symbols. For example, we could have sample values
/// aM + bS + c and pM + qS + r, where S is a symbol. Now we have
/// aM + bS + c < pM + qS + r iff (a < p) or (a == p and bS + c < qS + r).
/// bS + c < qS + r can be always true, always false, or neither,
/// depending on the set of values S can take. The symbols are always stored
/// in columns [3, 3 + nSymbols). For more details, see the
/// documentation for SymbolicLexSimplex.
///
/// Initially all the constraints to be added are added as rows, with no attempt
/// to keep the tableau consistent. Pivots are only performed when some query
/// is made, such as a call to getRationalLexMin. Care is taken to always
/// maintain a lexicopositive basis transform, explained below.
///
/// Let the variables be x = (x_1, ... x_n).
/// Let the symbols be   s = (s_1, ... s_m). Let the basis unknowns at a
/// particular point be  y = (y_1, ... y_n). We know that x = A*y + T*s + b for
/// some n x n matrix A, n x m matrix s, and n x 1 column vector b. We want
/// every column in A to be lexicopositive, i.e., have at least one non-zero
/// element, with the first such element being positive. This property is
/// preserved throughout the operation of LexSimplexBase. Note that on
/// construction, the basis transform A is the identity matrix and so every
/// column is lexicopositive. Note that for LexSimplexBase, for the tableau to
/// be consistent we must have non-negative sample values not only for the
/// constraints but also for the variables. So if the tableau is consistent then
/// x >= 0 and y >= 0, by which we mean every element in these vectors is
/// non-negative. (note that this is a different concept from lexicopositivity!)
class LexSimplexBase : public SimplexBase {};

/// A class for lexicographic optimization without any symbols. This also
/// provides support for integer-exact redundancy and separateness checks.
class LexSimplex : public LexSimplexBase {};

/// Represents the result of a symbolic lexicographic optimization computation.
struct SymbolicLexOpt {};

/// A class to perform symbolic lexicographic optimization,
/// i.e., to find, for every assignment to the symbols the specified
/// `symbolDomain`, the lexicographically minimum value integer value attained
/// by the non-symbol variables.
///
/// The input is a set parametrized by some symbols, i.e., the constant terms
/// of the constraints in the set are affine expressions in the symbols, and
/// every assignment to the symbols defines a non-symbolic set.
///
/// Accordingly, the sample values of the rows in our tableau will be affine
/// expressions in the symbols, and every assignment to the symbols will define
/// a non-symbolic LexSimplex. We then run the algorithm of
/// LexSimplex::findIntegerLexMin simultaneously for every value of the symbols
/// in the domain.
///
/// Often, the pivot to be performed is the same for all values of the symbols,
/// in which case we just do it. For example, if the symbolic sample of a row is
/// negative for all values in the symbol domain, the row needs to be pivoted
/// irrespective of the precise value of the symbols. To answer queries like
/// "Is this symbolic sample always negative in the symbol domain?", we maintain
/// a `LexSimplex domainSimplex` correponding to the symbol domain.
///
/// In other cases, it may be that the symbolic sample is violated at some
/// values in the symbol domain and not violated at others. In this case,
/// the pivot to be performed does depend on the value of the symbols. We
/// handle this by splitting the symbol domain. We run the algorithm for the
/// case where the row isn't violated, and then come back and run the case
/// where it is.
class SymbolicLexSimplex : public LexSimplexBase {};

/// The Simplex class uses the Normal pivot rule and supports integer emptiness
/// checks as well as detecting redundancies.
///
/// The Simplex class supports redundancy checking via detectRedundant and
/// isMarkedRedundant. A redundant constraint is one which is never violated as
/// long as the other constraints are not violated, i.e., removing a redundant
/// constraint does not change the set of solutions to the constraints. As a
/// heuristic, constraints that have been marked redundant can be ignored for
/// most operations. Therefore, these constraints are kept in rows 0 to
/// nRedundant - 1, where nRedundant is a member variable that tracks the number
/// of constraints that have been marked redundant.
///
/// Finding an integer sample is done with the Generalized Basis Reduction
/// algorithm. See the documentation for findIntegerSample and reduceBasis.
class Simplex : public SimplexBase {};

/// Takes a snapshot of the simplex state on construction and rolls back to the
/// snapshot on destruction.
///
/// Useful for performing operations in a "transient context", all changes from
/// which get rolled back on scope exit.
class SimplexRollbackScopeExit {};

} // namespace presburger
} // namespace mlir

#endif // MLIR_ANALYSIS_PRESBURGER_SIMPLEX_H