Table of Contents
Introduction
EMP (Extended Mathematical Programming) is not a solver but an (experimental) framework for automated mathematical programming reformulations. The idea behind EMP is that new upcoming types of models which currently cannot be solved reliably are reformulated into models of established math programming classes in order to use mature solver technology. At this stage, EMP supports the modeling of Bilevel Programs, Variational Inequalities, Disjunctive Programs, Extended Nonlinear Programs and Embedded Complementarity Systems.
Extended mathematical programs are collections of functions and variables joined together using specific optimization and complementarity primitives. EMP annotates the existing relationships within a model to facilitate higher level structure identification. A specific implementation of this framework is outlined that reformulates the original GAMS model automatically using directives contained in an "empinfo" file into an equivalent model that can be solved using existing GAMS solvers.
The reformulation is done by the solver JAMS which currently is the only solver that is capable of handling EMP models. Examples showing how to use the EMP framework and the solver JAMS are made available through the GAMS EMP Library which is included in the GAMS Distribution. In order to generate a copy of and EMPLIB model, one can use the library facility of the GAMS IDE, or execute the command line directive emplib <modelname>
, where modelname is the (stem of the) file containing the model.
EMP has been developed jointly by Michael Ferris of UW-Madison, Ignacio Grossmann of Carnegie Mellon University, and GAMS Development Corporation. EMP and JAMS come free of charge with any licensed GAMS system but require a subsolver to solve the generated models.
JAMS: a reformulation tool
EMP models are currently processed by the JAMS solver. The solver JAMS creates a scalar version of the given GAMS model. This scalar version of the model is then solved by an appropriate subsolver. By default, there are no reformulations carried out, so the model generated is simply a GAMS scalar form of the model the actual subsolver will process. The subsolver used is by default the currently specified solver for the given model type.
The JAMS Option File
As with any GAMS solver, JAMS has an option file, typically called jams.opt
. A JAMS option subsolver is available to change the subsolver used for the reformulated model, along with an option to utilize a subsolver option file (subsolveropt).
The actual scalar version of the model can also be seen by the modeler using the option filename. For example, the option file
subsolver path subsolveropt 1 filename mcpmod.gms
when applied to an EMP model that is a complementarity problem will create a file called mcpmod.gms
in the current directory and solve that model using the solver PATH utilizing any options for PATH that are specified in path.opt
. The scalarized model is not particularly useful to look at since all of the original variables have been renamed into a scalar form. The mapping between original variables and the ones used in the scalar version of the model is given in a dictionary file that can also be seen by the modeler using the dict option. If a user simply wants to generate this scalar model, then the option terminate will not solve the generated model.
After the scalar version of the model is solved, the solution values are mapped back into the original namespace and returned to the modeler as usual in the listing file. The JAMS option margtol allows the modeler to suppress reporting marginals that have (absolute) values smaller than this tolerance.
Obviously, all of the above functionality is not of much value: the key part of JAMS is to interpret additional directives to take the original model and produce a reformulated scalar model. This is carried out using an "empinfo" file. The syntax and use of this file is the content of the remaining sections of this document.
The option EMPInfoFile allows the user to specify the path and name of a file containing additional EMP information. The syntax of this file is described by examples elsewhere in this document and in section Empinfo file details . There is a subtlety that should be mentioned when a user writes this file in the default location in the GAMS scratch directory of the current run specified in the gams file using:
file empinfo / '%emp.info%' /;
wherein certain additional formatting instructions are given. If instead, the file handle is given by:
file empinfo / 'empinfo.txt' /;
for example, then to produce the same file the following lines must be added to the gams source file:
empinfo.pc = 8; empinfo.pw = 255;
The following tables list all available options.
Reformulation Options
General Options
Option | Description | Default |
---|---|---|
CompModel | Complementarity model (MCP or VI) to generate for equilibrium problems. | mcp |
Dict | Generate a dictionary file that maps original variable and equation names to the ones of the scalar JAMS model. | dict.txt |
DisjBinRelax | Relax requirement that disjunction variables are binary. | 0 |
EMPInfoFile | Path and name of file containing additional EMP information as disjunctions, perpendiculars, bilevel characterization etc. By default, the file is expected to exist in the GAMS scratch directory of the current run ( gams.scrdir ). If this option is used, EMP searches for the file based on the provided relative or absolute path. In case of a given relative path, it is assumed to be relative to the working directory. | |
FileName | Filename of generated scalar reformulated GAMS model. | emp.gms |
ImplVarModel | Reformulation model to generate for implicit variables.Replication : Shared variables are replicated for each agentSubstitution : Shared multipliers are substituted out: assumes implicit functionSwitching : Switch function-variable matching to avoid replication | switching |
NLConsToFunc | Stick nonlinear constraints into functional part when generating a VI model for equilibrium problems | 0 |
SharedEqu | Allow shared equations in equilibrium problems | 0 |
SubSolvePar | User defined GAMS parameters for subsolve | |
SubSolver | Subsolver used to solve the reformulated model. The subsolver chosen has to be suitable for the chosen reformulation type. The user can also provide a solver related option file, see SubSolverOpt. | |
SubSolverOpt | Optfile value to pass to the subsolver Range: { 1 , ..., 999 } | 1 |
Terminate | Generate the GAMS source code of the reformulated model in a file and terminate without solving the model. | 0 |
UserPFFile | Filename of extra GAMS options when running the reformulated GAMS model. | |
ZipDebug | Zip-file name to create if the internal GAMS model fails | none |
Forming Optimality Conditions: NLP2MCP
The first nontrivial use of the JAMS solver is to automatically generate the first order conditions of a linear or nonlinear program; essentially we reformulate the optimization problem as a mixed complementarity problem (MCP). The "empinfo" file to do this simply contains the following line:
modeltype mcp
Behind the scenes, JAMS forms the Lagrangian of the nonlinear program and then forms its Karush-Kuhn-Tucker optimality conditions. To be clear, given the original nonlinear program
\begin{equation} \tag{NLP} \min_x f(x) \text{ s.t. } g(x) \leq 0, \; h(x) = 0, \end{equation}
the Lagrangian is:
\[ \mathcal{L}(x,\lambda,\mu) = f(x) - \langle{\lambda},{g(x)}\rangle - \langle{\mu},{h(x)}\rangle . \]
The first order conditions are the following MCP:
\[ \begin{aligned} 0 & = \nabla_{x} \mathcal{L}(x,\lambda,\mu) & \quad \perp \quad x & \text{ free } \\ 0 & \geq -\nabla_{\lambda} \mathcal{L}(x,\lambda,\mu) & \quad \perp \quad \lambda & \leq 0 \\ 0 & = -\nabla_{\mu} \mathcal{L}(x,\lambda,\mu) & \quad \perp \quad \mu & \text{ free } \end{aligned} \]
A specific example is:
\[ \begin{array}{lll} \displaystyle \min_{x,y,z} & -3x + y \\ \text{s.t.} & x + y \leq 1,\; x + y - z = 2,\; x, y \geq 0 \end{array} \]
which is found in the EMPLIB model nlp2mcp:
The putclose line writes out the default "empinfo" file whose location is provided in the system string %emp.info%
. Armed with this additional information, the EMP tool automatically creates the following MCP:
\[ \begin{aligned} 0 & \leq -3 - \lambda - \mu & \quad \perp \quad x & \geq 0 \\ 0 & \leq 1 - \lambda - \mu & \quad \perp \quad y & \geq 0 \\ 0 & = \mu & \quad \perp \quad z & \text{ free} \\ 0 & \geq x + y - 1 & \quad \perp \quad \lambda & \leq 0 \\ 0 & = x + y - z - 2 & \quad \perp \quad \mu & \text{ free}. \end{aligned} \]
Soft Constraints
In many cases, we wish to relax certain constraints in a model during solution (to help identify feasibility issues for example). As an example, consider the problem
\begin{align*} \min_{x_1,x_2,x_3}\; & \exp(x_1) \\ \text{s.t. } & \log(x_1) = 1, \\ & x_2^2 \leq 2, \\ & x_1/x_2 = \log(x_3), \\ & 3x_1 + x_2 \leq 5, x_1 \geq 0, x_2 \geq 0, \end{align*}
which can be formulated in GAMS as
Reformulation as a classical NLP
Soft constraints allow us to treat certain equations in the model as "soft" by removing the constraints and adding a penalty term to the objective function. Explicitly, we replace the above problem by:
\begin{align*} \min_{x_1,x_2,x_3}\; & \exp(x_1) + 5 \left\Vert{\log(x_1) - 1}\right\Vert^2 + 2\max(x_2^2 - 2, 0) \\ \text{s.t. } & x_1/x_2 = \log(x_3), \\ & 3x_1 + x_2 \leq 5, x_1 \geq 0, x_2 \geq 0 . \end{align*}
In this problem, we still force \(x_1/x_2 = \log(x_3)\), but apply a least squares penalty to \(\log(x_1) - 1\) and a smaller one-sided penalization to \(x_2^2 - 2\).
The above formulation is nonsmooth due to the "max" term in the objective function; in practice we would replace this by:
\begin{align*} \min_{x_1,x_2,x_3,w}\; & \exp(x_1) + 5 \left( \log(x_1) - 1 \right)^2 + 2w \\ \text{s.t. } & x_1/x_2 = \log(x_3), \\ & 3x_1 + x_2 \leq 5, x_1 \geq 0, x_2 \geq 0 \\ & w \geq x_2^2 - 2, w \geq 0 \end{align*}
and recover a standard form NLP.
The "empinfo" file:
modeltype NLP adjustequ f1 sqr 5 f2 maxz 2
coupled with replacing the last line with
solve enlpemp using emp min obj;
achieves this goal. The parameter values provide the penalty coefficients above.
Reformulation as an MCP
As an alternative, we can rewrite the problem as an MCP, also dealing explicitly with the nonsmoothness. The "empinfo" file is given by:
modeltype NLP adjustequ f1 sqr 5 f2 maxz 2
and this generates the following MCP:
\[ \begin{aligned} 0 & = \log(x_1) - 1 + y_1/10 & \quad \perp \quad y_1 & \text{ free,} \\ 0 & \leq x_2^2 - 2 & \quad \perp \quad y_2 & \geq 0, \\ 0 & = x_1/x_2 - \log(x_3) & \quad \perp \quad y_3 & \text{ free,} \\ 0 & \geq 3x_1 + x_2 - 5 & \quad \perp \quad y_4 & \leq 0, \\ 0 & \leq exp(x_1) - y_1/x_1 - y_3/x_2 - 3y_4 & \quad \perp \quad x_1 & \geq 0, \\ 0 & \leq - 2y_2 x_2 + x_1y_3/x_2^2 - y_4 & \quad \perp \quad x_2 & \geq 0, \\ 0 & = y_3/x_3 & \quad \perp \quad x_3 & \text{ free,} \end{aligned} \]
where y
represent the multipliers.
A complete description of the process to derive this MCP will be given later in section Extended Nonlinear Programs .
Bilevel Programs
Mathematical programs with optimization problems in their constraints have a long history in operations research including [14, 28, 70] . New codes are being developed that exploit this structure, at least for simple hierarchies, and attempt to define and implement algorithms for their solution.
The simplest case is that of bilevel programming, where an upper level problem depends on the solution of a lower level optimization. For example:
\[ \begin{aligned} \min_{x,y}\; & f(x,y) \\ \text{s.t. } & g(x,y) \leq 0, \\ & y \text{ solves } \begin{aligned}[t] \min_y\; & v(x,y) \text{ s.t. } & h(x,y) \geq 0 . \end{aligned} \end{aligned} \]
Often, the upper level is referred to as the "leader", while the lower level is the "follower".
This problem can be reformulated as a Mathematical Program with Complementarity Constraints (MPCC) by replacing the lower level optimization problem by its first order optimality conditions:
\[ \begin{aligned} \min_{x,y}\; & f(x,y) \\ \text{s.t. } & g(x,y) \leq 0, \\ & 0 = \nabla_y v(x,y) - \lambda^T \nabla_y h(x,y) \perp x \text{ free} \\ & 0 \leq h(x,y) \perp \lambda \geq 0 . \end{aligned} \]
We find a solution of the MPCC, not of the bilevel program. This approach allows the MPCC to be solved using the NLPEC code, for example. Note that this reformulation is potentially problematic. First order conditions require theoretical assumptions to be necessary and sufficient for local optimality. There may be cases where the lower level problem has multiple local solutions, but the modeler really was interested in the global solution. The approach here may not produce this solution, even if a global solver is used within NLPEC.
The following example is Example 5.1.1, page 197 from [14] . Mathematically, the problem is
\[ \begin{aligned} \min_{x,y}\; & x - 4y \\ \text{s.t. } & y \text{ solves } \begin{aligned}[t] \min_y\; & y & \\ \text{ s.t. } & x+y \geq 3 \\ & 2x - y \geq 0\\ & -2x - y \geq -12\\ & -3x + 2y \geq -4 \end{aligned} \end{aligned} \]
and the EMPLIB model bard511 contains the following code:
Note that first the functions that form the objectives and constraints of the model are defined and the model is assembled. The $echo
line writes the "empinfo" file and states that the lower level problem involves the objective objin
which is to be minimized by choice of variables y
subject to the constraints specified in (defin
), e1
, e2
, e3
, and e4
.
Note that the variables \(x\) are declared to be variables of the upper level problem and this example has no upper level constraints \(g\). Having written the problem in this way, the MPCC is generated automatically, and passed on to a solver. In the case where that solver is NLPEC, a further reformulation of the model is carried out to convert the MPCC into an equivalent NLP or a parametric sequence of NLP's.
Further examples of bilevel models in EMPLIB are named: bard*
, ccmg74
, ccmg153
, flds*
, jointc1
, jointc2
, mirrlees
, transbp
.
The EMP model type allows multiple lower level problems to be specified within the bilevel format. An example of this is given in EMPLIB as ccmg71. The equations and objectives are specified in the normal manner; the only change is the definition of the "empinfo" file, shown below:
This corresponds to a bilevel program with two followers, both solving minimization problems. The first follower minimizes the objective function h1
(defined in defh1
) over the variables u1
, u2
, u3
, and u4
subject to the constraint given in e1
. The second followers problem is defined analogously next. Note that h1
involves the variables x1
, x2
, x3
, and x4
that are optimization variables of the leader. The constraint in e1
could also include these variables, and also the variables v1
, v2
, v3
, or v4
of the second follower, but all of these would be treated as parameters by the first follower.
The actual model (ccmg71) in EMPLIB uses a shortcut notation:
In the followers problem, the '*' notation indicates that this agent will optimize over all the variables used in defh1
and e1
that are not under the control of any other follower or the leader. In this case, this means u1
, u2
, u3
, and u4
. To avoid confusion, it is recommended that the modeler explicity names all the variables in each followers problem as shown before.
Variational Inequalities
A variational inequality \(VI(F,X)\) is to find \(x \in X\):
\[ \langle{F(x)},{(z - x)}\rangle \geq 0, \text{ for all } z \in X . \]
Here \(X\) is a closed (frequently assumed convex) set, defined for example as
\[ X = \left\{ x \;|\; x \geq 0, h(x) \geq 0 \right\}. \]
Note that the first-order (minimum principle) conditions of a nonlinear program
\[ \min_{z \in X} f(z) \]
are precisely of this form with \(F(x) = \nabla f(x)\).
It is well known that such problems can be reformulated as complementarity problems when the set \(X\) has the representation \(\left\{ x \;|\; x \geq 0, h(x) \geq 0 \right\}\) by introducing multipliers λ on the constraints \(h(x)\geq0\):
\[ \begin{aligned} 0 & \leq F(x) - \langle{\lambda},{\nabla h(x)}\rangle & \quad \perp \quad x & \geq 0 \\ 0 & \leq h(x) & \quad \perp \quad \lambda & \geq 0 . \end{aligned} \]
If \(X\) has a different representation, this construction would be modified appropriately.
A simple two dimensional example may be useful to improve understanding. Let
\[ F(x) = \begin{bmatrix} x_1 + 2 \\ x_1 + x_2 - 3 \end{bmatrix},\; X = \left\{x \geq 0\right\}{x_1 + x_2 \leq 1}, \]
so that \(F\) is an affine function, but \(F\) is not the gradient of any function \( f: \mathbf{R}^2 \to \mathbf{R} \). For this particular data, \(VI(F,X)\) has a unique solution \(x = (0,1)\).
Note that the first lines of this file define the \(F\) and \(h\) using standard GAMS syntax and include the defining equations in the model simpleVI
. The extension is the annotation "empinfo" file that indicates certain equations are to be treated differently by the EMP tool. The annotation simply says that the model is a VI (vi
) that pairs F
with x
and that the remaining (unpaired) equations form the constraint set \(X\). (Alternative notation allows the keyword vi
to be replaced by vifunc
.) Thus, model equations F
define a function \(F\) that is to be part of a variational inequality, while the equations h
define constraints of \(X\). It is also acceptable in this setting to use the "empinfo" file defined by:
putclose fx 'vi F x';
In this case, by default, any equations that are given in the model statement but not included as a pair in the vi
statement are automatically used to form \(X\). An alternative way to write this model without using sameas
is given in EMPLIB as affinevi.
Some subtleties related to VI's are demonstrated in the EMPLIB model zerofunc. In this model, the set \(X\) is defined using variables \(y\) and \(z\), for which \(z\) does not appear in the definition of \(F \equiv F_y\). In this case, the variable \(z\) is then matched with a "0" function. The "empinfo" file can be written in a number of different ways:
putclose fx 'vi F_y y';
or
putclose fx 'vi z F_y y';
or
putclose fx 'vi z F_y y gCons';
or
putclose fx 'vi F_z z F_y y gCons';
where F_z
simply defines the zero function. To some extent, our preferred notation is the one listed third: it explicitly includes all the variables and constraints that are present in the model and does not require the modeler to define F_z
at all.
Further example models in EMPLIB are simplevi, simplevi2, simplevi3, target, traffic, traffic2, and transvi.
Note also that the lower level problems of a bilevel program could be VI's instead of optimization problems - these problems are called Mathematical Programs with Equilibrium Constraints (MPEC) in the literature. Note that since MPCC is a special case of MPEC, the GAMS model type MPEC covers both. An example demonstrating this setup is given in EMPLIB as multmpec. The actual model to solve is:
\begin{align*} \min_{u,v,w,z} & z \\ \text{s.t. } & \exp(z) + w = 2, z \geq 1 \\ & (u,v) \text{ solves } \text{VI}( [v+w+z-1; u-\log(v)], \left\{ (u,v) \;|\; u \geq 0, v \geq 0 \right\} ) \\ & w \text{ solves } \text{VI}( w+z+3, \left\{ w \;|\; w \text{ free} \right\} ) \end{align*}
Note that the two VI's (due to the definitional sets) correspond respectively to a complementarity problem:
\[ \begin{aligned} 0 & \leq v + w + z - 1 & \quad \perp \quad u & \geq 0 \\ 0 & \leq u - \log(v) & \quad \perp \quad v & \geq 0 \end{aligned} \]
and a linear equation:
\[ w + z + 3 = 0 \]
The actual GAMS code is as follows:
The initial level value for v
(v.l = 1
) is needed to protect the evaluation of \(\log(v)\). The two complementarity problems are specified in the empinfo file (it is not really necessary to split these VI's apart, but it may give information to a solver that can be exploited). It is of course possible to write the MPCC directly in GAMS using the model type MPEC instead of EMP.
Embedded Complementarity Systems
A different type of embedded optimization model that arises frequently in applications is:
\[ \begin{array}{lll} \displaystyle \max_{x} & f(x,y) & \\ \text{s.t.} & g(x,y) \leq 0 & (\perp p \geq 0) \\[1em] & H(x,y,p) = 0 & (\perp y \text{ free}) \end{array} \]
Note the difference here: the optimization problem is over the variable \(x\), and is parameterized by the variable \(y\). The choice of \(y\) is fixed by the (auxiliary) complementarity relationships depicted here by \(H\). Note that the \(H\) equations are not part of the optimization problem, but are essentially auxiliary constraints to tie down remaining variables in the model.
A specific example is:
\[ \begin{array}{lll} \displaystyle \max_{x} & x & \\ \text{s.t.} & x + y \leq 1 & \\[1em] & -3x + y = 0.5 & (\perp y \text{ free}) \end{array} \]
which is found in the EMPLIB model simpequil2:
In order that this model can be processed correctly as an EMP, the modeler provides additional annotations to the model defining equations in an "empinfo" file. Specifically, first it is indicated that the problem is an equilibrium problem involving one or more agent problems. Next, the first agent is defined as an optimizer (over x
). Finally, the second agent is defined as solving a VI in y
. Armed with this additional information, the EMP tool automatically creates the following MCP:
\[ \begin{aligned} 0 & \leq -1 + p & \quad \perp \quad x & \geq 0 \\ 0 & \leq 1 - x - y & \quad \perp \quad p & \geq 0 \\ 0 & = -3x + y - 0.5 & \quad \perp \quad y & \text{ free,} \end{aligned} \]
(which is formed by the steps we outline below). EMP explicitly enforces the rule that every variable and constraint is under the control of exactly one agent. Thus a constraint or a variable cannot appear in both the max problem and the VI problem.
The above example is slightly simpler than the general form described above in which \(H\) is a function of \(x\), \(y\), and \(p\), the multiplier on the constraint of the optimization problem. The problem is that we do not have that variable around in the model code if we only specify the optimization problem there. This occurs for example in the classical PIES Model due to Hogan. In this setting, the problem is described by a linear program
\[ \begin{array}{ll} \min_x & c^Tx \\ \text{s.t.} & Ax = q(p) \\ & Bx = b \\ & x \geq 0 \end{array} \]
in which the quantity \(q\) is a function of \(p\), which is a multiplier on one of the LP constraints. To do this in EMP, we simply add the annotation:
where dembal
is the name of the constraint for which p
needs to be the multiplier. The full model is found in the EMPLIB model pies. Two final points: the dualvar
directive identifies the variable p
with the multiplier on the dembal
constraint, and all variables and constraints must be owned by a single agent. In this case, since there is only one agent (the minimizer), all constraints of the model are explicitly claimed, along with all variables except for p
. However, next, p
is identified with the dembal
constraint, which is owned by the min agent, and hence p
is also owned by that agent.
There are several shorthands possible here. The first is that the explicit statement of variables can be replaced by the '*' form:
Alternatively, an even shorter version is possible since there is only one agent present in this model, namely:
Note that in this form, all the variables and constraints of the original model are included in the (single) agents problem, and the original variable p
is identified in the constructed MCP with the multiplier on the dembal
constraint.
In the general case where the "empinfo" file contains all three lines:
min x optcons vi vicons y dualvar p optcons
namely that the function \(H\) that is defined in vicons
is complementary to the variable y
(and hence the variable y
is a parameter to the optimization problem), and furthermore that the dual variable associated with the equation optcons in the optimization problem is one and the same as the variable p
used to define \(H\), the EMP tool automatically creates the following MCP:
\[ \begin{aligned} 0 & = \nabla_{x} \mathcal{L}(x,y,p) & \quad \perp \quad x & \text{ free } \\ 0 & \geq -\nabla_{p} \mathcal{L}(x,y,p) & \quad \perp \quad p & \leq 0 \\ 0 & = H(x,y,p) & \quad \perp \quad y & \text{ free,} \end{aligned} \]
where the Lagrangian is defined as
\[ \mathcal{L}(x,y,p) = f(x,y) - \langle{p},{g(x,y)}\rangle . \]
Essentially, this MCP consists of the first order optimality conditions of the optimization problem, coupled with the VI that is the second agents problem. An example that does both of these things together is provided in EMPLIB as scarfemp-primal.
Note that since the PIES model has no \(y\) variables, this is a special case of the general form in which the second agents (VI) problem is simply not present.
Example models are ferris43, flipper pies, scarfemp-dual, simpequil, transecs, and transeql.
MOPECs
MOPECs (Multiple Optimization Problems with Equilibrium Constraints) are a simple conceptual extension of the aforementioned embedded complementarity system. Instead of having a single optimizing agent and a variational inequality, one instead allows multiple optimizing agents coupled with an equilibrium constraint (the variational inequality).
Perhaps the most popular use of this formulation is where competition is allowed between agents. A standard method to deal with such cases is via the concept of Nash Games. In this setting \(x^*\) is a Nash Equilibrium if
\[ x_i^* \in \arg \min_{x_i \in X_i} \ell_i(x_i, x_{-i}^*,q), \forall i\in \mathcal{I}, \]
where \(x_{-i}\) are other players decisions and the quantities \(q\) are given exogenously, or via complementarity:
\[ 0 \leq H(x,q) \quad \perp \quad q \geq 0 .\]
This mechanism is extremely popular in economics, and Nash famously won the Nobel Prize for his contributions to this literature.
This format is again an EMP, more general than the example given above in two respects. Firstly, there is more than one optimization problem specified in the embedded complementarity system. Secondly, the parameters in each optimization problem consist of two types. Firstly, there are the variables \(q\) that are tied down by the auxiliary complementarity condition and hence are treated as parameters by the \(i\)th Nash player. Also there are the variables \(x_{-i}\) that are treated as parameters by the \(i\)th Nash player, but are treated as variables by a different player \(j\).
While we do not specify the syntax here for these issues, EMPLIB provides examples that outline how to carry out this matching within GAMS. Finally, two points of note: first it is clear that the resulting model is a complementarity problem and can be solved using PATH, for example. Secondly, performing the conversion from an embedded complementarity system or a Nash Game automatically is a critical step in making such models practically useful.
We note that there is a large literature on discrete-time finite-state stochastic games: this has become a central tool in analysis of strategic interactions among forward-looking players in dynamic environments. The model of dynamic competition in an oligopolistic industry given in [54] is exactly in the format described above, and has been used extensively in applications such as advertising, collusion, mergers, technology adoption, international trade and finance. Ongoing work aims to use the EMP format to model these problems.
Extended Nonlinear Programs
Optimization models have traditionally been of the form (NLP). Specialized codes have allowed certain problem structures to be exploited algorithmically, for example simple bounds on variables. However, for the most part, assumptions of smoothness of \(f\), \(g\) and \(h\) are required for many solvers to process these problems effectively. In a series of papers, Rockafellar and colleagues [151, 152, 153] have introduced the notion of extended nonlinear programming, where the (primal) problem has the form:
\begin{equation} \tag{ENLP} \min_{x \in X} f (x) + \theta(g_1(x),\ldots,g_m(x)). \end{equation}
In this setting, \(X\) is assumed to be a nonempty polyhedral set, and the functions \(f,g_1,\dotsc,g_m\) are smooth. The function \(\theta\) can be thought of as a generalized penalty function that may well be nonsmooth. However, when \(\theta\) has the form
\begin{equation} \tag{$\theta$ conj} \theta(u) = \sup_{y \in Y} \{ \langle y,u\rangle - k(y) \}, \end{equation}
a computationally exploitable and theoretically powerful framework can be developed based on conjugate duality. A key point for computation and modeling is that the function \(\theta\) can be fully described by defining the set \(Y\) and the function \(k\). Furthermore, from a modeling perspective, an extended nonlinear program can be specified simply by defining the functions \(f,g_1,\dotsc,g_m\) in the manner already provided by the modeling system, with the additional issue of simply defining \(Y\) and \(k\). Conceptually, this is not much harder than what is carried out already, but leads to significant enhancements to the types of models that are available. Once a modeler determines which constraints are treated via which choice of \(k\) and \(Y\), the EMP model interface automatically forms an equivalent variational inequality or complementarity problem. As we show later, there may be alternative formulations that are computationally more appealing; such reformulations can be generated using different options to JAMS.
Forms of theta
The EMP model type makes the problem format (ENLP) available to users in GAMS. As special cases, we can model piecewise linear penalties, least squares, and \(L_1\) approximation problems, as well as the notion of soft and hard constraints.
For ease of exposition, we now describe a subset of the types of functions \(\theta\) that can be generated by particular choices of \(Y\) and \(k\). In many cases, the function \(\theta\) is separable, that is
\[ \theta(u) = \sum_{i=1}^m \theta_i(u_i), \]
so we can either specify \(\theta_i\) or \(\theta\) itself.
Extended nonlinear programs include the classical nonlinear programming form (NLP) as a special case. This follows from the observation that if \(K\) is a closed convex cone, and we let \(\psi_K\) denote the "indicator function" of \(K\) defined by:
\[ \psi_K(u) = \begin{cases} 0 & \text{if $u \in K$}, \\ \infty & \text{else,} \end{cases} \]
then (NLP) can be rewritten as:
\[ \min_x f(x) + \psi_K((g(x),h(x)),\qquad K = \mathbf{R}^m_- \times \{ 0 \}^p , \]
where \(m\) and \(p\) are the dimensions of \(g\) and \(h\) respectively and \(\mathbf{R}^m_- = \left\{ u\in \mathbf{R}^m \;|\; u \leq 0 \right\}\). An elementary calculation shows that
\[ \psi_K (u) = \sup_{v \in K^\circ} \langle{u},{v}\rangle , \]
where \({K}^\circ = \left\{ u \;|\; \langle{u},{v}\rangle \leq 0, \forall v \in K \right\}\) is the polar cone of the given cone \(K\). Thus, when \(\theta(u) = \psi_K(u)\) we simply take \(k \equiv 0\) and \(Y = K^\circ\). In our example, \({K}^\circ = \mathbf{R}^m_+ \times \mathbf{R}^p\). To some extent, this is just a formalism that allows us to claim the classical case as a specialization; however when we take the cone \(K\) to be more general than the polyhedral cone used above, we can generate conic programs for example.
The second example involves a piecewise linear function \(\theta\): Formally, for \(u \in \mathbf{R}\),
\[ \theta(u) = \left\{ \begin{array}{ll} \rho u & \text{if $u \geq 0$}, \\ \sigma u & \text{else.} \end{array} \right. \]
In this case, simple calculations prove that \(\theta\) has the form (θ conj) for the choices
\[ k \equiv 0 \text{ and } Y = [\sigma,\rho]. \]
The special case where \(\sigma = -\rho\) results in
\[ \theta(u) = \rho \left| u \right|. \]
This allows us to model nonsmooth \(L_1\) approximation problems. Another special case results from the choice of \(\sigma = -\gamma\), \(\rho = 0\), whereby
\[ \theta(u) = \gamma \max \{-u,0 \} . \]
This formulation corresponds to a soft penalization on an inequality constraint, namely if \(\theta(-g_1(x))\) is used then nothing is added to the objective function if \(g_1(x) \leq 0\), but \(\gamma g_1(x)\) is added if the constraint \(g_1(x) \leq 0\) is violated. Contrast this to the classical setting above, where \(\infty\) is added to the objective if the inequality constraint is violated. It is interesting to see that truncating the set \(Y\), which amounts to bounding the multipliers, results in replacing the classical constraint by a linearized penalty.
The third example involves a more interesting choice of \(k\). If we wish to replace the "absolute value" penalization given above by a quadratic penalization (as in classical least squares analysis), that is
\[ \theta(u) = \gamma u^2 \]
then a simple calculation shows that we should take
\[ k(y) = \frac{1}{4\gamma} y^2 \text{ and }Y = \mathbf{R} . \]
By simply specifying this different choice of \(k\) and \(Y\) we can generate such models easily and quickly within the modeling system. Note, however, that the reformulation we would use in \(\theta(u) = \rho \left| u \right|\) and \(\theta(u) = \gamma u^2\) are very different as we shall explain in the simple example below. Furthermore, in many applications it has become popular to penalize violations using a quadratic penalty only within a certain interval, afterwards switching to a linear penalty (chosen to make the penalty function \(\theta\) continuously differentiable - see [98] ). That is:
\[ \theta(u) = \left\{ \begin{array}{ll} \gamma u -\frac{1}{2} \gamma^2 & \mbox{if $u \geq \gamma$} \\ \frac{1}{2} u^2 & \mbox{if $u \in [-\gamma,\gamma]$} \\ -\gamma u -\frac{1}{2} \gamma^2 & \mbox{else.} \end{array} \right. \]
Such functions arise from quadratic \(k\) and simple bound sets \(Y\). In particular, the somewhat more general function
\[ \theta(u) = \left\{ \begin{array}{ll} \gamma \beta^2 + \rho(u -\beta) & \text{if $u \geq \beta$} \\ \gamma u^2 & \text{if $u \in [\alpha,\beta]$} \\ \gamma \alpha^2 + \sigma(u -\alpha) & \text{else} \end{array} \right. \]
arises from the choice of
\[ k(y) = \frac{1}{4\gamma} y^2 \text{ and } Y = [\sigma,\rho] , \]
with \(\alpha = \frac{\sigma}{2\gamma}\) and \(\beta = \frac{\rho}{2\gamma}\).
The final example that we give is that of \(L_{\infty}\) penalization. This example is different to the examples given above in that \(\theta\) is not separable. However, straightforward calculation can be used to show
\[ \theta(u) = \max_{i=1,\ldots,m} u_i \]
results from the choice of
\[ k \equiv 0 \text{ and } Y = \left\{ y \in \mathbf{R}^m \;|\; y \geq 0, \sum_{i=1}^m y_i = 1 \right\}, \]
that is, \(Y\) is the unit simplex.
Underlying theory
The underlying structure of \(\theta\) leads to a set of extended optimality conditions and an elegant duality theory. This is based on an extended form of the Lagrangian:
\[ \begin{aligned} \mathcal{L}(x,y) & = f (x) + \sum_{i=1}^m y_i g_i(x) - k(y) \\ & x \in X, y \in Y \end{aligned} \]
Note that the Lagrangian \(\mathcal{L}\) is smooth - all the nonsmoothness is captured in the \(\theta\) function. The theory is an elegant combination of calculus arguments related to \(g_i\) and its derivatives, and variational analysis for features related to \(\theta\).
It is shown in [152] that under a standard constraint qualification, the first-order conditions of (ENLP) are precisely in the form of the following variational inequality:
\begin{equation} \tag{ENLP VI} \text{VI} \left( \begin{bmatrix} \nabla_x \mathcal{L}(x,y) \\ -\nabla_y \mathcal{L} (x,y) \end{bmatrix}, X \times Y \right) . \end{equation}
When \(X\) and \(Y\) are simple bound sets, this is simply a complementarity problem.
Note that EMP exploits this result. In particular, if an extended nonlinear program of the form (ENLP) is given to EMP, then the optimality conditions (ENLP VI) are formed as a variational inequality problem and can be processed as outlined above. For a specific example, we cite the fact that if we use the (classical) choice \(k \equiv 0\) and \(Y = K^\circ\), then the optimality conditions of (ENLP) are precisely the standard complementarity problem given as (ENLP VI). While this is of interest, we believe that other choices of \(k\) and \(Y\) may be more useful and lead to models that have more practical significance.
Under appropriate convexity assumptions on this Lagrangian, it can be shown that a solution of the VI (ENLP VI) is a saddle point for the Lagrangian on \(X \times Y\). Furthermore, in this setting, the saddle point generates solutions to the primal problem (ENLP) and its dual problem:
\[ \max_{y \in Y} d(y),\; \text{ where } d(y) = \inf_{x \in X} \mathcal{L}(x,y), \]
with no duality gap.
Unfortunately, the perturbations \(y\) in Rockafellar's theory are precisely the negative of those used throughout the GAMS system. Thus, we need to replace \(y\) by \(-y\) throughout in the above to recover the same multipliers as those GAMS uses.
A simple example
As an example, consider the problem
\begin{align*} \min_{x_1,x_2,x_3}\; & \exp(x_1) + 5 {\left\Vert \log(x_1) - 1 \right\Vert}^2 + 2\max(x_2^2 - 2, 0) \\ \text{s.t. } & x_1/x_2 = \log(x_3), \\ & 3x_1 + x_2 \leq 5, x_1 \geq 0, x_2 \geq 0 . \end{align*}
In this problem, we would take
\begin{align*} X = \left\{ x \in \mathbf{R}^3 \;|\; 3x_1 + x_2 \leq 5, x_1 \geq 0, x_2 \geq 0 \right\}. \end{align*}
The function \(\theta\) essentially treats 3 separable pieces:
\begin{align*} g_1(x) & = \log(x_1) - 1, \\ g_2(x) & = x_2^2 - 2, \\ g_3(x) & = x_1/x_2 - \log(x_3) . \end{align*}
A classical problem would force \(g_1(x) = 0\), \(g_2(x) \leq 0\). and \(g_3(x) = 0\), while minimizing \(f(x) = \exp(x_1)\). In our problem, we still force \(g_3(x) = 0\), but apply a (soft) least squares penalty on \(g_1(x)\) and a smaller one-sided penalization on \(g_2(x)\). The above formulation is nonsmooth due to the "max" term in the objective function; in practice we could replace this by:
\begin{align*} \min_{x_1,x_2,x_3,w}\; & \exp(x_1) + 5 {\left\Vert \log(x_1) - 1 \right\Vert}^2 + 2w \\ \text{s.t. } & x_1/x_2 = \log(x_3), \\ & 3x_1 + x_2 \leq 5, x_1 \geq 0, x_2 \geq 0 \\ & w \geq x_2^2 - 2, w \geq 0 \end{align*}
and recover a standard form NLP. If the penalty on \(g_1(x)\) would be replaced by a one-norm penalization (instead of least squares), we would have to play a similar game, moving the function \(g_1(x)\) into the constraints and adding additional variable(s). To some extent, this seems unnatural - a modeler should be able to interchange the penalization without having to reformulate the problem from scratch. The proposed extended NLP would not be reformulated at all by the modeler, but allows all these "generalized constraints" to be treated in a similar manner within the modeling system. The actual formulation would take:
\[ \theta(u) = \theta_1(u_1) + \theta_2(u_2) + \theta_3(u_3) \]
where
\begin{align*} \theta_1(u_1) & = 5 u_1^2, \\ \theta_2(u_2) & = 2 \max(u_2, 0), \\ \theta_3(u_3) & = \psi_{\{0\}} (u_3) . \end{align*}
The discussion above allows us to see that
\begin{align*} Y & = -(\mathbf{R} \times [0,2] \times \mathbf{R}), \\ k(y) & = \frac{1}{20} y_1^2 + 0 + 0 . \end{align*}
The corresponding Lagrangian is the smooth function:
\[ \mathcal{L}(x,y) = f(x) - \sum_{i=1}^3 y_i g_i(x) - k(y) . \]
The corresponding VI (ENLP VI) can almost be formulated in GAMS (except that the linear constraint in \(X\) cannot be handled currently except by introducing a \(\theta_4(x)\)). Thus
\[ g_4(x) = 3x_1 + x_2 - 5,\; \theta_4(u) = \psi_{{\mathbf{R}}_-} \]
resulting in the following choices for \(Y\) and \(k\):
\begin{align*} Y & = -(\mathbf{R} \times [0,2] \times \mathbf{R} \times {\mathbf{R}}_+), \\ k(y) & = \frac{1}{20} y_1^2 + 0 + 0 + 0 . \end{align*}
Since \(X\) and \(Y\) are now simple bound sets, (ENLP VI) is now a complementarity problem and can be solved for example using PATH. A simple "empinfo" file details the choices of \(Y\) and \(k\) from the implemented library:
Adjustequ e1 sqr 5 e2 MaxZ 2
Reformulation as a classical NLP
Suppose
\[ \theta(u) = \sup_{y \in Y} \{ {\langle{u},{y}\rangle} - \frac{1}{2} {\langle{y},{Q y}\rangle} \} \]
for a polyhedral set \(Y \in {\mathbf{R}}^m\) and a symmetric positive semidefinite \(Q \in {\mathbf{R}}^{m \times m}\) (possibly \(Q = 0\)).
Suppose further that
\begin{align*} X & = \left\{ x \;|\; Rx \leq r \;\; \right\}, Y =\left\{ y \;|\; S^T y \leq s \right\},\\ Q & = DJ^{-1}D^T,\;\; F(x) = (g_1(x),\ldots,g_m(x)), \end{align*}
where \(J\) is symmetric and positive definite (for instance \(J = I\)). Then, as outlined by [153] , the optimal solutions \(\overline{x}\) of (ENLP) are the \(\bar{x}\) components of the optimal solutions \((\bar{x},\bar{z},\bar{w})\) to
\[ \begin{array}{ll} \min & f(x) + \langle{s},{z}\rangle + \frac{1}{2} \langle{w},{Jw}\rangle \\ \text{s.t.} & Rx \leq r, z \geq 0, F(x) - Sz - Dw = 0 . \end{array} \]
The multiplier on the equality constraint in the usual sense is the multiplier associated with \(\bar{x}\) in the extended Lagrangian for (ENLP). (Note that a Cholesky factorization may be needed to determine \(D\).)
It may be better to solve this reformulated NLP than to solve the VI (ENLP VI). However, it is important that we can convey all types of nonsmooth optimization problems to a solver as smooth optimization problems, and hence it is important to communicate the appropriate structure to the solver interface. We believe that specifying \(Y\) and \(k\) is a theoretically sound way to do this.
Disjunctive Programs (LogMIP)
There are many ways that the EMP model type can be used for further extensions to the modeling capabilities of a given system. In particular, the procedures outlined in [189, 72] for disjunctive programming extensions are also implemented within the EMP model type.
The disjunctive programming procedures are also known as the LogMIP 2.0 solver for solving linear and nonlinear disjunctive programming problems involving binary variables and disjunction definitions for modeling discrete choices. While the modeling and solution of these disjunctive optimization problems has not yet reached the stage of maturity and reliability as LP, MIP and NLP modeling, these problems have a rich area of applications.
LogMIP 2.0 has been developed by Dr. Aldo Vecchietti from INGAR (Santa Fe, Argentina) and Professor Ignacio E. Grossmann from Carnegie Mellon University (Pittsburgh, USA), and supersedes its previous version, LogMIP 1.0 (GAMS releases 22.6 (December 2007) to 23.6 (December 2010)). Changes in version 2.0 are at the level of language, where now the EMP syntax and modeltype is used, and at the level of solvers, where Big-M and convex-hull relaxations are combined. For more information see the website http://www.logmip.ceride.gov.ar/ and the documentation http://www.logmip.ceride.gov.ar/files/pdfs/newUserManual.pdf.
One simple example to highlight the disjunctive programming feature is the notion of an ordering of tasks, namely that either job \(i\) completes before job \(j\) starts or the converse, i.e., that the jobs cannot overlap. Such a disjunction can be specified using an empinfo file containing lines:
disjunction * seq(i,j) else seq(j,i)
In such an example, one can implement a Big-M method, employ indicator constraints, or utilize a convex hull reformulation. The convex hull reformulation is the default strategy; to utilize the Big-M formulation, the additional option
default bigm 1000
would add binary variables and constraints to impose the disjunction using a Big-M value of 1000. Alternatively, the option setting
default indic
writes out a model and an option file that implements a reformulation using indicator constraints, which can be handled by the solvers CPLEX, SCIP, and XPRESS. The EMPLIB model sequence is sequencing model that implements all of these options.
More complicated (nonlinear) examples make the utility of this approach clearer. The design of a multiproduct batch plan with intermediate storage described in [188] and a synthesis problem involving 8 processes from [185] are also included in the EMP model library. As a final example, the gasoline emission model outlined in [71] is precisely in the form that could exploit the features of EMP related to (nonlinear) disjunctive programming.
Empinfo file details
We have explained above many of the standard vectorized ways to write an "empinfo" file. The "empinfo" file has a vectorized format and a more powerful (but more complex) scalar version. We describe some of the features of the scalar version in this section.
An example of the use of the scalar syntax is given below
file info / '%emp.info%' /; put info / 'equilibrium'; loop(h, put / 'min' obj(h) /; loop(j, put x(h,j)); put /; loop(k, put z(h,k)); put / costdef(h) /; loop(i, put nodebal(h,i)); put /; ); loop(a, put 'vi',flowdef(a),f(a) /);
This is an example of a MOPEC in which the optimization agents are indexed by \(h\) and each of these agents control variables \(x_{hj}\) and \(z_{hk}\). The objective function for each \(h\) is defined in costdef
, the constraints of each minimization problem are defined in nodebal
, and the equilibrium constraints that link all the optimization problems are a VI defined by variables \(f_a\) and functions within flowdef
. Sometimes it is necessary to use syntax that generates the "tl" fields of sets, but this seems only necessary when abnormal side conditions are needed.
The format of the empinfo file is given below:
Disjunction [chull [big eps] | bigM [big eps threshold] | indic] [NOT] var|* [NOT] {equ} {ELSEIF [NOT] var|* [NOT] {equ}} [ELSE [NOT] {equ}] Default [chull [big eps] | bigM [big eps threshold] | indic] ParallelStep1 {equ|*} AdjustEqu equ abs|sqr|maxz|... {weight} ModelType MCP|NLP|MIP|... BiLevel {var} {MAX|MIN obj {var|*} {[-] equ}} {VI {var|*} {[-] equ var} {[-] equ}} {DualVar {var [-] equ}} Equilibrium {Implicit {var equ}} {VIsol {equ}} {MAX|MIN obj {var|*} {[-] equ}} {VI {var|*} {[-] equ var} {[-] equ}} {DualVar {var [-] equ}} VI {var|*} {[-] equ var} {[-] equ} DualEqu {[-] equ var} DualVar {var [-] equ} ------- [ ] optional | exclusive { } can be repeated