Table of Contents
Introduction
Stochastic programs with recourse employing a discrete distribution of the random parameters become a deterministic programming problem. They can be solved by an appropriate optimization algorithm, ignoring the stochastic nature of (some or all) parameters.
The libInclude file scenred.gms
, also known as ScenRed
can be used for the reduction of scenarios and scenario tree construction modeling the random data processes.
The scenario reduction algorithms included determine a scenario subset (of prescribed cardinality or accuracy) and assign optimal probabilities to the preserved scenarios. The reduced problem is then solved by a deterministic optimization algorithm provided by GAMS.
Scenario tree construction allows to construct scenario trees as accurate input for multistage stochastic programs. The input are individual scenarios in form of a fan.
- Note
- The libInclude file
scenred.gms
is based on the executable tool SCENRED2 and utilizes its functionality.
Background
Scenario Reduction
Many solution methods for stochastic programs employ discrete approximations of the uncertain data processes by a set of scenarios (i.e., possible outcomes of the uncertain parameters) with corresponding probabilities.
For most practical problems the optimization problem that contains all possible scenarios (the so-called deterministic equivalent program) is too large. Due to computational complexity and to time limitations this program is often approximated by a model involving a (much) smaller number of scenarios.
The reduction algorithms developed in [1,2] determine a subset of the initial scenario set and assign new probabilities to the preserved scenarios. All deleted scenarios have probability zero.
ScenRed uses two reduction algorithms: The Forward and the Backward method. In general, the computational performance (accuracy, running time) of the methods differ. For huge scenario trees, the Backward method has the best expected performance with respect to running time. The Forward method is the best algorithm when comparing accuracy, but it can only be recommended if the number of preserved scenarios is small (strong reduction). If no reduction method is selected, the method with the best expected performance with respect to running time is chosen.
The reduction algorithms exploit a certain probability distance of the original and the reduced probability measure. The probability distance trades off scenario probabilities and distances of scenario values. Therefore, deletion will occur if scenarios are close or have small probabilities.
The reduction concept is general and universal. No requirements on the stochastic data processes (e.g. the dependency or correlation structure of the scenarios, the scenario probabilities or the dimension of the process) or on the structure of the scenarios (e.g. tree-structured or not) are imposed. The reduction algorithms can be tailored to the stochastic model if the user provides additional information (How many decision stages are involved? Where do the random parameters enter the model – in objective and/or right hand sides and/or technology matrices?) The information is used to choose the probability distances.
References:
- J. Dupačová, N. Gröwe-Kuska, W. Römisch: Scenario Reduction in Stochastic Programming: An Approach Using Probability Metrics, Mathematical Programming, 95:493-511, 2003
- H. Heitsch, W. Römisch: Scenario Reduction Algorithms in Stochastic Programming, Computational Optimization and Applications, 24:187–206, 2003
Scenario Tree Construction
An important issue for solving multistage stochastic programs consists in the approximate representation of the (multivariate) stochastic input process in the form of a scenario tree.
A scenario tree represents how uncertainty unfolds over multiple stages, starting from a root node at the initial period and branching out to nodes in subsequent periods. Each node represents a state of the system at a specific time, and the branches represent possible transitions to future states. The final nodes, or leaves, correspond to the different scenarios considered in the model.
The construction of a scenario tree typically begins with a large set of individual scenarios, which are then systematically reduced to make the problem computationally manageable. This involves modifying the tree structure and grouping similar scenarios together. The goal is to retain a representative subset of scenarios that captures the essential features of the original set.
References:
- H. Heitsch and W. Römisch: Scenario tree modeling for multistage stochastic programs, Mathematical Programming, 118:371–406, 2009
Usage
Model and Data Preparation
The usage of ScenRed requires additional data preparation and reformulation of the GAMS program for the stochastic programming model:
- Analyze the GAMS program of the stochastic programming model.
Since the initial scenarios and a number of input parameters have to be specified, one must identify the corresponding components of the GAMS model and create or calculate them if they do not already exist. - Reformulate the GAMS program.
Check if the model can handle varying scenario or node probabilities, and whether the equations are defined in terms of a (possibly reduced) tree. If the model doesn't already contain a scenario tree, one should be added. If it does, it is a simple task to rewrite the equation definitions (and possibly other statements too) in terms of a subset of the original nodes or tree.
A reduction of the initial scenarios makes sense only if we are able to generate that part of the model that corresponds to the preserved scenarios (i.e. the reduced subtree). This is done by declaring a subset of the nodes in the original tree. The parameters and equations are declared over the original node set, but are defined over only the subtree. This will be illustrated by an example later in the section.
Further, one should verify that the model can handle changing probabilities. Many practical models involve scenarios with equal probabilities. This property will not be maintained by the probabilities in the reduced subtree.
As an example, consider the srkandw model in the GAMS model library, and the kand model upon which it is based. To produce srkandw
from kand
, we first reformulate the original to allow for solution over a reduced tree. To do this, we introduce a subset of the node set: set sn(n) 'nodes in reduced tree';
For convenience and clarity, we introduce a second subset at the same time, the set of leaf nodes: set leaf(n) 'leaf nodes in original tree';
as well as some code to compute this set based on the existing time-node mapping. We also declare a new parameter, the probabilities for the reduced tree: parameter sprob(n) 'node probability in reduced tree';
Once these are declared, we can quickly edit the equation definitions so that they run only over the reduced subtree: we simply substitute the reduced probabilities sprob
for the original prob
, and the reduced node set sn
for the original node set n
. Note that the declaration of the equations does not change.
This example illustrates one other change that may be required: the stochastic data must be in parameters having the node set as their last index. This is not the case in the kand
model, so we simply reversed the indices in the dem
parameter to meet the requirement in srkandw
. It is also possible to create a transposed copy of the original data and pass that to ScenRed if the original data cannot be changed conveniently.
Parameterization
Before using ScenRed, a couple of input and output parameters need to be defined. They are collected in the one-dimensional parameter ScenRedParms
. ScenRedParms can be included by the statement:
$libInclude scenred
- Note
- If the libInclude file
scenred.gms
is invoked without arguments,ScenRedParms
is loaded. Note that the libInclude must be stated before the actual scenario reduction/tree construction is done.
After this statement, all necessary settings can be made by assigning values to the corresponding parameters. A parameter value of zero (default) disables a setting. A list of the available parameters can be found below.
Table 1: Supported parameters in ScenRedParms
Symbol | Description |
---|---|
num_time_steps | path length from root to leaf |
num_leaves | leaves/scenarios in the initial tree |
num_nodes | nodes in the initial tree |
num_random | random variables assigned to a scenario or node |
red_num_leaves | desired number of preserved scenarios or leaves |
red_percentage | desired relative distance (accuracy), number from 0.0 to 1.0 |
reduction_method | desired reduction method, 1 - Forward, 2 - Backward, 0 - Default |
construction_method | desired tree construction method, 1 - Forward, 2 - Backward |
num_stages | number stages |
sroption | enable use of the option file |
run_time_limit | time limit in seconds |
report_level | report level: more messages by higher values |
visual_init | visualization initial tree |
visual_red | visualization reduced (constructed) tree |
plot_scen | visualization scenario processes |
out_scen | output of scenario raw data |
out_tree | output of scenario tree data |
The values for the parameters num_time_steps
, num_leaves
, num_nodes
, and num_random
are automatically derived from the input data. If the user sets these fields anyway, the program ensures that the internally calculated numbers and the numbers specified by the user match.
At least one of the parameters red_percentage
and red_num_leaves
must be set. The value of red_percentage
will be ignored if the parameter red_num_leaves
is non-zero. Otherwise, the tree will not be reduced if red_percentage=0
, while the reduction of the tree will be maximal (i.e. only one scenario will be kept) if red_percentage=1
. A numeric value of 0.4 means that the reduced tree maintains 60% of the information contained in the original tree. The reduction algorithms are skipped if red_num_leaves=num_leaves
or if red_num_leaves=0
and red_percentage=0
. These values can be assigned if the user wishes to run the scenario tree diagnostic.
Example:
The following statements describe a possible example setup for proceeding the scenario tree construction with visualization of the scenario tree and output of the scenarios to a raw data file afterwards. Note that the actual choice of the ScenRed action (scenario tree construction or scenario reduction) is determined when the libInclude is done, see below.
* general parameters
ScenRedParms('num_leaves') = 100;
ScenRedParms('num_nodes') = 200;
ScenRedParms('num_time_steps') = 5;
ScenRedParms('num_random') = 2;
ScenRedParms('report_level') = 2;
ScenRedParms('run_time_limit') = 30;
* execution commands
ScenRedParms('visual_red') = 1;
ScenRedParms('out_scen') = 1;
libInclude Statement
Once you have prepared all the inputs, you are ready to use ScenRed:
- Note
- Since the initial scenarios and a number of input parameters have to be specified, the corresponding components of the GAMS program have to be defined before the actual scenario reduction/tree construction is done. Reduced scenarios have to be defined before the equations of the (reduced) stochastic programming model are used in a solve statement. Therefore the
$libInclude scenred [args]
statement can be placed anywhere between the definitions of the GAMS parameters and the solve statement of the reduced stochastic programming model.
$libInclude scenred myprefix tree_con n tree p rtree rp rv1 rv2
Table 2: ScenRed arguments:
Argument | Description | |
---|---|---|
1 | myprefix | base name for files used with the libInclude |
2 | tree_con or scen_red | select the libInclude action: scenario tree construction or scenario reduction |
3 | n | the set of nodes in the tree |
4 | tree | the set of ancestor relations describing the tree |
5 | p | the parameter containing the node probabilities |
6 | rtree | the set of ancestor relations of the reduced tree (output) |
7 | rp | the parameter containing the node probabilities for the reduced tree (output) |
8 | rv1 , rv2 , ... | parameters containing random values of the nodes |
Argument 2 instructs ScenRed either to construct a tree (tree_con
) or to reduce a tree (scen_red
).
Argument 3 (here: n
) is the set of nodes making up the scenario tree. Note that the cardinality of this set is part of ScenRedParms
.
Argument 4 (here: tree
) is the ancestor mapping between the nodes. This mapping determines the scenario tree. Note that the mapping can be either an ancestor mapping (i.e. child-parent) or a successor mapping (parent-child). By default, an ancestor mapping is expected. If the check for this fails, a successor mapping is looked for.
Argument 5 (here: p
) is the parameter of probabilities for the nodes in the original tree. It is only required that probabilities for the scenarios (i.e. the leaf nodes) be provided, but the parameter can contain probabilities for the non-leaf nodes as well.
Argument 8 and following specify the parameter(s) that comprise the random values assigned to the initial scenarios, or to the nodes of the scenario tree. There can be more than one such parameter, included in any order. The only requirement is that the node set be the final index in each of these parameters.
The output arguments rtree
and rp
correspond to the symbols imported from the output file. The parameters ScenRedParms
and ScenRedReport
are invisibly communicated.
Instead of providing an explicit name for all the different files used, the first argument determines the name of all files using a simple naming scheme:
Filename | Command option | Description |
---|---|---|
sr2 myprefix.log | log_file | log file name |
sr2 myprefix_in.gdx | input_gdx | gdx input file name |
sr2 myprefix_out.gdx | output_gdx | gdx output file name |
sr2 myprefix.opt | sr_option | option file name |
sr2 myprefix_vi.plt | visual_init | file name for visualization of input tree |
sr2 myprefix_vr.plt | visual_red | file name for visualization of reduced/constructed tree |
sr2 myprefix_plot.plt | plot_scen | file name for visualization of scenario process |
sr2 myprefix_raw.dat | out_scen | file name for scenario data output in fan format |
sr2 myprefix_tree.dat | out_tree | file name for scenario data output in tree format |
The first three files (log_file
, input_gdx
and output_gdx
) are always used. The only optional input file sr_option is read by ScenRed if ScenRedParms('sroption')=1
. When you create this file, make sure to use the proper file name. The output files are created if the corresponding option is set to 1 in ScenRedParms
, e.g. ScenRedParms('out_tree')=1
.
Option File
The option file is optional and can be used to define more specific options depending on what kind of method (scenario reduction or scenario construction) is used. A list of options together with examples are given below for both the scenario reduction and the scenario construction devices (see Sections Scenario Reduction and Scenario Tree Construction, respectively). As mentioned above, the option file is read if ScenRedParms('sroption')=1
. The name of the file needs to follow the naming scheme.
Scenario Reduction
The application of the reduction algorithms contained in ScenRed can be individualized by configuring certain options. The most important one is given by the option metric_type
which allows to control the reduction process by different type of probability distances. Altogether three distances can be selected (see Table 3 below). All probability distances are associated with a special order specification which can be set by the option order
.
Table 3: Options for Scenario Reduction
Option | Description |
---|---|
metric_type | 1 - Transport (default), 2 - Fortet-Mourier, 3 - Wasserstein |
p_norm | choice of norm (example: 0 - max, 1 - sum, 2 - Euclidian) |
scaling | 0 - scaling off, 1 - scaling on (default) |
order | metric order (integer, default is 1) |
Example:
For example, a valid scenario reduction setup is the following:
ScenRedParms statements:
ScenRedParms('reduction_method') = 1;
ScenRedParms('red_percentage') = 0.3;
ScenRedParms('sroption') = 1;
in combination with an option file:
* ScenRed option file metric_type 2 order 2 p_norm 1 scaling 0
Lines starting with an asterisk are comments.
Scenario Tree Construction
A couple of options are offered to control the tree construction process. Note that in some cases due to sensibility of certain parameters some tuning is indispensable for producing good results.
Table 4 Options (basic) for Scenario Tree Construction
Option | Description |
---|---|
first_branch | time period of first branch (integer) |
eps_growth | 1 - linear, 2 - exponential |
eps_evolution | tree structure parameter (from 0.0 to 1.0) |
scaling | 0 - scaling off, 1 - scaling on (default) |
order | order of metric |
Table 4 above displays the main options to control the tree construction process. They are very similar to the reduction options.
The role of the parameter red_percentage
in ScenRedParms is to prescribe a total epsilon accuracy (level) for the approximation scheme. However, the approximation scheme is based on stagewise approximations which requires a splitting of the total level to the stages. Two strategies are offered by ScenRed, a linear and an exponential mapping of the total level to the intermediate levels. Use option eps_growth
to select one of them. Both strategies allow a second tuning parameter eps_evolution
which effects the slope of the epsilon splitting.
Even though this kind of control may generate good results for many applications, sometimes a more individual control can be needed. For example, some applications require a localization of branching stages. Moreover, to setup approximation bounds directly to stages can be very useful. To this end the standard options are extended by a new section environment.
Additional options - The section environment
An alternative control for the accurate constructions is provided by using the section environment. The section environment aims to establish a better monitoring of the construction process. There are overall three section types supported with the same syntax.
Branching control:
This section environment allows to specify branching points, i.e., an explicit selection of stages serving for branching. For example, use
section branching 2 4 6 end
to allow branching only at time period 2, 4, and 6. Note that each stage statement must be placed in one line. But stages can be merged. A shorter formulation of the same contents can be written in closed form
section branching 2*6 2 end
This statement reads branching within time periods from period 2 to period 6 with increment 2 steps. Both assignments can be combined and should be used together with the red_percentage
parameter in ScenRedParms.
Epsilon control:
In the epsilon section it is possible to assign epsilon tolerances for the stage approximations explicitly. This environment overcomes difficulties at times coming across with the automatic epsilon control. Note that this environment disables the parameter red_percentage
in ScenRedParms. For example, use
section epsilon 2 0.04 3*4 0.03 5 0.02 6 0.01 end
to control the approximation scheme by assigning different epsilon values per stage. Note that the value 0.03
is assigned to time period 3
and 4
in the example.
Node control:
The node control is the most specific control you have over the tree construction. With this environment the number of nodes of the tree which will be generated can be assigned for each time stage explicitly. For example, use
section num_nodes 1 1 2*3 5 4*5 10 6 15 end
The syntax is the same as before. Note that only one section environment can be use at once. In particular, only the first section environment detected in the option file is used. The section environment can be out commented like a standard option too.
Experimental option:
There is one other useful option to speed up computations when building different scenario trees from exactly the same input data. In this case the scenario distances needed to compute the trees could be saved to a external file at the first run and reloaded at later runs. Hence, the distances must be calculated only once. For example, use the option
write_distance dist.sr2
to save the computed distances to the file 'dist.sr2'. To reload them at next run use the option
read_distance dist.sr2
The option is classified as experimental since no validation of the input file takes place. Before using this option, please ensure that the distances loaded with the read_distance
option are the right ones.
Example:
Finally, look at the following example to see a valid combination of option file and ScenRedParms which can be passed to the scenario tree construction:
ScenRedParms('construction_method') = 2;
ScenRedParms('reduction_method') = 1;
ScenRedParms('sroption') = 1;
* tree construction option file order 1 scaling 0 section epsilon 2*4 0.1 5 0.2 6 0.1 end
Examples
Small example problems have been included to the GAMS Model Library. The implementation can be found in the GAMS programs srtree and srpchase. srtree converts a fan format scenario representation into a tree format representation and then reduces the tree size. srpchase is to solve a simple stochastic purchase problem involving three stages. Sample scenarios which are generated from a fixed distribution using a random value generator serve as input for the tree construction.
Output
The output of using ScenRed is a GDX (GAMS Data Exchange) file that contains the reduced scenario tree and the ScenRedReport
parameter.
The first data element in the output file is the ScenRedReport
parameter containing scalar outputs and statistics. The elements of this parameter are summarized in Table 5. The second data element is the parameter containing the probabilities of the nodes in the reduced scenario tree. These node probabilities are required to construct the reduced tree. The third and final data element is the ancestor map for the reduced scenario tree. This map can be read from the GDX file, or the reduced tree can be built from the original one by using the reduced probabilities. The content of the data output file is summarized in Table 6.
Table 5 ScenRedReport elements
Element | Description |
---|---|
ScenRedWarnings | number of warnings |
ScenRedErrors | number of errors |
run_time | running time in sec. |
orig_nodes | number of nodes in the initial scenario tree |
orig_leaves | number of leaves (scenarios) in the initial scenario tree |
red_nodes | number of nodes in the reduced scenario tree |
red_leaves | number of leaves(scenarios) in the reduced tree |
red_percentage | relative distance of initial and reduced scenario tree |
red_absolute | absolute distance between initial and reduced scenario tree |
reduction_method | reduction method used: 0: the program stopped before it could select a method 1: Forward method 2: Backward method |
construction_method | construction method used (tree construction) |
Table 6 Content of the Output File
No. | Symbol | Type | Dimension | Content |
---|---|---|---|---|
1 | ScenRedReport | Parameter | 1 | ScenRed report |
2 | red_prob | Parameter | 1 | node probabilities for the reduced scenarios |
3 | red_ancestor | Set | 2 | the ancestor map for the reduced scenarios |
To read the output file, the GAMS execute_load statement is used. This statement is used to transfer GDX data to GAMS at execution time. As an example, to read a GDX file named srrun_out.gdx
created, you might have the following statement:
execute_load 'srrun_out.gdx', ScenRedReport, sprob=red_prob, sanc=red_ancestor;
In the statement above, the equal sign =
is used to indicate that the data in the GDX parameter red_prob
should be read into the GAMS parameter sprob
, and the data in the GDX set red_ancestor
should be read into the GAMS set sanc
.
Assuming your model is formulated to use a subtree of the original, you can now continue with the solve and any subsequent reporting.
Visualization
In this section an easy way for making plots of scenario processes and tree structures is described. To this end you need the free Gnuplot software or any other plotting software which allows plotting directly from simple data files.
The concept of plotting tasks is the following. For each plot two files are generated, a Gnuplot access file (name.plt) and a raw data file (name.dat). The access file contains basic Gnuplot options and it can be adjusted for individual liking afterwards. The default output is the display. The supported plotting commands are
visual_init, visual_red, plot_scen
for plotting the initial tree structure, the reduced/constructed tree structure, and the scenario process(es), respectively.
Example:
For example, to visualize the constructed tree activate the ScenRedParms parameter
ScenRedParms('visual_red') = 1;
in the GAMS program. Depending on the name of the first argument in the ScenRed libInclude statement (myprefix in the example above), the results are the output files 'myprefix.plt' and 'myprefix.dat'. To compute the picture now you simply open the file 'myprefix.plt' with Gnuplot from the directory, where both output files are located (that should be the working directory). Alternatively, from the command line prompt call
>gnuplot myprefix.plt
Gnuplot will automatically generate the picture. Feel free to change any option in the Gnuplot access file for individual requirements. See also the Gnuplot manual for more details. In particular, to compute a well-scaled encapsulated postscript picture (eps), you simply have to uncomment a few lines in the Gnuplot option file above and to open it with Gnuplot once again.
With the parameter plot_scen
the scenario process(es) can be visualized. Note that Gnuplot access and data files according to the number of random values are generated.
Data Export
After computing the scenario reduction or scenario tree construction, the scenario data can be exported to external data files. For this, two output parameters out_tree
and out_scen
can be used. These options generate data files according to the tree and fan format, respectively. The name of the data files follow the naming scheme.
Diagnostic Check of Scenario Trees
When the input data is read, a number of checks are performed to verify that the data is correct. These include:
- Consistency of the desired input parameters with the content of the user input (number of nodes, number of leaves, number of time steps, number of random values assigned to a node).
- Range check of desired input parameters and options.
- Check of scenario and node probabilities.
- Check of the ancestor matrix (check the orientation of the graph, check if the graph contains a cycle, check if the graph contains incomplete forests or scenarios, check the consistency of the parameter
num_time_steps
with the ancestor matrix).
The following errors in the specification of the scenario tree cause the program to skip the reduction algorithms:
- The input files cannot be opened.
- Not all required input parameters are given.
- The required input parameters are not consistent with the user input.
- The required input parameters are out of range.
- Missing or negative scenario probabilities (probabilities of leaves).
- The ancestor set contains too many entries (more than
2*num_nodes
). - A cycle in the ancestor set.
- Incomplete scenarios in the ancestor set.
- Run time limit reached.
Errors and Error Numbers
If a serious error occurs, an error message is displayed in the log file. These messages always begin with
**** SCENRED run-time error ...
The number of errors are contained in the parameter ScenRedReport
of the output file (if it could be created). The occurrence of an error can also be detected from the last ScenRed-related line in the log:
**** SCENRED ErrCode=...
The numerical values of ErrCode
and their meaning are given below.
ErrCode | Meaning |
---|---|
1 | (for internal use) |
2 | fatal error while reading from input file |
3 | fatal error while writing to output file |
4 | fatal error while reading from option file |
5 | log file cannot be opened |
6 | a memory allocation error occurred |
7 | there are missing input parameters |
8 | could not access the GAMS names for the nodes |
9 | (for internal use) |
10 | ancestor set not given or contains too many entries |
11 | node probabilities cannot be not read or are wrong |
12 | random values for the nodes cannot be read |
13 | input parameters are out of range |
14 | ancestor set contains a cycle |
15 | incomplete scenarios or forests detected |
16 | fatal error in reduction algorithm (not enough memory) |
17 | running time limit reached |
Warnings
Warnings are caused by misspecification of the initial scenarios that can be possibly fixed. When such an error is encountered in the input files or in the scenario tree, a message is displayed in the log file. These messages always start with
**** SCENRED Warning ...
The following list gives an overview of the cases that produce warnings, and the action taken by ScenRed in these cases.
- The user assigned an option value that is out of range.
Action: Assign the default value. - Both parameters
red_num_leaves
andred_percentage
are assigned nontrivial values.
Action: The value ofred_percentage
will be ignored. - The scenario probabilities (probabilities of leaves) do not sum up to 1.
Action: The scenario probabilities are rescaled. Assign new probabilities to the remaining (inner) nodes that are consistent with the scenario probabilities. - Missing probabilities of inner nodes.
Action: Assign node probabilities that are consistent with the scenario probabilities. - The ancestor set contains more than one ancestor for a node.
Action: It is assumed that a successor set is given instead of an ancestor set (i.e., the transpose of an ancestor matrix. This means that the graph corresponding to the ancestor set has the wrong orientation). ScenRed starts the tree diagnostic for the successor set. The reduced tree will be defined in terms of a successor set as well (if the successor set passes the tree diagnostic and if ScenRed locates no fatal error during the run).