Transfer
Note
This feature is currently in beta status.

GAMS Transfer is a tool to maintain GAMS data outside a GAMS script in a programming language like Python or Matlab. It allows the user to add GAMS symbols (Sets, Aliases, Parameters, Variables and Equations), to manipulate GAMS symbols, and to read and write symbols from and to different data endpoints. Transfer's main focus is the highly efficient transfer of data between GAMS and the target programming language, while keeping those operations as simple as possible for the user. To achieve this, symbol records (the actual and potentially large-scale data sets) are stored in native data structures of the corresponding programming language. The benefits of this approach are threefold: (1) the user is usually very familiar with these data structures, (2) these data structures come with a large toolbox for various data operations, and (3) optimized methods for reading from and writing to GAMS can transfer the data in bulk, which results in the high performance of this package. This documentation describes, in detail, the use of Transfer within a Python environment.

Data within Transfer is stored as Pandas DataFrames. The flexible nature of Pandas DataFrames makes them ideal for storing and manipulating sparse data. Pandas includes advanced operations for indexing and slicing, reshaping, merging and even visualization.

Pandas also includes a number of advanced data I/O tools that allow users to generate DataFrames directly from CSV (.csv), JSON (.json), HTML (.html), Microsoft Excel (.xls, .xlsx), SQL, pickle (.pkl), SPSS (.sav, .zsav), SAS (.xpt, .sas7bdat), etc.
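As a small illustration of these I/O tools, the following sketch reads CSV text into a DataFrame. It uses only pandas (not Transfer), and the inline data is made up so that the example does not depend on a file on disk.

```python
import io
import pandas as pd

# Inline CSV text standing in for a file on disk (made-up data)
csv_text = "market,n_cases\nnew-york,325\nchicago,300\ntopeka,275\n"

# read_csv accepts a path or any file-like object
dem = pd.read_csv(io.StringIO(csv_text))
print(dem.shape, dem["n_cases"].sum())
```

A DataFrame obtained this way could then be passed to a symbol through the records argument or the setRecords method described later.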

Centering Transfer around the Pandas DataFrame gives GAMS users (on a variety of platforms – macOS, Windows, Linux) access to tools to move data back and forth between their favorite environments for use in their GAMS models.

The goal of this documentation is to introduce the user to Transfer and its functionality. This documentation is not designed to teach the user how to effectively manipulate Pandas DataFrames; users seeking a deeper understanding of Pandas are referred to the extensive documentation.

Getting Started

Recommended Import

Users can access the GAMS transfer sub-module with either of the following (equivalent) import statements once the GAMS API has been installed:

>>> import gams.transfer as gt

>>> from gams import transfer as gt

Design

Storing, manipulating, and transforming sparse data requires that it lives within an environment; this data can then be linked together to enable various operations. In Transfer we refer to this "environment" as the Container; it is the main repository for storing and linking our sparse data. Symbols can be added to the Container from a variety of GAMS starting points, but they can also be generated directly within the Python environment using convenient function calls that are part of the Transfer package. A symbol can only belong to one Container at a time.

The process of linking symbols together within a container was inspired by typical GAMS workflows but leverages aspects of object oriented programming to make linking data a natural process. Linking data enables data operations like implicit set growth, domain checking, data format transformations (to dense/sparse matrix formats), etc – all of these features are enabled by the use of ordered pandas.CategoricalDtype data types. All of these details will be discussed in the following sections.
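To give a rough idea of why ordered categoricals are useful here, the following pandas-only sketch (it does not use Transfer, and the labels and variable names are made up) shows that a categorical column carries the full, ordered list of allowed labels even when only some of them appear in the data, and that a label outside that list shows up as a missing value.

```python
import pandas as pd

# Hypothetical domain labels; the order given here defines the category order.
markets = pd.CategoricalDtype(["new-york", "chicago", "topeka"], ordered=True)

# Only two of the three labels appear in the data (sparse records).
col = pd.Series(["topeka", "new-york"], dtype=markets)

# The column still knows every label and the integer position of each value.
all_labels = list(col.cat.categories)
codes = col.cat.codes.tolist()

# A label outside the categories becomes a missing value, which is one way
# a mismatch between data and domain can be detected.
bad = pd.Series(["topeka", "boston"], dtype=markets)
n_missing = int(bad.isna().sum())

print(all_labels, codes, n_missing)
```

How Transfer uses these categoricals internally is not shown here; the sketch only demonstrates the pandas behavior that the design builds on.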

Naming Conventions

Methods (functions that operate on an object) are all verbs (e.g., getMaxAbsValue(), getUELs(), etc.) and use camel case for identification purposes. Methods are, by convention, tools that "do things"; that is, they involve some, potentially expensive, computations. Some Transfer methods accept arguments, while others are simply called using the () notation. Plural arguments (columns) hint that they can accept lists of inputs (e.g., a list of symbol names), while singular arguments (column) will only accept one input at a time.

Properties (inherent attributes of an object) are all nouns (e.g., name, number_records, etc.) and use snake case (lower case words separated by underscores) for identification purposes. Object properties (or "object attributes") are fundamental to the object and therefore they are not called like methods; object properties are simply accessed by other methods or user calls. By convention, properties only require trivial amounts of computation to access.

Classes – the basic structure of an object – are all singular nouns and use camel case (starting with a capital first letter) for identification purposes.
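The three conventions can be seen side by side in a toy class. ToySymbol below is hypothetical and not part of Transfer; it only mirrors the naming style (capitalized class name, snake case properties, camel case methods).

```python
class ToySymbol:
    """Hypothetical class, not part of Transfer; it only mirrors the naming style."""

    def __init__(self, name, values):
        self.name = name              # property: a noun in snake case
        self._values = list(values)

    @property
    def number_records(self):
        # property: cheap to compute, accessed without ()
        return len(self._values)

    def getMaxAbsValue(self):
        # method: a verb in camel case, does some computation, called with ()
        return max(abs(v) for v in self._values)


s = ToySymbol("d", [2.5, -1.7, 1.8])
print(s.name, s.number_records, s.getMaxAbsValue())
```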

Install

The user must download and install the latest version of GAMS in order to install Transfer. Transfer is installed when the GAMS Python API is built and installed. The user is referred to the GAMS Python API installation instructions for details on how to install the Python API files. Transfer and all GAMS Python API files are compatible with environment managers such as Anaconda.

Examples

GDX Read

Reading in all symbols can be accomplished with one line of code (we reference data from the `trnsport.gms` example).

import gams.transfer as gt
m = gt.Container("trnsport.gdx")

All symbol data is organized in the data attribute – m.data[<symbol_name>].records (the Container is also subscriptable, m[<symbol_name>].records is an equivalent statement) – records are stored as Pandas DataFrames.

Write Symbol to CSV

Writing symbol records to a CSV can also be accomplished with one line.

m["x"].records.to_csv("x.csv")
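Because records are plain DataFrames, the usual to_csv options apply. The sketch below uses a small stand-in DataFrame instead of m["x"].records (the column names are assumptions for this example) and shows that the row index is written by default and can be omitted with index=False.

```python
import io
import pandas as pd

# Stand-in for m["x"].records; the column names are assumptions for this sketch.
records = pd.DataFrame(
    {"i": ["seattle", "seattle"], "j": ["new-york", "chicago"], "level": [50.0, 300.0]}
)

with_index = io.StringIO()
records.to_csv(with_index)                   # default: row index written as first column

without_index = io.StringIO()
records.to_csv(without_index, index=False)   # omit the row index

header_with = with_index.getvalue().splitlines()[0]
header_without = without_index.getvalue().splitlines()[0]
print(header_with)
print(header_without)
```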

Write a New GDX

There are five symbol classes within Transfer: 1) Sets, 2) Parameters, 3) Variables, 4) Equations and 5) Aliases. For purposes of this quick start, we show how to recreate the distance data structure from the `trnsport.gms` model (the parameter d). This brief example shows how users can achieve "GAMS-like" functionality within a Python environment; Transfer leverages object-oriented programming to simplify syntax.

import gams.transfer as gt
import pandas as pd
m = gt.Container()
# create the sets i, j
i = gt.Set(m, "i", records=["seattle", "san-diego"], description="supply")
j = gt.Set(m, "j", records=["new-york", "chicago", "topeka"], description="markets")
# add "d" parameter -- domain linked to set objects i and j
d = gt.Parameter(m, "d", [i, j], description="distance in thousands of miles")
# create some data as a generic DataFrame
dist = pd.DataFrame(
    [
        ("seattle", "new-york", 2.5),
        ("seattle", "chicago", 1.7),
        ("seattle", "topeka", 1.8),
        ("san-diego", "new-york", 2.5),
        ("san-diego", "chicago", 1.8),
        ("san-diego", "topeka", 1.4),
    ],
    columns=["from", "to", "thousand_miles"],
)
# setRecords will automatically convert the dist DataFrame into a standard DataFrame format
d.setRecords(dist)
# write the GDX
m.write("out.gdx")

This example shows a few fundamental features of Transfer:

  1. An empty Container is analogous to an empty GDX file
  2. Symbols will always be linked to a Container (notice that we always pass the Container reference m to the symbol constructor)
  3. Records can be added to a symbol with the setRecords() method or through the records constructor argument (internally calls setRecords()). Transfer will convert many common Python data structures into a standard format.
  4. Domain linking is possible by passing domain set objects to other symbols
  5. Writing a GDX file can be accomplished in one line with the write() method.

Full Example

It is possible to use everything we now know about Transfer to recreate the `trnsport.gms` results in GDX form. As part of this example we also introduce the write method (and generate new.gdx). We will discuss it in more detail in the following section: GDX Read/Write.

import gams.transfer as gt
import pandas as pd
# create an empty Container object
m = gt.Container()
# add sets
i = gt.Set(m, "i", records=["seattle", "san-diego"], description="supply")
j = gt.Set(m, "j", records=["new-york", "chicago", "topeka"], description="markets")
# add parameters
a = gt.Parameter(m, "a", ["*"], description="capacity of plant i in cases")
b = gt.Parameter(m, "b", j, description="demand at market j in cases")
d = gt.Parameter(m, "d", [i, j], description="distance in thousands of miles")
f = gt.Parameter(
    m, "f", records=90, description="freight in dollars per case per thousand miles"
)
c = gt.Parameter(
    m, "c", [i, j], description="transport cost in thousands of dollars per case"
)
# set parameter records
cap = pd.DataFrame([("seattle", 350), ("san-diego", 600)], columns=["plant", "n_cases"])
a.setRecords(cap)
dem = pd.DataFrame(
    [("new-york", 325), ("chicago", 300), ("topeka", 275)],
    columns=["market", "n_cases"],
)
b.setRecords(dem)
dist = pd.DataFrame(
    [
        ("seattle", "new-york", 2.5),
        ("seattle", "chicago", 1.7),
        ("seattle", "topeka", 1.8),
        ("san-diego", "new-york", 2.5),
        ("san-diego", "chicago", 1.8),
        ("san-diego", "topeka", 1.4),
    ],
    columns=["from", "to", "thousand_miles"],
)
d.setRecords(dist)
# c(i,j) = f * d(i,j) / 1000;
cost = d.records.copy(deep=True)
cost["value"] = f.records.loc[0, "value"] * cost["value"] / 1000
c.setRecords(cost)
# add variables
q = pd.DataFrame(
    [
        ("seattle", "new-york", 50, 0),
        ("seattle", "chicago", 300, 0),
        ("seattle", "topeka", 0, 0.036),
        ("san-diego", "new-york", 275, 0),
        ("san-diego", "chicago", 0, 0.009),
        ("san-diego", "topeka", 275, 0),
    ],
    columns=["from", "to", "level", "marginal"],
)
x = gt.Variable(
    m, "x", "positive", [i, j], records=q, description="shipment quantities in cases",
)
z = gt.Variable(
    m,
    "z",
    records=pd.DataFrame(data=[153.675], columns=["level"]),
    description="total transportation costs in thousands of dollars",
)
# add equations (note: the name cost is reused here for the Equation object)
cost = gt.Equation(m, "cost", "eq", description="define objective function")
supply = gt.Equation(m, "supply", "leq", i, description="observe supply limit at plant i")
demand = gt.Equation(m, "demand", "geq", j, description="satisfy demand at market j")
# set equation records
cost.setRecords(
    pd.DataFrame(data=[[0, 1, 0, 0]], columns=["level", "marginal", "lower", "upper"])
)
supplies = pd.DataFrame(
    [
        ("seattle", 350, "eps", float("-inf"), 350),
        ("san-diego", 550, 0, float("-inf"), 600),
    ],
    columns=["from", "level", "marginal", "lower", "upper"],
)
supply.setRecords(supplies)
demands = pd.DataFrame(
    [
        ("new-york", 325, 0.225, 325),
        ("chicago", 300, 0.153, 300),
        ("topeka", 275, 0.126, 275),
    ],
    columns=["from", "level", "marginal", "lower"],
)
demand.setRecords(demands)
# write the GDX
m.write("new.gdx")

Extended Examples

Get HTML data

import gams.transfer as gt
import pandas as pd
url = "https://www.fdic.gov/resources/resolutions/bank-failures/failed-bank-list"
dfs = pd.read_html(url)
# pandas will create a list of dataframes depending on the target URL, we just need the first one
df = dfs[0]
m = gt.Container()
b = gt.Set(m, "b", ["*"], records=df["Bank NameBank"].unique(), description="Bank Name")
s = gt.Set(
    m,
    "s",
    ["*"],
    records=df["StateSt"].sort_values().unique(),
    description="States (alphabetical order)",
)
c = gt.Set(
    m,
    "c",
    ["*"],
    records=df["CityCity"].sort_values().unique(),
    description="Cities (alphabetical order)",
)
c_to_s = gt.Set(
    m,
    "c_to_s",
    [c, s],
    records=df[["CityCity", "StateSt"]]
    .drop_duplicates()
    .sort_values(by=["StateSt", "CityCity"]),
    description="City/State pair",
)
bf = gt.Parameter(
    m,
    "bf",
    b,
    records=df[["Bank NameBank", "FundFund"]]
    .drop_duplicates(subset="Bank NameBank")
    .sort_values(by=["Bank NameBank"]),
    description="Bank Name & Fund #",
)
In [1]: m.isValid()
Out[1]: True
Note
Users can chain Pandas operations together and pass those operations through to the records argument or the setRecords method.

Get PostgreSQL data (w/ sqlalchemy)

import gams.transfer as gt
from sqlalchemy import create_engine
import pandas as pd
# connect to postgres (assuming a localhost)
engine = create_engine("postgresql://localhost:5432/" + <database_name>)
df = pd.read_sql(<sql_table_name>, con=engine, index_col=0)
# create the Container and add symbol
m = gt.Container()
p = gt.Parameter(m, <sql_table_name>)
# we need to figure out the symbol dimensionality (potentially from the shape of the dataframe)
r, c = df.shape
p.dimension = c - 1
# set the records
p.setRecords(df)
# write out the GDX file
m.write("out.gdx")
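The step that derives the symbol dimension from the DataFrame shape can be tried without a database. The sketch below uses a small stand-in DataFrame (made-up data and column names) in place of the result of pd.read_sql, and assumes, as the example above does, that every column except the last one is a domain column.

```python
import pandas as pd

# Stand-in for the table returned by pd.read_sql(..., index_col=0)
df = pd.DataFrame(
    [("seattle", "new-york", 2.5), ("seattle", "chicago", 1.7)],
    columns=["from", "to", "value"],
)

n_rows, n_cols = df.shape
dimension = n_cols - 1   # all columns except the last are treated as domains
print(n_rows, dimension)
```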

Main Classes

Container

The main object class within Transfer is called Container. The Container is the vessel that allows symbols to be linked together (through their domain definitions), it enables implicit set definitions, it enables structural manipulations of the data (matrix generation), and it allows the user to perform different read/write operations.

Constructor

Constructor Arguments
| Argument | Type | Description | Required | Default |
|---|---|---|---|---|
| load_from | str, GMD Object Handle, GamsDatabase Object, ConstContainer | Points to the source of the data being read into the Container | No | None |
| system_directory | str | Absolute path to GAMS system directory | No | Attempts to find the GAMS installation by creating a GamsWorkspace object and loading the system_directory attribute. |

Creating a Container is a simple matter of initializing an object. For example:

import gams.transfer as gt
m = gt.Container()

This new Container object, here called m, contains a number of convenient properties and methods that allow the user to interact with the symbols that are in the Container. Some of these methods are used to filter out different types of symbols, other methods are used to numerically characterize the data within each symbol.

Properties

| Property | Description | Type | Special Setter Behavior |
|---|---|---|---|
| data | main dictionary that is used to store all symbol data (case preserving) | CasePreservingDict | - |
| modified | Flag that identifies if the Container has been modified in some way. Setting Container.modified=False will reset this flag for all symbols in the Container as well as the Container itself. | bool | - |

Symbols are organized in the Container under the data Container attribute. The dot notation (m.data) is used to access the underlying dictionary. Symbols in this dictionary can then be retrieved with the standard bracket notation (m.data[<symbol_name>]). The Container is also subscriptable (i.e., m["i"] will return the i Set object just as if the user called m.data["i"]). The behavior of the data dictionary has been customized to be case-insensitive (which mimics the behavior of GAMS), so m["i"] and m["I"] will return the same object.

In [1]: m.data
Out[1]:
{'i': <src.gamstransfer.Set at 0x7fa2387750a0>,
'j': <src.gamstransfer.Set at 0x7fa238e74fa0>,
'a': <src.gamstransfer.Parameter at 0x7fa238e74cd0>,
'b': <src.gamstransfer.Parameter at 0x7fa238e746a0>,
'd': <src.gamstransfer.Parameter at 0x7fa23876b370>,
'f': <src.gamstransfer.Parameter at 0x7fa23876b400>,
'c': <src.gamstransfer.Parameter at 0x7fa23876b5e0>,
'x': <src.gamstransfer.Variable at 0x7fa23876b340>,
'z': <src.gamstransfer.Variable at 0x7fa23876b640>,
'cost': <src.gamstransfer.Equation at 0x7fa23876b2b0>,
'supply': <src.gamstransfer.Equation at 0x7fa23876b310>,
'demand': <src.gamstransfer.Equation at 0x7fa23876b460>}
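The case-preserving but case-insensitive lookup can be pictured with a small dictionary subclass. CaseInsensitiveDict below is a hypothetical stand-in written for this illustration; it is not Transfer's CasePreservingDict and makes no claim about how that class is implemented.

```python
class CaseInsensitiveDict(dict):
    """Hypothetical stand-in, not Transfer's CasePreservingDict implementation."""

    def __init__(self):
        super().__init__()
        self._lookup = {}   # casefolded key -> key as first written

    def __setitem__(self, key, value):
        # keep the spelling used the first time the key was stored
        original = self._lookup.setdefault(key.casefold(), key)
        super().__setitem__(original, value)

    def __getitem__(self, key):
        return super().__getitem__(self._lookup[key.casefold()])

    def __contains__(self, key):
        return key.casefold() in self._lookup


data = CaseInsensitiveDict()
data["Demand"] = "equation object"
print(data["demand"], "DEMAND" in data, list(data))
```

The stored key keeps its original spelling ("Demand"), while lookups succeed for any casing.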

Symbol existence in the Container can be tested with an overloaded Python in operator. The following (case-insensitive) syntax is possible:

In [1]: 'i' in m
Out[1]: True
In [2]: 'I' in m
Out[2]: True
In [3]: i in m
Out[3]: True
Note
The final example assumes the existence of a separate symbol object called i.

Methods

| Method | Description | Arguments/Defaults | Returns |
|---|---|---|---|
| addAlias | Container method to add an Alias | name (str); alias_with (Set, Alias) | Alias object |
| addUniverseAlias | Container method to add a UniverseAlias | name (str) | UniverseAlias object |
| addEquation | Container method to add an Equation | name (str); type (str); domain=[] (str, list); records=None (pandas.DataFrame, numpy.ndarray, None); domain_forwarding=False (bool); description="" (str) | Equation object |
| addParameter | Container method to add a Parameter | name (str); domain=None (str, list, None); records=None (pandas.DataFrame, numpy.ndarray, None); domain_forwarding=False (bool); description="" (str) | Parameter object |
| addSet | Container method to add a Set | name (str); domain=None (str, list, None); is_singleton=False (bool); records=None (pandas.DataFrame, numpy.ndarray, None); domain_forwarding=False (bool); description="" (str) | Set object |
| addVariable | Container method to add a Variable | name (str); type="free" (str); domain=[] (str, list); records=None (pandas.DataFrame, numpy.ndarray, None); domain_forwarding=False (bool); description="" (str) | Variable object |
| describeAliases | create a summary table with descriptive statistics for Aliases | symbols=None (None, str, list) - if None, assumes all aliases | pandas.DataFrame |
| describeParameters | create a summary table with descriptive statistics for Parameters | symbols=None (None, str, list) - if None, assumes all parameters | pandas.DataFrame |
| describeEquations | create a summary table with descriptive statistics for Equations | symbols=None (None, str, list) - if None, assumes all equations | pandas.DataFrame |
| describeSets | create a summary table with descriptive statistics for Sets | symbols=None (None, str, list) - if None, assumes all sets | pandas.DataFrame |
| describeVariables | create a summary table with descriptive statistics for Variables | symbols=None (None, str, list) - if None, assumes all variables | pandas.DataFrame |
| getDomainViolations | gets domain violations that exist in the data; returns a list of DomainViolation objects (or None if no violations) | - | list or None |
| hasDomainViolations | returns True if there are domain violations in the records, returns False if not. | - | bool |
| countDomainViolations | get the count of how many records contain at least one domain violation for all symbols in the Container | - | dict |
| dropDomainViolations | drop records that have domain violations for all symbols in the Container | - | None |
| hasDuplicateRecords | returns True if there are any symbols with duplicate records, False if not. | - | bool |
| countDuplicateRecords | returns the count of how many duplicate records exist | - | dict |
| dropDuplicateRecords | drop records with duplicate domains from all symbols in the Container. The keep argument can take values of "first" (keeps the first instance of a duplicate record), "last" (keeps the last instance of a record), or False (drops all duplicates including the first and last) | keep="first" | None |
| renameUELs | renames UELs (case-sensitive) that appear in symbols (for all dimensions). If symbols=None, rename UELs in all symbols. If allow_merge=True, the categorical object will be re-created to offer additional data flexibility. All trailing whitespace is trimmed. | uels (dict); symbols=None (str, list, None); allow_merge=False (bool) | None |
| getUELs | gets UELs from all symbols. If symbols=None and ignore_unused=False, return the full universe set. If symbols=None and ignore_unused=True, return a universe set that contains UELs that only appear in data. | symbols=None (str, list, None); ignore_unused=False (bool) | list |
| removeUELs | removes UELs from all symbols in all dimensions. If uels is None only unused UELs will be removed. If symbols is None UELs will be removed from all symbols. | uels (str, list, None); symbols=None (str, list, None) | None |
| getSymbols | returns a list of object references for symbols | symbols (str, list) | list |
| getUniverseSet | Deprecated, use getUELs() instead. Provides a universe for all symbols; the symbols argument allows Transfer to create a partial universe if writing only a subset of symbols (currently only supported when writing to GamsDatabases or GMD Objects) | symbols=None (None, str, list) | list |
| isValid | True if all symbols in the Container are valid | - | bool |
| listAliases | list all aliases (is_valid=None), list all valid aliases (is_valid=True), list all invalid aliases (is_valid=False) in the container | is_valid=None (bool, None) | list |
| listEquations | list all equations (is_valid=None), list all valid equations (is_valid=True), list all invalid equations (is_valid=False) in the container | is_valid=None (bool, None); types=None (list of equation types) - if None, assumes all types | list |
| listParameters | list all parameters (is_valid=None), list all valid parameters (is_valid=True), list all invalid parameters (is_valid=False) in the container | is_valid=None (bool, None) | list |
| listSets | list all sets (is_valid=None), list all valid sets (is_valid=True), list all invalid sets (is_valid=False) in the container | is_valid=None (bool, None) | list |
| listSymbols | list all symbols (is_valid=None), list all valid symbols (is_valid=True), list all invalid symbols (is_valid=False) in the container | is_valid=None (bool, None) | list |
| listVariables | list all variables (is_valid=None), list all valid variables (is_valid=True), list all invalid variables (is_valid=False) in the container | is_valid=None (bool, None); types=None (list of variable types) - if None, assumes all types | list |
| read | main method to read load_from; can be provided with a list of symbols to read in subsets; records controls if symbol records are loaded or just metadata | load_from (str, GMD Object Handle, GamsDatabase Object, ConstContainer); symbols="all" (str, list); records=True (bool) | None |
| removeSymbols | symbols to remove from the Container, also sets the symbols ref_container to None | symbols (str, list) | None |
| renameSymbol | rename a symbol in the Container | old_name (str), new_name (str) | None |
| reorderSymbols | reorder symbols in order to avoid domain violations | - | None |
| write | main bulk write method to a write_to target | write_to (str, GamsDatabase, GMD Object); symbols=None (None, str, list) - if None, assumes all symbols; compress=False (bool); uel_priority=None (str, list); merge_symbols=None (None, str, list) | None |
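The three keep settings of dropDuplicateRecords have the same names and meanings as the keep argument of pandas' own duplicate handling. The sketch below demonstrates those semantics on a stand-in DataFrame using pandas only; whether Transfer delegates to these pandas functions internally is an assumption, and case-insensitive matching is not covered.

```python
import pandas as pd

# Stand-in records; "uni_0" plays the role of a domain column.
df = pd.DataFrame({"uni_0": ["a", "b", "a", "a"], "value": [1, 2, 3, 4]})

kept = {}
for keep in ("first", "last", False):
    # duplicated() marks the rows that would be dropped for a given keep setting
    mask = df.duplicated(subset=["uni_0"], keep=keep)
    kept[keep] = df.loc[~mask, "value"].tolist()

print(kept)
```

With keep="first" the first "a" record survives, with keep="last" the last one does, and with keep=False every record whose domain is duplicated is removed.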

Set

There are two different ways to create a GAMS set and add it to a Container.

  1. Use Set constructor
  2. Use the Container method addSet (which internally calls the Set constructor)

Constructor

| Argument | Type | Description | Required | Default |
|---|---|---|---|---|
| container | Container | A reference to the Container object that the symbol is being added to | Yes | - |
| name | str | Name of symbol | Yes | - |
| domain | list | List of domains given either as string ('*' for universe set) or as reference to a Set/Alias object | No | ["*"] |
| is_singleton | bool | Indicates if set is a singleton set (True) or not (False) | No | False |
| records | many | Symbol records | No | None |
| domain_forwarding | bool | Flag that forces set elements to be recursively included in all parent sets (i.e., implicit set growth) | No | False |
| description | str | Description of symbol | No | "" |

Properties

| Property | Description | Type | Special Setter Behavior |
|---|---|---|---|
| description | description of symbol | str | - |
| dimension | dimension of symbol | int | setting is a shorthand notation to create ["*"] * n domains in symbol |
| domain_forwarding | flag that forces set elements to be recursively included in all parent sets (i.e., implicit set growth) | bool | no effect after records have been set |
| domain | list of domains given either as string (* for universe set) or as reference to the Set/Alias object | list | - |
| domain_labels | column headings for the records DataFrame | list of str | - |
| domain_names | string version of domain names | list of str | - |
| domain_type | none, relaxed or regular depending on state of domain links | str | - |
| is_singleton | bool if symbol is a singleton set | bool | - |
| modified | Flag that identifies if the Set has been modified | bool | - |
| name | name of symbol | str | sets the GAMS name of the symbol |
| number_records | number of symbol records (i.e., returns len(self.records) if not None) | int | - |
| records | the main symbol records | pandas.DataFrame | responsive to domain_forwarding state |
| ref_container | reference to the Container that the symbol belongs to | Container | - |
| summary | output a dict of only the metadata | dict | - |

Methods

| Method | Description | Arguments/Defaults | Returns |
|---|---|---|---|
| equals | Used to compare the symbol to another symbol. If check_uels=True then check both used and unused UELs and confirm same order, otherwise only check used UELs in data and do not check UEL order. If check_element_text=True then check that all set elements have the same descriptive element text, otherwise skip. If check_meta_data=True then check that symbol name and description are the same, otherwise skip. rtol (relative tolerance) and atol (absolute tolerance) are ignored for set symbols. If verbose=True, will return an exception from the asserter describing the nature of the difference. | columns (ignored); check_uels=True (bool); check_element_text=True (bool); check_meta_data=True (bool); rtol=0.0 (ignored); atol=0.0 (ignored); verbose=False (bool) | bool |
| pivot | Convenience function to pivot records into a new shape (only symbols with >1D can be pivoted). If index is None then it is set to dimensions [0..dimension-1]. If columns is None then it is set to the last dimension. The argument value is ignored for sets. Missing values in the pivot will take the value provided by fill_value. | index=None (str, list, None); columns=None (str, list, None); fill_value=None (int, float, str) | pd.DataFrame |
| getCardinality | get the full Cartesian product of the domain | - | int or None |
| getSparsity | get the sparsity of the symbol w.r.t. the cardinality | - | float or None |
| addUELs | adds UELs to the symbol dimensions. If dimensions is None then add UELs to all dimensions. All trailing whitespace is trimmed. | uels (str, list); dimensions=None (int, list, None) | None |
| getUELs | gets UELs from symbol dimensions. If dimensions is None then get UELs from all dimensions (maintains order). The argument codes accepts a list of str UELs and will return the corresponding int; must specify a single dimension if passing codes. Returns only UELs in the data if ignore_unused=True, otherwise return all UELs. | dimensions=None (int, list, None); codes=None (int, list, None); ignore_unused=False (bool) | list |
| setUELs | set the UELs for symbol dimensions. If dimensions is None then set UELs for all dimensions. If rename=True, then the old UEL names will be renamed with the new UEL names. All trailing whitespace is trimmed. | uels (str, list); dimensions=None (int, list, None); rename=False (bool) | None |
| removeUELs | removes UELs that appear in the symbol dimensions. If uels is None then remove all unused UELs (categories). If dimensions is None then operate on all dimensions. | uels=None (str, list, None); dimensions=None (int, list, None) | bool |
| renameUELs | renames UELs (case-sensitive) that appear in the symbol dimensions. If dimensions is None then operate on all dimensions of the symbol. If allow_merge=True, the categorical object will be re-created to offer additional data flexibility. All trailing whitespace is trimmed. | uels (str, list, dict); dimensions (int, list, None); allow_merge=False (bool) | None |
| reorderUELs | reorders the UELs in the symbol dimensions. If dimensions is None then reorder UELs in all dimensions of the symbol. | uels (str, list, dict); dimensions (int, list, None) | None |
| hasDomainViolations | returns True if there are domain violations in the records, returns False if not. | - | bool |
| countDomainViolations | returns the count of how many records contain at least one domain violation | - | int |
| dropDomainViolations | drop records from the symbol that contain a domain violation | - | None |
| getDomainViolations | returns a list of DomainViolation objects if any (None otherwise) | - | list or None |
| findDomainViolations | get a view of the records DataFrame that contain any domain violations | - | pandas.DataFrame |
| hasDuplicateRecords | returns True if there are (case insensitive) duplicate records in the symbol, returns False if not. | - | bool |
| countDuplicateRecords | returns the count of how many (case insensitive) duplicate records exist | - | int |
| dropDuplicateRecords | drop records with (case insensitive) duplicate domains from the symbol. The keep argument can take values of "first" (keeps the first instance of a duplicate record), "last" (keeps the last instance of a record), or False (drops all duplicates including the first and last) | keep="first" | None |
| findDuplicateRecords | get a view of the records DataFrame that contain any (case insensitive) duplicate domains. The keep argument can take values of "first" (finds all duplicates while keeping the first instance as unique), "last" (finds all duplicates while keeping the last instance as unique), or False (finds all duplicates) | keep="first" | pandas.DataFrame |
| isValid | checks if the symbol is in a valid format, throw exceptions if verbose=True, recheck a symbol if force=True | verbose=False; force=True | bool |
| setRecords | main convenience method to set standard pandas.DataFrame formatted records | records (many types) | None |
| generateRecords | convenience method to set standard pandas.DataFrame formatted records given domain set information. Will generate records with the Cartesian product of all domain sets. The densities argument can take any value on the interval [0,1]. If densities is <1 then randomly selected records will be removed. `densities` will accept a `list` of length `dimension`, which allows users to specify a density per symbol dimension. Random number state can be set with the `seed` argument. | densities=1.0 (float, list); seed=None (int, None) | None |
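The relation between getCardinality and getSparsity can be worked through by hand. The sketch below uses made-up domain sizes and assumes that sparsity is defined as the share of the Cartesian product that has no record (one minus records over cardinality); this definition is an assumption for illustration and should be checked against the Transfer reference.

```python
import math

# Hypothetical domain sizes for a 2D symbol d(i, j)
domain_sizes = [2, 3]      # |i| = 2, |j| = 3
number_records = 4         # records actually stored

# Full Cartesian product of the domain
cardinality = math.prod(domain_sizes)

# Assumed definition: share of the Cartesian product that has no record
sparsity = 1 - number_records / cardinality

print(cardinality, round(sparsity, 4))
```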

Adding Set Records

Three possibilities exist to assign symbol records to a set (roughly ordered in complexity):

  1. Setting the argument records in the set constructor/container method (internally calls setRecords) - creates a data copy
  2. Using the symbol method setRecords - creates a data copy
  3. Setting the property records directly - does not create a data copy

If the data is in a convenient format, a user may want to pass the records directly within the set constructor. This is an optional keyword argument and internally the set constructor will simply call the setRecords method. The symbol method setRecords is a convenience method that transforms the given data into an approved Pandas DataFrame format (see Standard Data Formats). Many native Python data types can be easily transformed into DataFrames, so the setRecords method for Set objects will accept a number of different types for input. We show a few examples of ways to create differently structured sets:

Example #1 - Create a 1D set from a list
import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=["seattle", "san-diego"])
# NOTE: the above syntax is equivalent to -
# i = gt.Set(m, "i")
# i.setRecords(["seattle", "san-diego"])
# NOTE: the above syntax is also equivalent to -
# m.addSet("i", records=["seattle", "san-diego"])
# NOTE: the above syntax is also equivalent to -
# i = m.addSet("i")
# i.setRecords(["seattle", "san-diego"])
# NOTE: the above syntax is also equivalent to -
# m.addSet("i")
# m["i"].setRecords(["seattle", "san-diego"])
In [1]: i.records
Out[1]:
uni_0 element_text
0 seattle
1 san-diego
Example #2 - Create a 1D set from a tuple
import gams.transfer as gt
m = gt.Container()
j = gt.Set(m, "j", records=("seattle", "san-diego"))
# NOTE: the above syntax is equivalent to -
# j = gt.Set(m, "j")
# j.setRecords(("seattle", "san-diego"))
# NOTE: the above syntax is also equivalent to -
# m.addSet("j", records=("seattle", "san-diego"))
# NOTE: the above syntax is also equivalent to -
# j = m.addSet("j")
# j.setRecords(("seattle", "san-diego"))
# NOTE: the above syntax is also equivalent to -
# m.addSet("j")
# m["j"].setRecords(("seattle", "san-diego"))
In [1]: j.records
Out[1]:
uni_0 element_text
0 seattle
1 san-diego
Example #3 - Create a 2D set from a list of tuples
import gams.transfer as gt
m = gt.Container()
k = gt.Set(m, "k", ["*", "*"], records=[("seattle", "san-diego")])
# NOTE: the above syntax is equivalent to -
# k = gt.Set(m, "k", ["*", "*"])
# k.setRecords([("seattle", "san-diego")])
# NOTE: the above syntax is also equivalent to -
# m.addSet("k", ["*","*"], records=[("seattle", "san-diego")])
# NOTE: the above syntax is also equivalent to -
# k = m.addSet("k", ["*","*"])
# k.setRecords([("seattle", "san-diego")])
# NOTE: the above syntax is also equivalent to -
# m.addSet("k", ["*", "*"])
# m["k"].setRecords([("seattle", "san-diego")])
In [1]: k.records
Out[1]:
uni_0 uni_1 element_text
0 seattle san-diego
Example #4 - Create a 1D set from a DataFrame slice + .unique()
import gams.transfer as gt
import pandas as pd
m = gt.Container()
# note that the raw data is convenient to hold in a DataFrame
dist = pd.DataFrame(
    [
        ("seattle", "new-york", 2.5),
        ("seattle", "chicago", 1.7),
        ("seattle", "topeka", 1.8),
        ("san-diego", "new-york", 2.5),
        ("san-diego", "chicago", 1.8),
        ("san-diego", "topeka", 1.4),
    ],
    columns=["from", "to", "thousand_miles"],
)
l = gt.Set(m, "l", records=dist["from"].unique())
# NOTE: the above syntax is equivalent to -
# l = gt.Set(m, "l")
# l.setRecords(dist["from"].unique())
# NOTE: the above syntax is also equivalent to -
# m.addSet("l", records=dist["from"].unique())
# NOTE: the above syntax is also equivalent to -
# l = m.addSet("l")
# l.setRecords(dist["from"].unique())
# NOTE: the above syntax is also equivalent to -
# m.addSet("l")
# m["l"].setRecords(dist["from"].unique())
In [1]: l.records
Out[1]:
uni_0 element_text
0 seattle
1 san-diego
Note
The .unique() method preserves the order of appearance, unlike set().
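The difference can be checked with pandas alone. The snippet below is a small sketch that uses only the "from" column of the example data; Python makes no promise about the iteration order of a set, so only the result of unique() should be relied upon for ordering.

```python
import pandas as pd

# Only the "from" column of the example data is needed here.
origins = pd.Series(
    ["seattle", "seattle", "seattle", "san-diego", "san-diego", "san-diego"],
    name="from",
)

# Series.unique() returns the values in order of first appearance.
ordered_unique = list(origins.unique())
print(ordered_unique)

# set() also removes duplicates, but its iteration order is not guaranteed.
unordered_unique = set(origins)
print(unordered_unique)
```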

Set element text is useful for labeling individual set elements. A user can supply the element text together with the set element when creating the records. Labeling every set element is not required, as the following example shows.

Example #5 - Add set element text
import gams.transfer as gt
m = gt.Container()
i = gt.Set(
m,
"i",
records=[
("seattle", "home of sub pop records"),
("san-diego",),
("washington_dc", "former gams hq"),
],
)
# NOTE: the above syntax is equivalent to -
#
# i = gt.Set(m, "i")
# i_recs = [
# ("seattle", "home of sub pop records"),
# ("san-diego",),
# ("washington_dc", "former gams hq"),
# ]
#
# i.setRecords(i_recs)
# NOTE: the above syntax is also equivalent to -
# m.addSet("i", records=i_recs)
# NOTE: the above syntax is also equivalent to -
# i = m.addSet("i")
# i.setRecords(i_recs)
# NOTE: the above syntax is also equivalent to -
# m.addSet("i")
# m["i"].setRecords(i_recs)
In [1]: i.records
Out[1]:
uni_0 element_text
0 seattle home of sub pop records
1 san-diego
2 washington_dc former gams hq
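The mixed record shapes in Example #5 (one-element and two-element tuples) can be pictured as being padded to uniform (element, element_text) pairs. The sketch below is plain Python and is not Transfer's implementation; the helper name pad_element_text is hypothetical, and the empty-string default simply mirrors the blank element_text shown in the output above.

```python
def pad_element_text(records):
    """Return (element, element_text) pairs, using "" where no text was given."""
    padded = []
    for rec in records:
        element = rec[0]
        # A one-element tuple carries no element text.
        text = rec[1] if len(rec) > 1 else ""
        padded.append((element, text))
    return padded


i_recs = [
    ("seattle", "home of sub pop records"),
    ("san-diego",),
    ("washington_dc", "former gams hq"),
]

padded_recs = pad_element_text(i_recs)
for element, text in padded_recs:
    print(repr(element), repr(text))
```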

Directly Set Records

The primary advantage of the setRecords method is that Transfer converts many different (and convenient) data types into the standard data format (a Pandas DataFrame). Users who require higher performance will want to assign a reference to a valid Pandas DataFrame directly to the symbol, thereby skipping some of these computational steps. This places more burden on the user to supply the data in a valid standard form, but it speeds up setting the records and avoids making a copy of the data in memory. In this section we walk through an example of how to set records directly.

Example #1 - Directly set records (1D set)
import gams.transfer as gt
import pandas as pd
m = gt.Container()
i = gt.Set(m, "i", description="supply")
# create a standard format dataframe
df_i = pd.DataFrame(
data=[("seattle", ""), ("san-diego", "")], columns=["uni_0", "element_text"]
)
# need to create categorical column type, referencing elements already in df_i
df_i["uni_0"] = df_i["uni_0"].astype(
pd.CategoricalDtype(categories=df_i["uni_0"].unique(), ordered=True)
)
# set the records directly
i.records = df_i
In [1]: i.isValid()
Out[1]: True

This example proceeds as follows:

  1. Create an empty Container
  2. Create a GAMS set i in the Container, but do not set the records
  3. Create a Pandas DataFrame (manually, in this example) taking care to follow the standard format
  4. The DataFrame has the right shape and column labels so we can proceed to set the records.
  5. We need to cast the uni_0 column as a categorical data type, so we create a custom ordered category type using pandas.CategoricalDtype
  6. Finally, we set the records directly by passing a reference to df_i into the symbol records attribute. The setter function of .records checks that a DataFrame is being set, but does not check validity. Thus, as a final step we call the .isValid() method to verify that the symbol is valid.
Attention
Users can debug their DataFrames by running <symbol_name>.isValid(verbose=True) to get feedback about their data.
Example #2 - Directly set records (1D subset)
import gams.transfer as gt
import pandas as pd
m = gt.Container()
i = gt.Set(m, "i", records=["seattle", "san-diego"], description="supply")
j = gt.Set(m, "j", i, description="supply")
# create a standard format dataframe
df_j = pd.DataFrame(data=[("seattle", "")], columns=["i_0", "element_text"])
# create the categorical column type
df_j["i_0"] = df_j["i_0"].astype(i.records["uni_0"].dtype)
# set the records
j.records = df_j
In [1]: j.isValid()
Out[1]: True

This example is more subtle in that we want to create a set j that is a subset of i. We create the set i using the setRecords method but then set the records directly for j. There are two important details to note: 1) the column labels in df_j now reflect the standard format for a symbol with a domain set (as opposed to the universe) and 2) we create the categorical dtype by referencing the parent set (i) for the categories (instead of referencing itself).
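The second point can be illustrated with pandas alone. In this sketch the parent categorical type is built by hand (an assumption standing in for i.records["uni_0"].dtype), and the subset column is then cast to it. The subset column keeps the full category list of the parent even though only one element is present.

```python
import pandas as pd

# Hand-built stand-in for the dtype of the parent set's uni_0 column.
parent_dtype = pd.CategoricalDtype(categories=["seattle", "san-diego"], ordered=True)

# Records of the subset, labeled with the domain set name (i_0).
df_j = pd.DataFrame(data=[("seattle", "")], columns=["i_0", "element_text"])
df_j["i_0"] = df_j["i_0"].astype(parent_dtype)

# The categories come from the parent, not from the values in df_j.
print(list(df_j["i_0"].cat.categories))
print(df_j["i_0"].dtype.ordered)
```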

Generate Set Records

Generating the initial pandas.DataFrame object could be difficult for Set symbols that have a large number of records and a small number of UELs – these higher dimensional symbols will benefit from the generateRecords convenience function. Internally, generateRecords computes the dense Cartesian product of all the domain sets that define a symbol (generateRecords will only work on symbols where <symbol>.domain_type == "regular").

Example #1 - Create a large (dense) 4D set
import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=[f"i{i}" for i in range(50)])
j = gt.Set(m, "j", records=[f"j{i}" for i in range(50)])
k = gt.Set(m, "k", records=[f"k{i}" for i in range(50)])
l = gt.Set(m, "l", records=[f"l{i}" for i in range(50)])
# create and define the symbol `a` with `regular` domains
a = gt.Set(m, "a", [i, j, k, l])
# generate the records
a.generateRecords()
In [1]: a.isValid()
Out[1]: True
In [2]: a.records
Out[2]:
i_0 j_1 k_2 l_3 element_text
0 i0 j0 k0 l0
1 i0 j0 k0 l1
2 i0 j0 k0 l2
3 i0 j0 k0 l3
4 i0 j0 k0 l4
... ... ... ... ... ...
6249995 i49 j49 k49 l45
6249996 i49 j49 k49 l46
6249997 i49 j49 k49 l47
6249998 i49 j49 k49 l48
6249999 i49 j49 k49 l49
[6250000 rows x 5 columns]

It is also possible to generate a sparse set (randomly selected rows are removed from the dense dataframe) with the densities argument to generateRecords.

Example #2 - Create a large (sparse) 4D set
import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=[f"i{i}" for i in range(50)])
j = gt.Set(m, "j", records=[f"j{i}" for i in range(50)])
k = gt.Set(m, "k", records=[f"k{i}" for i in range(50)])
l = gt.Set(m, "l", records=[f"l{i}" for i in range(50)])
# create and define the symbol `a` with `regular` domains
a = gt.Set(m, "a", [i, j, k, l])
# generate the records
a.generateRecords(densities=0.05)
In [1]: a.isValid()
Out[1]: True
In [2]: a.records
Out[2]:
i_0 j_1 k_2 l_3 element_text
0 i0 j0 k1 l4
1 i0 j0 k1 l13
2 i0 j0 k1 l19
3 i0 j0 k1 l23
4 i0 j0 k2 l1
... ... ... ... ... ...
312495 i49 j49 k48 l27
312496 i49 j49 k48 l30
312497 i49 j49 k49 l7
312498 i49 j49 k49 l32
312499 i49 j49 k49 l42
[312500 rows x 5 columns]
Example #3 - Create a large 4D set w/ only 1 sparse dimension
import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=[f"i{i}" for i in range(50)])
j = gt.Set(m, "j", records=[f"j{i}" for i in range(50)])
k = gt.Set(m, "k", records=[f"k{i}" for i in range(50)])
l = gt.Set(m, "l", records=[f"l{i}" for i in range(50)])
# create and define the symbol `a` with `regular` domains
a = gt.Set(m, "a", [i, j, k, l])
# generate the records
a.generateRecords(densities=[1, 0.05, 1, 1])
In [1]: a.isValid()
Out[1]: True
In [2]: a.records
Out[2]:
i_0 j_1 k_2 l_3 element_text
0 i0 j22 k0 l0
1 i0 j22 k0 l1
2 i0 j22 k0 l2
3 i0 j22 k0 l3
4 i0 j22 k0 l4
... ... ... ... ... ...
249995 i49 j36 k49 l45
249996 i49 j36 k49 l46
249997 i49 j36 k49 l47
249998 i49 j36 k49 l48
249999 i49 j36 k49 l49
[250000 rows x 5 columns]
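The row counts reported in the three examples above follow from simple arithmetic on the domain sizes. The sketch below only reproduces those counts; it makes no claim about how generateRecords selects records internally.

```python
import math

domain_sizes = [50, 50, 50, 50]  # sizes of i, j, k and l

# Example #1: dense Cartesian product of all four domains.
dense_count = math.prod(domain_sizes)
print(dense_count)

# Example #2: an overall density of 0.05 applied to the dense product.
sparse_count = round(0.05 * dense_count)
print(sparse_count)

# Example #3: 250000 rows with i, k and l kept dense implies that
# two distinct j elements remain in the records.
rows_example_3 = 250_000
j_elements_kept = rows_example_3 // math.prod([50, 50, 50])
print(j_elements_kept)
```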

Parameter

There are two different ways to create a GAMS parameter and add it to a Container.

  1. Use Parameter constructor
  2. Use the Container method addParameter (which internally calls the Parameter constructor)

Constructor

Constructor Arguments
Argument Type Description Required Default
container Container A reference to the Container object that the symbol is being added to Yes -
name str Name of symbol Yes -
domain list List of domains given either as string ('*' for universe set) or as reference to a Set/Alias object, an empty domain list will create a scalar parameter No []
records many Symbol records No None
domain_forwarding bool Flag that forces set elements to be recursively included in all parent sets (i.e., implicit set growth) No False
description str Description of symbol No ""

Properties

Property Description Type Special Setter Behavior
description description of symbol str -
dimension dimension of symbol int setting is a shorthand notation to create ["*"] * n domains in symbol
domain_forwarding flag that forces set elements to be recursively included in all parent sets (i.e., implicit set growth) bool no effect after records have been set
domain list of domains given either as string (* for universe set) or as reference to the Set/Alias object list -
domain_labels column headings for the records DataFrame list of str -
domain_names string version of domain names list of str -
domain_type none, relaxed or regular depending on state of domain links str -
is_scalar True if len(self.domain) == 0 bool -
name name of symbol str sets the GAMS name of the symbol
number_records number of symbol records (i.e., returns len(self.records) if not None) int -
records the main symbol records pandas.DataFrame responsive to domain_forwarding state
ref_container reference to the Container that the symbol belongs to Container -
shape a tuple describing the array dimensions if records were converted with .toDense() tuple -
summary output a dict of only the metadata dict -

Methods

Method Description Arguments/Defaults Returns
countEps total number of SpecialValues.EPS in value column - int or None
countNA total number of SpecialValues.NA in value column - int or None
countNegInf total number of SpecialValues.NEGINF in value column - int or None
countPosInf total number of SpecialValues.POSINF in value column - int or None
countUndef total number of SpecialValues.UNDEF in value column - int or None
equals Used to compare the symbol to another symbol. If check_uels=True then check both used and unused UELs and confirm same order, otherwise only check used UELs in data and do not check UEL order. If check_meta_data=True then check that symbol name and description are the same, otherwise skip. rtol (relative tolerance) and atol (absolute tolerance) set equality tolerances. If verbose=True will return an exception from the asserter describing the nature of the difference. columns=["value"] (ignored)
check_uels=True (bool)
check_element_text=True (ignored)
check_meta_data=True (bool)
rtol=0.0 (float, None)
atol=0.0 (float, None)
verbose=False (bool)
bool
pivot Convenience function to pivot records into a new shape (only symbols with >1D can be pivoted). If index is None then it is set to dimensions [0..dimension-1]. If columns is None then it is set to the last dimension. The argument value is ignored for parameters. Missing values in the pivot will take the value provided by fill_value index=None (str, list, None)
columns=None (str, list, None)
fill_value=None (int, float, str)
pd.DataFrame
addUELs adds UELs to the symbol dimensions. If dimensions is None then add UELs to all dimensions. ** All trailing whitespace is trimmed ** uels (str, list)
dimensions=None (int, list, None)
None
getUELs gets UELs from symbol dimensions. If dimensions is None then get UELs from all dimensions (maintains order). The argument codes accepts a list of str UELs and will return the corresponding int; must specify a single dimension if passing codes. Returns only UELs in the data if ignore_unused=True, otherwise return all UELs. dimensions=None (int, list, None)
codes=None (int, list, None)
ignore_unused=False (bool)
list
setUELs set the UELs for symbol dimensions. If dimensions is None then set UELs for all dimensions. If rename=True, then the old UEL names will be renamed with the new UEL names. ** All trailing whitespace is trimmed ** uels (str, list)
dimensions=None (int, list, None)
rename=False (bool)
None
removeUELs removes UELs that appear in the symbol dimensions, If uels is None then remove all unused UELs (categories). If dimensions is None then operate on all dimensions. uels=None (str, list, None)
dimensions=None (int, list, None)
bool
renameUELs renames UELs (case-sensitive) that appear in the symbol dimensions. If dimensions is None then operate on all dimensions of the symbol. If allow_merge=True, the categorical object will be re-created to offer additional data flexibility. ** All trailing whitespace is trimmed ** uels (str, list, dict)
dimensions (int, list, None)
allow_merge=False (bool)
None
reorderUELs reorders the UELs in the symbol dimensions. If dimensions is None then reorder UELs in all dimensions of the symbol. uels (str, list, dict)
dimensions (int, list, None)
None
hasDomainViolations returns True if there are domain violations in the records, returns False if not. - bool
countDomainViolations returns the count of how many records contain at least one domain violation - int
dropDomainViolations drop records from the symbol that contain a domain violation - None
findDomainViolations get a view of the records DataFrame that contain any domain violations - pandas.DataFrame
hasDuplicateRecords returns True if there are (case insensitive) duplicate records in the symbol, returns False if not. - bool
countDuplicateRecords returns the count of how many (case insensitive) duplicate records exist - int
dropDuplicateRecords drop records with (case insensitive) duplicate domains from the symbol – keep argument can take values of "first" (keeps the first instance of a duplicate record), "last" (keeps the last instance of a record), or False (drops all duplicates including the first and last) keep="first" None
findDuplicateRecords get a view of the records DataFrame that contain any (case insensitive) duplicate domains – keep argument can take values of "first" (finds all duplicates while keeping the first instance as unique), "last" (finds all duplicates while keeping the last instance as unique), or False (finds all duplicates) keep="first" pandas.DataFrame
findEps find positions of SpecialValues.EPS in value column - pandas.DataFrame or None
findNA find positions of SpecialValues.NA in value column - pandas.DataFrame or None
findNegInf find positions of SpecialValues.NEGINF in value column - pandas.DataFrame or None
findPosInf find positions of SpecialValues.POSINF in value column - pandas.DataFrame or None
findUndef find positions of SpecialValues.UNDEF in value column - pandas.DataFrame or None
getCardinality get the full Cartesian product of the domain - int or None
getSparsity get the sparsity of the symbol w.r.t the cardinality - float or None
getMaxValue get the maximum value in value column - float or None
getMinValue get the minimum value in value column - float or None
getMeanValue get the mean value in value column - float or None
getMaxAbsValue get the maximum absolute value in value column - float or None
isValid checks if the symbol is in a valid format, throw exceptions if verbose=True, recheck a symbol if force=True verbose=False
force=True
bool
setRecords main convenience method to set standard pandas.DataFrame records records (many types) None
generateRecords convenience method to set standard pandas.DataFrame formatted records given domain set information. Will generate records with the Cartesian product of all domain sets. The densities argument can take any value on the interval [0,1]. If densities is <1 then randomly selected records will be removed. `densities` will accept a `list` of length `dimension` -- allows users to specify a density per symbol dimension. Random number state can be set with `seed` argument. densities=1.0 (float, list)
func=numpy.random.uniform(0,1) (callable)
seed=None (int, None)
None
toDense convert symbol to a dense numpy.array format - numpy.array or None
toSparseCoo convert symbol to a sparse COOrdinate numpy.array format - sparse matrix format or None
whereMax find the domain entry of records with a maximum value (return first instance only) - list of str or None
whereMaxAbs find the domain entry of records with a maximum absolute value (return first instance only) - list of str or None
whereMin find the domain entry of records with a minimum value (return first instance only) - list of str or None

Adding Parameter Records

Three possibilities exist to assign symbol records to a parameter (roughly ordered in complexity):

  1. Setting the argument records in the parameter constructor/container method (internally calls setRecords) - creates a data copy
  2. Using the symbol method setRecords - creates a data copy
  3. Setting the property records directly - does not create a data copy

If the data is in a convenient format, a user may want to pass the records directly within the parameter constructor. This is an optional keyword argument, and internally the parameter constructor will simply call the setRecords method. The symbol method setRecords is a convenience method that transforms the given data into an approved Pandas DataFrame format (see Standard Data Formats). Many native Python data types can be easily transformed into DataFrames, so the setRecords method for Parameter objects will accept a number of different input types. The setRecords method is called internally on any data structure that is passed through the records argument. We show a few examples of ways to create differently structured parameters:

Example #1 - Create a GAMS scalar
import gams.transfer as gt
m = gt.Container()
pi = gt.Parameter(m, "pi", records=3.14159)
# NOTE: the above syntax is equivalent to -
# pi = gt.Parameter(m, "pi")
# pi.setRecords(3.14159)
# NOTE: the above syntax is also equivalent to -
# m.addParameter("pi", records=3.14159)
# NOTE: the above syntax is also equivalent to -
# pi = m.addParameter("pi")
# pi.setRecords(3.14159)
# NOTE: the above syntax is also equivalent to -
# m.addParameter("pi")
# m["pi"].setRecords(3.14159)
In [1]: pi.records
Out[1]:
value
0 3.14159
Note
Transfer will still convert scalar values to a standard format (i.e., a Pandas DataFrame with a single row and column).
Example #2 - Create a 1D parameter (defined over *) from a list of tuples
import gams.transfer as gt
m = gt.Container()
i = gt.Parameter(m, "i", ["*"], records=[("i" + str(i), i) for i in range(5)])
# NOTE: the above syntax is equivalent to -
# i = gt.Parameter(m, "i", ["*"])
# i.setRecords([("i" + str(i), i) for i in range(5)])
# NOTE: the above syntax is also equivalent to -
# m.addParameter("i", ["*"], records=[("i" + str(i), i) for i in range(5)])
# NOTE: the above syntax is also equivalent to -
# i = m.addParameter("i", ["*"])
# i.setRecords([("i" + str(i), i) for i in range(5)])
# NOTE: the above syntax is also equivalent to -
# m.addParameter("i", ["*"])
# m["i"].setRecords([("i" + str(i), i) for i in range(5)])
In [1]: i.records
Out[1]:
uni_0 value
0 i0 0.0
1 i1 1.0
2 i2 2.0
3 i3 3.0
4 i4 4.0
Example #3 - Create a 1D parameter (defined over a set) from a list of tuples
import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", ["*"], records=["i" + str(i) for i in range(5)])
a = gt.Parameter(m, "a", i, records=[("i" + str(i), i) for i in range(5)])
# NOTE: the above syntax is equivalent to -
# i = gt.Set(m, "i")
# i.setRecords(["i" + str(i) for i in range(5)])
# a = gt.Parameter(m, "a", i)
# a.setRecords([("i" + str(i), i) for i in range(5)])
# NOTE: the above syntax is also equivalent to -
# i = m.addSet("i", records=["i" + str(i) for i in range(5)])
# m.addParameter("a", i, records=[("i" + str(i), i) for i in range(5)])
# NOTE: the above syntax is also equivalent to -
# i = m.addSet("i")
# i.setRecords(["i" + str(i) for i in range(5)])
# a = m.addParameter("a", i)
# a.setRecords([("i" + str(i), i) for i in range(5)])
# NOTE: the above syntax is also equivalent to -
# m.addSet("i")
# m["i"].setRecords(["i" + str(i) for i in range(5)])
# m.addParameter("a", m["i"])
# m["a"].setRecords([("i" + str(i), i) for i in range(5)])
In [1]: a.records
Out[1]:
i_0 value
0 i0 0.0
1 i1 1.0
2 i2 2.0
3 i3 3.0
4 i4 4.0
Example #4 - Create a 2D parameter (defined over a set) from a DataFrame slice
import gams.transfer as gt
import pandas as pd
dist = pd.DataFrame(
[
("seattle", "new-york", 2.5),
("seattle", "chicago", 1.7),
("seattle", "topeka", 1.8),
("san-diego", "new-york", 2.5),
("san-diego", "chicago", 1.8),
("san-diego", "topeka", 1.4),
],
columns=["from", "to", "thousand_miles"],
)
m = gt.Container()
i = gt.Set(m, "i", ["*"], records=dist["from"].unique())
j = gt.Set(m, "j", ["*"], records=dist["to"].unique())
a = gt.Parameter(m, "a", [i, j], records=dist.loc[[0, 3], :])
# NOTE: the above syntax is equivalent to -
# i = gt.Set(m, "i")
# i.setRecords(dist["from"].unique())
# j = gt.Set(m, "j")
# j.setRecords(dist["to"].unique())
# a = gt.Parameter(m, "a", [i, j])
# a.setRecords(dist.loc[[0, 3], :])
# NOTE: the above syntax is also equivalent to -
# i = m.addSet("i", records=dist["from"].unique())
# j = m.addSet("j", records=dist["to"].unique())
# m.addParameter("a", [i, j], records=dist.loc[[0, 3], :])
In [1]: a.records
Out[1]:
i_0 j_1 value
0 seattle new-york 2.5
3 san-diego new-york 2.5
Note
The original indexing is preserved when a user slices rows out of a reference dataframe.
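The index behavior comes from pandas itself and can be seen without Transfer. The sketch below slices the same two rows and then shows one way to renumber them with reset_index; whether to renumber is the user's choice, and nothing here implies that Transfer requires it.

```python
import pandas as pd

dist = pd.DataFrame(
    [
        ("seattle", "new-york", 2.5),
        ("seattle", "chicago", 1.7),
        ("seattle", "topeka", 1.8),
        ("san-diego", "new-york", 2.5),
        ("san-diego", "chicago", 1.8),
        ("san-diego", "topeka", 1.4),
    ],
    columns=["from", "to", "thousand_miles"],
)

# Label-based row selection keeps the original index labels 0 and 3.
sliced = dist.loc[[0, 3], :]
print(list(sliced.index))

# reset_index(drop=True) returns a copy numbered from zero again.
renumbered = sliced.reset_index(drop=True)
print(list(renumbered.index))
```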
Example #5 - Create a 2D parameter (defined over a set) from a matrix
import gams.transfer as gt
import pandas as pd
dist = pd.DataFrame(
[
("seattle", "new-york", 2.5),
("seattle", "chicago", 1.7),
("seattle", "topeka", 1.8),
("san-diego", "new-york", 2.5),
("san-diego", "chicago", 1.8),
("san-diego", "topeka", 1.4),
],
columns=["from", "to", "thousand_miles"],
)
m = gt.Container()
i = gt.Set(m, "i", ["*"], records=dist["from"].unique())
j = gt.Set(m, "j", ["*"], records=dist["to"].unique())
a = gt.Parameter(m, "a", [i, j], records=dist)
In [1]: a.toDense()
Out[1]:
array([[2.5, 1.7, 1.8],
[2.5, 1.8, 1.4]])
# use a.toDense() to create a new (and identical) parameter a2
a2 = gt.Parameter(m, "a2", [i, j], records=a.toDense())
# check that a is identical to a2
In [1]: a.equals(a2, check_meta_data=False)
Out[1]: True
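For intuition, the same 2 x 3 matrix can be rebuilt with pandas alone by pivoting the long-format table and ordering rows and columns by first appearance. This is only a sketch of the relationship between the long and dense layouts; it is not a description of how toDense() is implemented.

```python
import pandas as pd

dist = pd.DataFrame(
    [
        ("seattle", "new-york", 2.5),
        ("seattle", "chicago", 1.7),
        ("seattle", "topeka", 1.8),
        ("san-diego", "new-york", 2.5),
        ("san-diego", "chicago", 1.8),
        ("san-diego", "topeka", 1.4),
    ],
    columns=["from", "to", "thousand_miles"],
)

# pivot() sorts its labels, so reindex to restore the order of appearance.
matrix = dist.pivot(index="from", columns="to", values="thousand_miles").reindex(
    index=dist["from"].unique(), columns=dist["to"].unique()
)

dense = matrix.to_numpy()
print(dense)
```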
Example #6 - Create a 2D parameter from an array using setRecords
import gams.transfer as gt
import numpy as np
import pandas as pd
m = gt.Container()
i = gt.Set(m, "i", records=["i" + str(i) for i in range(5)])
j = gt.Set(m, "j", records=["j" + str(j) for j in range(5)])
# create the parameter with linked domains (these will control the .shape of the symbol)
a = gt.Parameter(m, "a", [i, j])
# here we use the .shape property to easily generate a dense random array in numpy
a.setRecords(np.random.uniform(low=1, high=10, size=a.shape))
In [1]: a.toDense()
Out[1]:
array([[3.6694495 , 5.17395381, 1.99129484, 3.28315433, 1.44793791],
[1.06953243, 6.56331121, 5.26162554, 5.98098795, 8.30006 ],
[3.77213221, 5.82144901, 9.30035479, 9.12534285, 8.51970747],
[8.47965504, 7.84426304, 5.2442471 , 6.96666622, 6.55194415],
[5.62682779, 4.92509183, 8.94579609, 2.7724934 , 9.99576081]])

Directly Set Records

As with sets, the primary advantage of the setRecords method is that Transfer converts many different (and convenient) data types into the standard data format (a Pandas DataFrame). Users who require higher performance will want to assign a reference to a valid Pandas DataFrame directly to the symbol, thereby skipping some of these computational steps. This places more burden on the user to supply the data in a valid standard form, but it speeds up setting the records and avoids making a copy of the data in memory. In this section we walk through an example of how to set records directly.

Example #1 - Correctly set records (directly)
import gams.transfer as gt
import pandas as pd
import numpy as np
df = pd.DataFrame(
data=[
("h" + str(h), "m" + str(m), "s" + str(s))
for h in range(8760)
for m in range(60)
for s in range(60)
],
columns=["h_0", "m_1", "s_2"],
)
df["value"] = np.random.uniform(0, 100, len(df))
m = gt.Container()
hrs = gt.Set(m, "h", records=df["h_0"].unique())
mins = gt.Set(m, "m", records=df["m_1"].unique())
secs = gt.Set(m, "s", records=df["s_2"].unique())
df["h_0"] = df["h_0"].astype(hrs.records["uni_0"].dtype)
df["m_1"] = df["m_1"].astype(mins.records["uni_0"].dtype)
df["s_2"] = df["s_2"].astype(secs.records["uni_0"].dtype)
a = gt.Parameter(m, "a", [hrs, mins, secs])
# set records
a.records = df
In [1]: a.isValid()
Out[1]: True

In this example we create a large parameter (31,536,000 records and 8880 unique domain elements, mimicking data that is labeled for every second in one year) and assign it to a parameter with a.records. Transfer requires that all domain columns be of a categorical data type and that each categorical be ordered. The records setter function does very little work other than checking that the object being set is a DataFrame. This places more responsibility on the user to create a DataFrame that complies with the standard format. In Example #1 we take care to reference the categorical data types from the domain sets, and in the end a.isValid() returns True.
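The two figures quoted for Example #1 can be verified with a few lines of arithmetic. The sketch assumes only what the example shows: 8760 hour labels, 60 minute labels and 60 second labels, each with its own prefix so that no label is shared between dimensions.

```python
hours, minutes, seconds = 8760, 60, 60

# One record per (hour, minute, second) combination.
n_records = hours * minutes * seconds
print(n_records)

# The labels use distinct prefixes (h, m, s), so the unique element counts add up.
n_uels = hours + minutes + seconds
print(n_uels)
```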

Users will need to use the .isValid(verbose=True) method to debug any structural issues. As an example, we incorrectly generate categorical data types by passing the generic dtype="category" argument to the DataFrame constructor. This creates categorical column types, but they are not ordered and they do not reference the underlying domain sets. As a result, the parameter a is invalid.

Example #2 - Incorrectly set records (directly)
import gams.transfer as gt
import pandas as pd
import numpy as np
df = pd.DataFrame(
data=[
("h" + str(h), "m" + str(m), "s" + str(s))
for h in range(8760)
for m in range(60)
for s in range(60)
],
columns=["h_0", "m_1", "s_2"],
dtype="category"
)
df["value"] = np.random.uniform(0, 100, len(df))
m = gt.Container()
hrs = gt.Set(m, "h", records=df["h_0"].unique())
mins = gt.Set(m, "m", records=df["m_1"].unique())
secs = gt.Set(m, "s", records=df["s_2"].unique())
a = gt.Parameter(m, "a", [hrs, mins, secs])
# set the records directly
a.records = df
In [1]: a.isValid()
Out[1]: False
In [2]: a.isValid(verbose=True)
Out[2]: Exception: Domain information in column 'h_0' for 'records' must be an ORDERED categorical type (i.e., <symbol_object>.records[h_0].dtype.ordered = True)
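The ordering problem reported above can be reproduced with pandas alone. The sketch below uses a four-label toy column (an assumption chosen to keep the output short) and contrasts the generic dtype="category" with an explicitly ordered pandas.CategoricalDtype. In actual use the categorical type would be taken from the domain set records, as in Example #1.

```python
import pandas as pd

labels = ["h0", "h1", "h2", "h10"]

# Generic categorical: unordered, with categories inferred in sorted order.
generic = pd.DataFrame({"h_0": labels}, dtype="category")
print(generic["h_0"].dtype.ordered)
print(list(generic["h_0"].cat.categories))

# Explicit ordered categorical that keeps the order of appearance.
explicit = pd.DataFrame({"h_0": labels})
explicit["h_0"] = explicit["h_0"].astype(
    pd.CategoricalDtype(categories=labels, ordered=True)
)
print(explicit["h_0"].dtype.ordered)
print(list(explicit["h_0"].cat.categories))
```

Besides being unordered, the generic categorical sorts "h10" before "h2", so the category order no longer matches the order in which the labels were created.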

Generate Parameter Records

Generating the initial pandas.DataFrame object could be difficult for Parameter symbols that have a large number of records and a small number of UELs – these higher dimensional symbols will benefit from the generateRecords convenience function. Internally, generateRecords computes the dense Cartesian product of all the domain sets that define a symbol (generateRecords will only work on symbols where <symbol>.domain_type == "regular").

Example #1 - Create a large (dense) 4D parameter
import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=[f"i{i}" for i in range(50)])
j = gt.Set(m, "j", records=[f"j{i}" for i in range(50)])
k = gt.Set(m, "k", records=[f"k{i}" for i in range(50)])
l = gt.Set(m, "l", records=[f"l{i}" for i in range(50)])
# create and define the symbol `a` with `regular` domains
a = gt.Parameter(m, "a", [i, j, k, l])
# generate the records
a.generateRecords()
In [1]: a.isValid()
Out[1]: True
In [2]: a.records
Out[2]:
i_0 j_1 k_2 l_3 value
0 i0 j0 k0 l0 0.386390
1 i0 j0 k0 l1 0.671253
2 i0 j0 k0 l2 0.522057
3 i0 j0 k0 l3 0.037694
4 i0 j0 k0 l4 0.564205
... ... ... ... ... ...
6249995 i49 j49 k49 l45 0.573354
6249996 i49 j49 k49 l46 0.033717
6249997 i49 j49 k49 l47 0.410322
6249998 i49 j49 k49 l48 0.758310
6249999 i49 j49 k49 l49 0.920708
[6250000 rows x 5 columns]
Note
In Example #1 a large 4D parameter was generated. By default, the values of these records are random numbers drawn from the interval [0,1] (uniform distribution).

As with Sets, it is possible to generate a sparse parameter with the densities argument to generateRecords. We extend this example by passing our own custom func argument that controls how the value column is filled. The func argument accepts a callable (i.e., a reference to a function).

Example #2 - Create a large (sparse) 4D parameter with normally distributed values
import gams.transfer as gt
import numpy as np
# create a custom function to pass to `generateRecords`
def value_dist(size):
    return np.random.normal(loc=10.0, scale=2.3, size=size)
m = gt.Container()
i = gt.Set(m, "i", records=[f"i{i}" for i in range(50)])
j = gt.Set(m, "j", records=[f"j{i}" for i in range(50)])
k = gt.Set(m, "k", records=[f"k{i}" for i in range(50)])
l = gt.Set(m, "l", records=[f"l{i}" for i in range(50)])
# create and define the symbol `a` with `regular` domains
a = gt.Parameter(m, "a", [i, j, k, l])
# generate the records
a.generateRecords(densities=0.05, func=value_dist)
In [1]: a.isValid()
Out[1]: True
In [2]: a.records
Out[2]:
i_0 j_1 k_2 l_3 value
0 i0 j0 k0 l33 12.490579
1 i0 j0 k0 l43 9.460560
2 i0 j0 k0 l44 7.660337
3 i0 j0 k0 l47 8.811967
4 i0 j0 k1 l5 11.103291
... ... ... ... ... ...
312495 i49 j49 k48 l38 10.619791
312496 i49 j49 k48 l41 14.208250
312497 i49 j49 k48 l47 6.104145
312498 i49 j49 k49 l0 10.216812
312499 i49 j49 k49 l39 9.739771
[312500 rows x 5 columns]
In [3]: a.records["value"].mean()
Out[3]: 10.004072307451391
In [4]: a.records["value"].std()
Out[4]: 2.292569938350144
Note
The custom callable function reference must expose a size argument. It might be tedious to know the exact number of the records that will be generated, especially if a fractional density is specified; therefore, the generateRecords method will pass in the correct size automatically. Users are encouraged to use the Numpy suite of random distributions when generating samples – custom functions have the potential to be computationally burdensome if a symbol has a large number of records.
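A callable of this shape can be exercised on its own before it is handed to generateRecords. In the sketch below the call value_dist(size=1000) is made by hand purely for illustration; as the note says, generateRecords supplies the size itself.

```python
import numpy as np


def value_dist(size):
    # Vectorized draw: one NumPy call produces all requested samples.
    return np.random.normal(loc=10.0, scale=2.3, size=size)


# Manual call with an arbitrary size, only to inspect the result.
draws = value_dist(size=1000)
print(draws.shape)
print(draws.mean(), draws.std())
```

Keeping the function vectorized, rather than looping in Python, is what keeps it cheap for symbols with many records.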
Example #3 - Create a large 4D parameter with 1 sparse dimension
import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=[f"i{i}" for i in range(50)])
j = gt.Set(m, "j", records=[f"j{i}" for i in range(50)])
k = gt.Set(m, "k", records=[f"k{i}" for i in range(50)])
l = gt.Set(m, "l", records=[f"l{i}" for i in range(50)])
# create and define the symbol `a` with `regular` domains
a = gt.Parameter(m, "a", [i, j, k, l])
# generate the records
a.generateRecords(densities=[1, 0.05, 1, 1])
In [1]: a.isValid()
Out[1]: True
In [2]: a.records
Out[2]:
i_0 j_1 k_2 l_3 value
0 i0 j30 k0 l0 0.473084
1 i0 j30 k0 l1 0.192571
2 i0 j30 k0 l2 0.060711
3 i0 j30 k0 l3 0.655477
4 i0 j30 k0 l4 0.629535
... ... ... ... ... ...
249995 i49 j32 k49 l45 0.442380
249996 i49 j32 k49 l46 0.002444
249997 i49 j32 k49 l47 0.332731
249998 i49 j32 k49 l48 0.983800
249999 i49 j32 k49 l49 0.984322
[250000 rows x 5 columns]
Example #4 - Create a large 4D parameter with a random number seed
import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=[f"i{i}" for i in range(50)])
j = gt.Set(m, "j", records=[f"j{i}" for i in range(50)])
k = gt.Set(m, "k", records=[f"k{i}" for i in range(50)])
l = gt.Set(m, "l", records=[f"l{i}" for i in range(50)])
# create and define the symbol `a` with `regular` domains
a = gt.Parameter(m, "a", [i, j, k, l])
a2 = gt.Parameter(m, "a2", [i, j, k, l])
# generate the records
a.generateRecords(densities=0.05, seed=123)
a2.generateRecords(densities=0.05)
In [1]: a.equals(a2, check_meta_data=False)
Out[1]: False
In [2]: a2.generateRecords(densities=0.05, seed=123)
In [3]: a.equals(a2, check_meta_data=False)
Out[3]: True
Note
The seed is an int that will set the random number generator state (enables reproducible sequences of random numbers).
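The effect of a fixed seed can be seen with NumPy alone. The sketch uses numpy.random.default_rng as a stand-in; which generator Transfer uses internally is not stated here, so this illustrates the general idea only.

```python
import numpy as np

# Two generators created from the same seed produce the same sequence.
seeded_a = np.random.default_rng(123).uniform(0, 1, size=5)
seeded_b = np.random.default_rng(123).uniform(0, 1, size=5)
print(np.array_equal(seeded_a, seeded_b))

# A generator created without a seed is initialized from fresh entropy.
unseeded = np.random.default_rng().uniform(0, 1, size=5)
print(np.array_equal(seeded_a, unseeded))
```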

Variable

There are two different ways to create a GAMS variable and add it to a Container.

  1. Use the Variable constructor
  2. Use the Container method addVariable (which internally calls the Variable constructor)

Constructor

Constructor Arguments
Argument Type Description Required Default
container Container A reference to the Container object that the symbol is being added to Yes -
name str Name of symbol Yes -
type str Type of variable being created [binary, integer, positive, negative, free, sos1, sos2, semicont, semiint] No free
domain list List of domains given either as string (* for universe set) or as reference to a Set/Alias object, an empty domain list will create a scalar variable No []
records many Symbol records No None
domain_forwarding bool Flag that forces set elements to be recursively included in all parent sets (i.e., implicit set growth) No False
description str Description of symbol No ""

Properties

Property Description Type Special Setter Behavior
description description of symbol str -
dimension dimension of symbol int setting is a shorthand notation to create ["*"] * n domains in symbol
domain_forwarding flag that forces set elements to be recursively included in all parent sets (i.e., implicit set growth) bool no effect after records have been set
domain list of domains given either as string (* for universe set) or as reference to the Set/Alias object list -
domain_labels column headings for the records DataFrame list of str -
domain_names string version of domain names list of str -
domain_type none, relaxed or regular depending on state of domain links str -
name name of symbol str sets the GAMS name of the symbol
number_records number of symbol records (i.e., returns len(self.records) if not None) int -
records the main symbol records pandas.DataFrame responsive to domain_forwarding state
ref_container reference to the Container that the symbol belongs to Container -
shape a tuple describing the array dimensions if records were converted with .toDense() tuple -
summary output a dict of only the metadata dict -
type str type of variable str -

Methods

Method Description Arguments/Defaults Returns
countEps total number of SpecialValues.EPS across all columns columns="level" (str, list) int or None
countNA total number of SpecialValues.NA across all columns columns="level" (str, list) int or None
countNegInf total number of SpecialValues.NEGINF across all columns columns="level" (str, list) int or None
countPosInf total number of SpecialValues.POSINF across all columns columns="level" (str, list) int or None
countUndef total number of SpecialValues.UNDEF across all columns columns="level" (str, list) int or None
equals Used to compare the symbol to another symbol. The columns argument allows the user to numerically compare only specified variable attributes (default is to compare all). If check_uels=True then check both used and unused UELs and confirm same order, otherwise only check used UELs in data and do not check UEL order. If check_meta_data=True then check that symbol name, description and variable type are the same, otherwise skip. rtol (relative tolerance) and atol (absolute tolerance) set equality tolerances; can be different tolerances for different variable attributes (if specified as a dict). If verbose=True will return an exception from the asserter describing the nature of the difference. columns=["level", "marginal", "lower", "upper", "scale"]
check_uels=True (bool)
check_element_text=True (ignored)
check_meta_data=True (bool)
rtol=0.0 (int, float, None)
atol=0.0 (int, float, None)
verbose=False (bool)
bool
pivot Convenience function to pivot records into a new shape (only symbols with >1D can be pivoted). If index is None then it is set to dimensions [0..dimension-1]. If columns is None then it is set to the last dimension. If value is None then the level values will be pivoted. Missing values in the pivot will take the value provided by fill_value index=None (str, list, None)
columns=None (str, list, None)
value (str)
fill_value=None (int, float, str)
pd.DataFrame
addUELs adds UELs to the symbol dimensions. If dimensions is None then add UELs to all dimensions. ** All trailing whitespace is trimmed ** uels (str, list)
dimensions=None (int, list, None)
None
getUELs gets UELs from symbol dimensions. If dimensions is None then get UELs from all dimensions (maintains order). The argument codes accepts a list of str UELs and will return the corresponding int; must specify a single dimension if passing codes. Returns only UELs in the data if ignore_unused=True, otherwise return all UELs. dimensions=None (int, list, None)
codes=None (int, list, None)
ignore_unused=False (bool)
list
setUELs set the UELs for symbol dimensions. If dimensions is None then set UELs for all dimensions. If rename=True, then the old UEL names will be renamed with the new UEL names. ** All trailing whitespace is trimmed ** uels (str, list)
dimensions=None (int, list, None)
rename=False (bool)
None
removeUELs removes UELs that appear in the symbol dimensions. If uels is None then remove all unused UELs (categories). If dimensions is None then operate on all dimensions. uels=None (str, list, None)
dimensions=None (int, list, None)
bool
renameUELs renames UELs (case-sensitive) that appear in the symbol dimensions. If dimensions is None then operate on all dimensions of the symbol. If allow_merge=True, the categorical object will be re-created to offer additional data flexibility. ** All trailing whitespace is trimmed ** uels (str, list, dict)
dimensions (int, list, None)
allow_merge=False (bool)
None
reorderUELs reorders the UELs in the symbol dimensions. If dimensions is None then reorder UELs in all dimensions of the symbol. uels (str, list, dict)
dimensions (int, list, None)
None
hasDomainViolations returns True if there are domain violations in the records, returns False if not. - bool
countDomainViolations returns the count of how many records contain at least one domain violation - int
dropDomainViolations drop records from the symbol that contain a domain violation - None
findDomainViolations get a view of the records DataFrame that contain any domain violations - pandas.DataFrame
hasDuplicateRecords returns True if there are (case insensitive) duplicate records in the symbol, returns False if not. - bool
countDuplicateRecords returns the count of how many (case insensitive) duplicate records exist - int
dropDuplicateRecords drop records with (case insensitive) duplicate domains from the symbol – keep argument can take values of "first" (keeps the first instance of a duplicate record), "last" (keeps the last instance of a record), or False (drops all duplicates including the first and last) keep="first" None
findDuplicateRecords get a view of the records DataFrame that contain any (case insensitive) duplicate domains – keep argument can take values of "first" (finds all duplicates while keeping the first instance as unique), "last" (finds all duplicates while keeping the last instance as unique), or False (finds all duplicates) keep="first" pandas.DataFrame
findEps find positions of SpecialValues.EPS in column column="level" (str) pandas.DataFrame or None
findNA find positions of SpecialValues.NA in column column="level" (str) pandas.DataFrame or None
findNegInf find positions of SpecialValues.NEGINF in column column="level" (str) pandas.DataFrame or None
findPosInf find positions of SpecialValues.POSINF in column column="level" (str) pandas.DataFrame or None
findUndef find positions of SpecialValues.UNDEF in column column="level" (str) pandas.DataFrame or None
getCardinality get the full Cartesian product of the domain - int or None
getSparsity get the sparsity of the symbol w.r.t the cardinality - float or None
getMaxValue get the maximum value across all columns columns="level" (str, list) float or None
getMinValue get the minimum value across all columns columns="level" (str, list) float or None
getMeanValue get the mean value across all columns columns="level" (str, list) float or None
getMaxAbsValue get the maximum absolute value across all columns columns="level" (str, list) float or None
isValid checks if the symbol is in a valid format, throw exceptions if verbose=True, recheck a symbol if force=True verbose=False
force=True
bool
setRecords main convenience method to set standard pandas.DataFrame records records (many types) None
generateRecords convenience method to set standard pandas.DataFrame formatted records given domain set information. Will generate records with the Cartesian product of all domain sets. The densities argument can take any value on the interval [0,1]. If densities is <1 then randomly selected records will be removed. `densities` will accept a `list` of length `dimension` -- allows users to specify a density per symbol dimension. Random number state can be set with `seed` argument. densities=1.0 (float, list)
func=numpy.random.uniform(0,1) (dict of callables)
seed=None (int, None)
None
toDense convert column to a dense numpy.array format column="level" (str) numpy.array or None
toSparseCoo convert column to a sparse COOrdinate numpy.array format column="level" (str) sparse matrix format or None
whereMax find the domain entry of records with a maximum value (return first instance only) column="level" (str) list of str or None
whereMaxAbs find the domain entry of records with a maximum absolute value (return first instance only) column="level" (str) list of str or None
whereMin find the domain entry of records with a minimum value (return first instance only) column="level" (str) list of str or None
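The table describes getCardinality and getSparsity only briefly. The sketch below shows the arithmetic on plain numbers. It assumes that sparsity is one minus the ratio of stored records to cardinality; that definition is an assumption here, and both function names are hypothetical stand-ins, not the Transfer methods.

```python
import math

def cardinality(domain_sizes):
    """Size of the full Cartesian product of the domain sets."""
    return math.prod(domain_sizes)

def sparsity(number_records, domain_sizes):
    # ASSUMPTION: sparsity = 1 - number_records / cardinality
    return 1.0 - number_records / cardinality(domain_sizes)

# four domain sets with 50 elements each, 312,500 stored records
sizes = [50, 50, 50, 50]
card = cardinality(sizes)
sp = sparsity(312_500, sizes)
print(card, round(sp, 4))
```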

Adding Variable Records

Three possibilities exist to assign symbol records to a variable (roughly ordered in complexity):

  1. Setting the argument records in the variable constructor/container method (internally calls setRecords) - creates a data copy
  2. Using the symbol method setRecords - creates a data copy
  3. Setting the property records directly - does not create a data copy

If the data is in a convenient format, a user may want to pass the records directly within the variable constructor. This is an optional keyword argument and internally the variable constructor will simply call the setRecords method. In contrast to the setRecords methods in either the Set or Parameter classes, the setRecords method for variables will only accept Pandas DataFrames and specially structured dicts for creating records from matrices. This restriction is out of necessity because, to properly set a record for a Variable, the user must pass data for the level, marginal, lower, upper and scale attributes. That said, any missing attributes will be filled in with the GAMS default record values (see: Variable Types); the default scale value is always 1, and the default level and marginal values are 0 for all variable types. We show a few examples of ways to create differently structured variables:

Example #1 - Create a GAMS scalar variable
import gams.transfer as gt
import pandas as pd
m = gt.Container()
pi = gt.Variable(m, "pi", records=pd.DataFrame(data=[3.14159], columns=["level"]))
# NOTE: the above syntax is equivalent to -
# pi = gt.Variable(m, "pi", "free")
# pi.setRecords(pd.DataFrame(data=[3.14159], columns=["level"]))
# NOTE: the above syntax is also equivalent to -
# m.addVariable("pi", "free", records=pd.DataFrame(data=[3.14159], columns=["level"]))
In [1]: pi.records
Out[1]:
level marginal lower upper scale
0 3.14159 0.0 -inf inf 1.0
Example #2 - Create a 1D variable (defined over *) from a list of tuples

In this example we only set the marginal values.

import gams.transfer as gt
import pandas as pd
m = gt.Container()
v = gt.Variable(
m,
"v",
"free",
domain=["*"],
records=pd.DataFrame(
data=[("i" + str(i), i) for i in range(5)], columns=["domain", "marginal"]
),
)
In [1]: v.records
Out[1]:
uni_0 level marginal lower upper scale
0 i0 0.0 0.0 -inf inf 1.0
1 i1 0.0 1.0 -inf inf 1.0
2 i2 0.0 2.0 -inf inf 1.0
3 i3 0.0 3.0 -inf inf 1.0
4 i4 0.0 4.0 -inf inf 1.0
Example #3 - Create a 1D variable (defined over a set) from a list of tuples
import gams.transfer as gt
import pandas as pd
m = gt.Container()
i = gt.Set(m, "i", ["*"], records=["i" + str(i) for i in range(5)])
v = gt.Variable(
m,
"v",
"free",
domain=i,
records=pd.DataFrame(
data=[("i" + str(i), i) for i in range(5)], columns=["domain", "marginal"]
),
)
In [1]: v.records
Out[1]:
i_0 level marginal lower upper scale
0 i0 0.0 0.0 -inf inf 1.0
1 i1 0.0 1.0 -inf inf 1.0
2 i2 0.0 2.0 -inf inf 1.0
3 i3 0.0 3.0 -inf inf 1.0
4 i4 0.0 4.0 -inf inf 1.0
Example #4 - Create a 2D positive variable, specifying no numerical data
import gams.transfer as gt
import pandas as pd
m = gt.Container()
v = gt.Variable(
m,
"v",
"positive",
["*", "*"],
records=pd.DataFrame([("seattle", "san-diego"), ("chicago", "madison")]),
)
In [1]: v.records
Out[1]:
uni_0 uni_1 level marginal lower upper scale
0 seattle san-diego 0.0 0.0 0.0 inf 1.0
1 chicago madison 0.0 0.0 0.0 inf 1.0
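The default filling visible in the example outputs above can be mimicked in plain Python. This is a minimal sketch, not Transfer's implementation: the helper fill_defaults is hypothetical, and only the free and positive types are covered because those are the only defaults shown in this section.

```python
import math

# Default bounds as they appear in the example outputs of this section.
# Other variable types have their own defaults (see Variable Types) and
# are deliberately left out of this sketch.
DEFAULT_BOUNDS = {
    "free": {"lower": -math.inf, "upper": math.inf},
    "positive": {"lower": 0.0, "upper": math.inf},
}

def fill_defaults(record, var_type="free"):
    """Hypothetical helper: complete a partial attribute record with defaults."""
    full = {
        "level": 0.0,
        "marginal": 0.0,
        **DEFAULT_BOUNDS[var_type],
        "scale": 1.0,
    }
    full.update(record)  # user-supplied attributes win over defaults
    return full

print(fill_defaults({"marginal": 3.0}))
print(fill_defaults({}, "positive"))
```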
Example #5 - Create a 2D variable (defined over a set) from a matrix
import gams.transfer as gt
import pandas as pd
import numpy as np
m = gt.Container()
i = gt.Set(m, "i", ["*"], records=["i" + str(i) for i in range(5)])
j = gt.Set(m, "j", ["*"], records=["j" + str(i) for i in range(5)])
a = gt.Parameter(
m,
"a",
[i, j],
records=[("i" + str(i), "j" + str(j), i + j) for i in range(5) for j in range(5)],
)
# create a free variable and set the level and marginal attributes from matrices
v = gt.Variable(
m, "v", domain=[i, j], records={"level": a.toDense(), "marginal": a.toDense()}
)
# if not specified, the toDense() method will convert the level values to a matrix
In [1]: v.toDense()
Out[1]:
array([[0., 1., 2., 3., 4.],
[1., 2., 3., 4., 5.],
[2., 3., 4., 5., 6.],
[3., 4., 5., 6., 7.],
[4., 5., 6., 7., 8.]])
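To see how a record list relates to the matrix above, the following sketch rebuilds the same 5 x 5 layout with plain Python lists. It is only an illustration of the idea: toDense itself returns a numpy array, and leaving missing records at 0.0 is an assumption of this sketch.

```python
# element order of the two domain sets, as in Example #5
i_labels = [f"i{n}" for n in range(5)]
j_labels = [f"j{n}" for n in range(5)]

# the same (i, j, value) records that were used for parameter `a`
records = [(f"i{a}", f"j{b}", float(a + b)) for a in range(5) for b in range(5)]

# map each label to its position in the corresponding set
row = {label: pos for pos, label in enumerate(i_labels)}
col = {label: pos for pos, label in enumerate(j_labels)}

# ASSUMPTION: cells without a record stay at 0.0
dense = [[0.0] * len(j_labels) for _ in i_labels]
for i_label, j_label, value in records:
    dense[row[i_label]][col[j_label]] = value

for line in dense:
    print(line)
```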

Directly Set Records

As with sets, the primary advantage of the setRecords method is that Transfer will convert many different (and convenient) data types into the standard data format (a Pandas DataFrame). Users that require higher performance will want to directly pass the Container a reference to a valid Pandas DataFrame, thereby skipping some of these computational steps. This places more burden on the user to pass the data in a valid standard form, but it speeds the records setting process and it avoids making a copy of the data in memory. In this section we walk the user through an example of how to set records directly.

Example #1 - Correctly set records (directly)
import gams.transfer as gt
import pandas as pd
import numpy as np
df = pd.DataFrame(
data=[
("h" + str(h), "m" + str(m), "s" + str(s))
for h in range(8760)
for m in range(60)
for s in range(60)
],
columns=["h_0", "m_1", "s_2"],
)
# it is necessary to specify all variable attributes if setting records directly
# NOTE: all numeric data must be type float
df["level"] = np.random.uniform(0, 100, len(df))
df["marginal"] = 0.0
df["lower"] = gt.SpecialValues.NEGINF
df["upper"] = gt.SpecialValues.POSINF
df["scale"] = 1.0
m = gt.Container()
hrs = gt.Set(m, "h", records=df["h_0"].unique())
mins = gt.Set(m, "m", records=df["m_1"].unique())
secs = gt.Set(m, "s", records=df["s_2"].unique())
df["h_0"] = df["h_0"].astype(hrs.records["uni_0"].dtype)
df["m_1"] = df["m_1"].astype(mins.records["uni_0"].dtype)
df["s_2"] = df["s_2"].astype(secs.records["uni_0"].dtype)
a = gt.Variable(m, "a", domain=[hrs, mins, secs])
# set records
a.records = df
In [1]: a.isValid()
Out[1]: True
Attention
All numeric data in the records will need to be type float in order to maintain a valid symbol.

In this example we create a large variable (31,536,000 records and 8880 unique domain elements, mimicking data that is labeled for every second in one year) and assign its records directly with a.records. Transfer requires that all domain columns be of a categorical data type, and each categorical must be ordered. The records setter does very little work other than checking that the object being set is a DataFrame. This places more responsibility on the user to create a DataFrame that complies with the standard format. In Example #1 we take care to properly reference the categorical data types from the domain sets, and in the end a.isValid() returns True. As with Sets and Parameters, users can use the .isValid(verbose=True) method to debug any structural issues.
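Before assigning a DataFrame to the records property, it can help to check the two requirements named above with pandas alone. The sketch below is a partial pre-check, not a replacement for isValid: the function check_variable_frame is hypothetical, and it assumes that the domain columns come first, as in the outputs shown in this documentation.

```python
import numpy as np
import pandas as pd

ATTRIBUTES = ["level", "marginal", "lower", "upper", "scale"]

def check_variable_frame(df, dimension):
    """Hypothetical pre-check: return a list of problems (empty if none found)."""
    problems = []
    # ASSUMPTION: the first `dimension` columns are the domain columns
    for col in df.columns[:dimension]:
        dtype = df[col].dtype
        if not isinstance(dtype, pd.CategoricalDtype):
            problems.append(f"{col}: domain column is not categorical")
        elif not dtype.ordered:
            problems.append(f"{col}: categorical is not ordered")
    for col in ATTRIBUTES:
        if col not in df.columns:
            problems.append(f"{col}: attribute column is missing")
        elif not pd.api.types.is_float_dtype(df[col]):
            problems.append(f"{col}: attribute column is not float")
    return problems

good = pd.DataFrame(
    {
        "i_0": pd.Categorical(["i0", "i1"], categories=["i0", "i1"], ordered=True),
        "level": [1.0, 2.0],
        "marginal": [0.0, 0.0],
        "lower": [-np.inf, -np.inf],
        "upper": [np.inf, np.inf],
        "scale": [1.0, 1.0],
    }
)

bad = good.copy()
bad["i_0"] = bad["i_0"].astype(str)  # loses the categorical dtype
bad["level"] = [1, 2]                # integer instead of float

print(check_variable_frame(good, dimension=1))
print(check_variable_frame(bad, dimension=1))
```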

Generate Variable Records

Generating the initial pandas.DataFrame object could be difficult for Variable symbols that have a large number of records and a small number of UELs – these higher dimensional symbols will benefit from the generateRecords convenience function. Internally, generateRecords computes the dense Cartesian product of all the domain sets that define a symbol (generateRecords will only work on symbols where <symbol>.domain_type == "regular").
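The dense Cartesian product mentioned above can be pictured on a small scale with the standard library. This sketch only illustrates the concept with three tiny, made-up sets; it does not reproduce how generateRecords builds its DataFrame.

```python
import itertools
import math

# three small, hypothetical domain sets
i = ["i0", "i1"]
j = ["j0", "j1", "j2"]
k = ["k0", "k1"]

# every combination of one element per set; the last set varies fastest
dense_domain = list(itertools.product(i, j, k))

print(len(dense_domain))   # 12
print(dense_domain[:3])

# the same arithmetic for four sets of 50 elements each
print(math.prod([50, 50, 50, 50]))  # 6250000
```

The number of generated records grows with the product of the set sizes, which is why a handful of modest sets already produces millions of rows.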

Example #1 - Create a large (dense) 4D variable
import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=[f"i{i}" for i in range(50)])
j = gt.Set(m, "j", records=[f"j{i}" for i in range(50)])
k = gt.Set(m, "k", records=[f"k{i}" for i in range(50)])
l = gt.Set(m, "l", records=[f"l{i}" for i in range(50)])
# create and define the symbol `a` with `regular` domains
a = gt.Variable(m, "a", "free", [i, j, k, l])
# generate the records
a.generateRecords()
In [1]: a.isValid()
Out[1]: True
In [2]: a.records
Out[2]:
i_0 j_1 k_2 l_3 level marginal lower upper scale
0 i0 j0 k0 l0 0.470248 0.0 -inf inf 1.0
1 i0 j0 k0 l1 0.924286 0.0 -inf inf 1.0
2 i0 j0 k0 l2 0.347550 0.0 -inf inf 1.0
3 i0 j0 k0 l3 0.937009 0.0 -inf inf 1.0
4 i0 j0 k0 l4 0.050716 0.0 -inf inf 1.0
... ... ... ... ... ... ... ... ... ...
6249995 i49 j49 k49 l45 0.385032 0.0 -inf inf 1.0
6249996 i49 j49 k49 l46 0.029305 0.0 -inf inf 1.0
6249997 i49 j49 k49 l47 0.440716 0.0 -inf inf 1.0
6249998 i49 j49 k49 l48 0.432931 0.0 -inf inf 1.0
6249999 i49 j49 k49 l49 0.157107 0.0 -inf inf 1.0
[6250000 rows x 9 columns]
Note
In Example #1 a large 4D variable was generated – by default, only the level value of these records are randomly drawn from the interval [0,1] (uniform distribution). Other variable attributes take the default record value.

As with Parameters, it is possible to generate a sparse variable with the densities argument to generateRecords. We extend this example by passing our own custom func argument that will control the behavior of the value columns. The func argument accepts a dict of callables (i.e., a reference to a function).

Example #2 - Create a large (sparse) 4D variable with normally distributed values
import gams.transfer as gt
import numpy as np
# create custom functions to pass to `generateRecords`
def level_dist(size):
    return np.random.normal(loc=10.0, scale=2.3, size=size)
def marginal_dist(size):
    return np.random.normal(loc=0.5, scale=0.1, size=size)
m = gt.Container()
i = gt.Set(m, "i", records=[f"i{i}" for i in range(50)])
j = gt.Set(m, "j", records=[f"j{i}" for i in range(50)])
k = gt.Set(m, "k", records=[f"k{i}" for i in range(50)])
l = gt.Set(m, "l", records=[f"l{i}" for i in range(50)])
# create and define the symbol `a` with `regular` domains
a = gt.Variable(m, "a", "free", [i, j, k, l])
# generate the records
a.generateRecords(densities=0.05, func={"level":level_dist, "marginal":marginal_dist})
In [1]: a.isValid()
Out[1]: True
In [2]: a.records
Out[2]:
i_0 j_1 k_2 l_3 level marginal lower upper scale
0 i0 j0 k0 l36 11.105235 0.468989 -inf inf 1.0
1 i0 j0 k0 l40 5.697361 0.478019 -inf inf 1.0
2 i0 j0 k1 l17 11.900784 0.473814 -inf inf 1.0
3 i0 j0 k1 l24 10.105931 0.456925 -inf inf 1.0
4 i0 j0 k1 l31 8.444142 0.490966 -inf inf 1.0
... ... ... ... ... ... ... ... ... ...
312495 i49 j49 k47 l17 11.523186 0.508001 -inf inf 1.0
312496 i49 j49 k47 l20 9.341183 0.739237 -inf inf 1.0
312497 i49 j49 k47 l26 10.705808 0.581103 -inf inf 1.0
312498 i49 j49 k47 l32 7.910963 0.479655 -inf inf 1.0
312499 i49 j49 k49 l8 11.800414 0.628040 -inf inf 1.0
[312500 rows x 9 columns]
In [3]: a.records["level"].mean()
Out[3]: 10.004072307451391
In [4]: a.records["level"].std()
Out[4]: 2.292569938350144
In [5]: a.records["marginal"].mean()
Out[5]: 0.49970172269778
In [6]: a.records["marginal"].std()
Out[6]: 0.09998772109802055
Note
The custom callable function reference must expose a size argument. It might be tedious to know the exact number of the records that will be generated, especially if a fractional density is specified; therefore, the generateRecords method will pass in the correct size automatically. Users are encouraged to use the Numpy suite of random distributions when generating samples – custom functions have the potential to be computationally burdensome if a symbol has a large number of records.
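The size contract described in the note can be tried out without Transfer. In the sketch below, draw_columns is a hypothetical stand-in for the part of generateRecords that calls the user functions. How Transfer actually invokes them is not shown in this documentation, so passing size as a keyword is an assumption. The record count is taken as the cardinality times the scalar density, which matches the 312,500 rows in Example #2.

```python
import numpy as np

def level_dist(size):
    return np.random.normal(loc=10.0, scale=2.3, size=size)

def marginal_dist(size):
    return np.random.normal(loc=0.5, scale=0.1, size=size)

def draw_columns(func, n_records):
    """Hypothetical stand-in: call each user function with the record count."""
    # ASSUMPTION: the callables are invoked with a `size` keyword argument
    return {name: f(size=n_records) for name, f in func.items()}

# four domain sets of 50 elements each, scalar density of 0.05
n_records = round(50**4 * 0.05)

columns = draw_columns({"level": level_dist, "marginal": marginal_dist}, n_records)
print(n_records, columns["level"].shape, columns["marginal"].shape)
```

Because numpy generates all samples in a single vectorized call, this pattern stays fast even for symbols with many records.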

Equation

There are two different ways to create a GAMS equation and add it to a Container.

  1. Use the Equation constructor
  2. Use the Container method addEquation (which internally calls the Equation constructor)

Constructor

Constructor Arguments
Argument Type Description Required Default
container Container A reference to the Container object that the symbol is being added to Yes -
name str Name of symbol Yes -
type str Type of equation being created [eq (or E/e), geq (or G/g), leq (or L/l), nonbinding (or N/n), external (or X/x)] Yes -
domain list List of domains given either as string (* for universe set) or as reference to a Set/Alias object, an empty domain list will create a scalar equation No []
records many Symbol records No None
domain_forwarding bool Flag that forces set elements to be recursively included in all parent sets (i.e., implicit set growth) No False
description str Description of symbol No ""
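The type argument accepts both long names and one-letter shorthands. The sketch below shows one way such labels could be mapped to a single canonical name. It is not Transfer's implementation: normalize_equation_type is a hypothetical helper, and it accepts only the spellings listed in the table above.

```python
# canonical name -> accepted one-letter shorthands (upper and lower case)
_SHORTHANDS = {
    "eq": "Ee",
    "geq": "Gg",
    "leq": "Ll",
    "nonbinding": "Nn",
    "external": "Xx",
}

# build a lookup that contains the long names and every shorthand
ALIASES = {name: name for name in _SHORTHANDS}
for name, letters in _SHORTHANDS.items():
    for letter in letters:
        ALIASES[letter] = name

def normalize_equation_type(label):
    """Hypothetical helper: map an accepted label to its long name."""
    if label not in ALIASES:
        raise ValueError(f"unknown equation type: {label!r}")
    return ALIASES[label]

print(normalize_equation_type("E"))
print(normalize_equation_type("geq"))
```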

Properties

Property Description Type Special Setter Behavior
description description of symbol str -
dimension dimension of symbol int setting is a shorthand notation to create ["*"] * n domains in symbol
domain_forwarding flag that forces set elements to be recursively included in all parent sets (i.e., implicit set growth) bool no effect after records have been set
domain list of domains given either as string (* for universe set) or as reference to the Set/Alias object list -
domain_labels column headings for the records DataFrame list of str -
domain_names string version of domain names list of str -
domain_type none, relaxed or regular depending on state of domain links str -
name name of symbol str sets the GAMS name of the symbol
number_records number of symbol records (i.e., returns len(self.records) if not None) int -
records the main symbol records pandas.DataFrame responsive to domain_forwarding state
ref_container reference to the Container that the symbol belongs to Container -
shape a tuple describing the array dimensions if records were converted with .toDense() tuple -
summary output a dict of only the metadata dict -
type str type of equation str -

Methods

Method Description Arguments/Defaults Returns
countEps total number of SpecialValues.EPS across all columns columns="level" (str, list) int or None
countNA total number of SpecialValues.NA across all columns columns="level" (str, list) int or None
countNegInf total number of SpecialValues.NEGINF across all columns columns="level" (str, list) int or None
countPosInf total number of SpecialValues.POSINF across all columns columns="level" (str, list) int or None
countUndef total number of SpecialValues.UNDEF across all columns columns="level" (str, list) int or None
equals Used to compare the symbol to another symbol. The columns argument allows the user to numerically compare only specified equation attributes (default is to compare all). If check_uels=True then check both used and unused UELs and confirm same order, otherwise only check used UELs in data and do not check UEL order. If check_meta_data=True then check that symbol name, description and equation type are the same, otherwise skip. rtol (relative tolerance) and atol (absolute tolerance) set equality tolerances; can be different tolerances for different equation attributes (if specified as a dict). If verbose=True will return an exception from the asserter describing the nature of the difference. columns=["level", "marginal", "lower", "upper", "scale"]
check_uels=True (bool)
check_element_text=True (ignored)
check_meta_data=True (bool)
rtol=0.0 (int, float, None)
atol=0.0 (int, float, None)
verbose=False (bool)
bool
pivot Convenience function to pivot records into a new shape (only symbols with >1D can be pivoted). If index is None then it is set to dimensions [0..dimension-1]. If columns is None then it is set to the last dimension. If value is None then the level values will be pivoted. Missing values in the pivot will take the value provided by fill_value index=None (str, list, None)
columns=None (str, list, None)
value (str)
fill_value=None (int, float, str)
pd.DataFrame
addUELs adds UELs to the symbol dimensions. If dimensions is None then add UELs to all dimensions. ** All trailing whitespace is trimmed ** uels (str, list)
dimensions=None (int, list, None)
None
getUELs gets UELs from symbol dimensions. If dimensions is None then get UELs from all dimensions (maintains order). The argument codes accepts a list of str UELs and will return the corresponding int; must specify a single dimension if passing codes. Returns only UELs in the data if ignore_unused=True, otherwise return all UELs. dimensions=None (int, list, None)
codes=None (int, list, None)
ignore_unused=False (bool)
list
setUELs set the UELs for symbol dimensions. If dimensions is None then set UELs for all dimensions. If rename=True, then the old UEL names will be renamed with the new UEL names. ** All trailing whitespace is trimmed ** uels (str, list)
dimensions=None (int, list, None)
rename=False (bool)
None
removeUELs removes UELs that appear in the symbol dimensions. If uels is None then remove all unused UELs (categories). If dimensions is None then operate on all dimensions. uels=None (str, list, None)
dimensions=None (int, list, None)
bool
renameUELs renames UELs (case-sensitive) that appear in the symbol dimensions. If dimensions is None then operate on all dimensions of the symbol. If allow_merge=True, the categorical object will be re-created to offer additional data flexibility. ** All trailing whitespace is trimmed ** uels (str, list, dict)
dimensions (int, list, None)
allow_merge=False (bool)
None
reorderUELs reorders the UELs in the symbol dimensions. If dimensions is None then reorder UELs in all dimensions of the symbol. uels (str, list, dict)
dimensions (int, list, None)
None
hasDomainViolations returns True if there are domain violations in the records, returns False if not. - bool
countDomainViolations returns the count of how many records contain at least one domain violation - int
dropDomainViolations drop records from the symbol that contain a domain violation - None
findDomainViolations get a view of the records DataFrame that contain any domain violations - pandas.DataFrame
hasDuplicateRecords returns True if there are (case insensitive) duplicate records in the symbol, returns False if not. - bool
countDuplicateRecords returns the count of how many (case insensitive) duplicate records exist - int
dropDuplicateRecords drop records with (case insensitive) duplicate domains from the symbol – keep argument can take values of "first" (keeps the first instance of a duplicate record), "last" (keeps the last instance of a record), or False (drops all duplicates including the first and last) keep="first" None
findDuplicateRecords get a view of the records DataFrame that contain any (case insensitive) duplicate domains – keep argument can take values of "first" (finds all duplicates while keeping the first instance as unique), "last" (finds all duplicates while keeping the last instance as unique), or False (finds all duplicates) keep="first" pandas.DataFrame
findEps find positions of SpecialValues.EPS in column column="level" (str) pandas.DataFrame or None
findNA find positions of SpecialValues.NA in column column="level" (str) pandas.DataFrame or None
findNegInf find positions of SpecialValues.NEGINF in column column="level" (str) pandas.DataFrame or None
findPosInf find positions of SpecialValues.POSINF in column column="level" (str) pandas.DataFrame or None
findUndef find positions of SpecialValues.UNDEF in column column="level" (str) pandas.DataFrame or None
getCardinality get the full Cartesian product of the domain - int or None
getSparsity get the sparsity of the symbol w.r.t the cardinality - float or None
getMaxValue get the maximum value across all columns columns="level" (str, list) float or None
getMinValue get the minimum value across all columns columns="level" (str, list) float or None
getMeanValue get the mean value across all columns columns="level" (str, list) float or None
getMaxAbsValue get the maximum absolute value across all columns columns="level" (str, list) float or None
isValid checks if the symbol is in a valid format, throw exceptions if verbose=True, recheck a symbol if force=True verbose=False
force=True
bool
setRecords main convenience method to set standard pandas.DataFrame records records (many types) None
generateRecords convenience method to set standard pandas.DataFrame formatted records given domain set information. Will generate records with the Cartesian product of all domain sets. The densities argument can take any value on the interval [0,1]. If densities is <1 then randomly selected records will be removed. `densities` will accept a `list` of length `dimension` -- allows users to specify a density per symbol dimension. Random number state can be set with `seed` argument. densities=1.0 (float, list)
func=numpy.random.uniform(0,1) (dict of callables)
seed=None (int, None)
None
toDense convert column to a dense numpy.array format column="level" (str) numpy.array or None
toSparseCoo convert column to a sparse COOrdinate numpy.array format column="level" (str) sparse matrix format or None
whereMax find the domain entry of records with a maximum value (return first instance only) column="level" (str) list of str or None
whereMaxAbs find the domain entry of records with a maximum absolute value (return first instance only) column="level" (str) list of str or None
whereMin find the domain entry of records with a minimum value (return first instance only) column="level" (str) list of str or None
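The keep argument of findDuplicateRecords and dropDuplicateRecords is easiest to understand on a tiny example. The sketch below reimplements the described semantics in plain Python for illustration only. The function duplicate_positions is hypothetical, and using casefold for the case-insensitive comparison is an assumption.

```python
from collections import Counter

def duplicate_positions(domain_keys, keep="first"):
    """Hypothetical helper: positions of records flagged as duplicates."""
    if keep not in ("first", "last", False):
        raise ValueError("keep must be 'first', 'last' or False")
    # ASSUMPTION: case-insensitive comparison via str.casefold
    folded = [tuple(label.casefold() for label in key) for key in domain_keys]
    if keep is False:
        counts = Counter(folded)
        return [pos for pos, key in enumerate(folded) if counts[key] > 1]
    # walk forward to keep the first instance, backward to keep the last
    order = range(len(folded)) if keep == "first" else range(len(folded) - 1, -1, -1)
    seen, flagged = set(), []
    for pos in order:
        if folded[pos] in seen:
            flagged.append(pos)
        else:
            seen.add(folded[pos])
    return sorted(flagged)

keys = [("i0",), ("I0",), ("i1",), ("i0",)]
print(duplicate_positions(keys, keep="first"))  # [1, 3]
print(duplicate_positions(keys, keep="last"))   # [0, 1]
print(duplicate_positions(keys, keep=False))    # [0, 1, 3]
```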

Adding Equation Records

Adding equation records mimics that of variables – three possibilities exist to assign symbol records to an equation (roughly ordered in complexity):

  1. Setting the argument records in the equation constructor/container method (internally calls setRecords) - creates a data copy
  2. Using the symbol method setRecords - creates a data copy
  3. Setting the property records directly - does not create a data copy

Setting equation records requires the user to be explicit about the type of equation being created, in contrast to setting variable records (where the default variable type is free).

If the data is in a convenient format, a user may want to pass the records directly within the equation constructor. This is an optional keyword argument and internally the equation constructor will simply call the setRecords method. In contrast to the setRecords methods in either the Set or Parameter classes, the setRecords method for equations will only accept Pandas DataFrames and specially structured dicts for creating records from matrices. This restriction is out of necessity because, to properly set a record for an Equation, the user must pass data for the level, marginal, lower, upper and scale attributes. That said, any missing attributes will be filled in with the GAMS default record values (level = 0.0, marginal = 0.0, lower = -inf, upper = inf, scale = 1.0). We show a few examples of ways to create differently structured equations:

Example #1 - Create a GAMS scalar equation
import gams.transfer as gt
import pandas as pd
m = gt.Container()
# here we create an equality (=E=) equation
pi = gt.Equation(m, "pi", "eq", records=pd.DataFrame(data=[3.14159], columns=["level"]))
# NOTE: the above syntax is equivalent to -
# pi = gt.Equation(m, "pi", "eq")
# pi.setRecords(pd.DataFrame(data=[3.14159], columns=["level"]))
# NOTE: the above syntax is also equivalent to -
# m.addEquation("pi", "eq", records=pd.DataFrame(data=[3.14159], columns=["level"]))
In [1]: pi.records
Out[1]:
level marginal lower upper scale
0 3.14159 0.0 -inf inf 1.0
Example #2 - Create a 1D Equation (defined over *) from a list of tuples

In this example we only set the marginal values.

import gams.transfer as gt
import pandas as pd
m = gt.Container()
# here we define a greater than or equal (=G=) equation
i = gt.Equation(
    m,
    "i",
    "geq",
    domain=["*"],
    records=pd.DataFrame(
        data=[("i" + str(i), i) for i in range(5)], columns=["domain", "marginal"]
    ),
)
In [1]: i.type
Out[1]: 'geq'
In [2]: i.records
Out[2]:
uni_0 level marginal lower upper scale
0 i0 0.0 0.0 -inf inf 1.0
1 i1 0.0 1.0 -inf inf 1.0
2 i2 0.0 2.0 -inf inf 1.0
3 i3 0.0 3.0 -inf inf 1.0
4 i4 0.0 4.0 -inf inf 1.0
Example #3 - Create a 1D Equation (defined over a set) from a list of tuples
import gams.transfer as gt
import pandas as pd
m = gt.Container()
i = gt.Set(m, "i", ["*"], records=["i" + str(i) for i in range(5)])
# here we define a less than or equal (=L=) equation
e = gt.Equation(
    m,
    "e",
    "leq",
    domain=i,
    records=pd.DataFrame(
        data=[("i" + str(i), i) for i in range(5)], columns=["domain", "marginal"]
    ),
)
In [1]: e.type
Out[1]: 'leq'
In [2]: e.records
Out[2]:
i_0 level marginal lower upper scale
0 i0 0.0 0.0 -inf inf 1.0
1 i1 0.0 1.0 -inf inf 1.0
2 i2 0.0 2.0 -inf inf 1.0
3 i3 0.0 3.0 -inf inf 1.0
4 i4 0.0 4.0 -inf inf 1.0
Example #4 - Create a 2D equation, specifying no numerical data
import gams.transfer as gt
import pandas as pd
m = gt.Container()
e = gt.Equation(
    m,
    "e",
    "eq",
    ["*", "*"],
    records=pd.DataFrame([("seattle", "san-diego"), ("chicago", "madison")]),
)
In [1]: e.records
Out[1]:
uni_0 uni_1 level marginal lower upper scale
0 seattle san-diego 0.0 0.0 -inf inf 1.0
1 chicago madison 0.0 0.0 -inf inf 1.0
Example #5 - Create a 2D equation (defined over a set) from a matrix
import gams.transfer as gt
import pandas as pd
import numpy as np
m = gt.Container()
i = gt.Set(m, "i", ["*"], records=["i" + str(i) for i in range(5)])
j = gt.Set(m, "j", ["*"], records=["j" + str(i) for i in range(5)])
a = gt.Parameter(
    m,
    "a",
    [i, j],
    records=[("i" + str(i), "j" + str(j), i + j) for i in range(5) for j in range(5)],
)
# create a nonbinding (=N=) equation and set the level and marginal attributes from matrices
e = gt.Equation(
    m, "e", "nonbinding", domain=[i, j], records={"level": a.toDense(), "marginal": a.toDense()}
)
In [1]: e.records
Out[1]:
i_0 j_1 level marginal lower upper scale
0 i0 j1 1.0 1.0 -inf inf 1.0
1 i0 j2 2.0 2.0 -inf inf 1.0
2 i0 j3 3.0 3.0 -inf inf 1.0
3 i0 j4 4.0 4.0 -inf inf 1.0
4 i1 j0 1.0 1.0 -inf inf 1.0
5 i1 j1 2.0 2.0 -inf inf 1.0
6 i1 j2 3.0 3.0 -inf inf 1.0
7 i1 j3 4.0 4.0 -inf inf 1.0
8 i1 j4 5.0 5.0 -inf inf 1.0
9 i2 j0 2.0 2.0 -inf inf 1.0
10 i2 j1 3.0 3.0 -inf inf 1.0
11 i2 j2 4.0 4.0 -inf inf 1.0
12 i2 j3 5.0 5.0 -inf inf 1.0
13 i2 j4 6.0 6.0 -inf inf 1.0
14 i3 j0 3.0 3.0 -inf inf 1.0
15 i3 j1 4.0 4.0 -inf inf 1.0
16 i3 j2 5.0 5.0 -inf inf 1.0
17 i3 j3 6.0 6.0 -inf inf 1.0
18 i3 j4 7.0 7.0 -inf inf 1.0
19 i4 j0 4.0 4.0 -inf inf 1.0
20 i4 j1 5.0 5.0 -inf inf 1.0
21 i4 j2 6.0 6.0 -inf inf 1.0
22 i4 j3 7.0 7.0 -inf inf 1.0
23 i4 j4 8.0 8.0 -inf inf 1.0
# if not specified, the toDense() method will convert the level values to a matrix
In [2]: e.toDense()
Out[2]:
array([[0., 1., 2., 3., 4.],
[1., 2., 3., 4., 5.],
[2., 3., 4., 5., 6.],
[3., 4., 5., 6., 7.],
[4., 5., 6., 7., 8.]])
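
The way missing attributes are filled in with the default record values can be pictured with a small pure-Python sketch. This is not Transfer's implementation: the helper name fill_equation_defaults is hypothetical, and only the default values themselves are taken from the text above.

```python
import math

# Default record values for equations, as listed in this section
EQUATION_DEFAULTS = {
    "level": 0.0,
    "marginal": 0.0,
    "lower": -math.inf,
    "upper": math.inf,
    "scale": 1.0,
}


def fill_equation_defaults(record):
    """Return a complete attribute record (hypothetical helper, not part of Transfer)."""
    unknown = set(record) - set(EQUATION_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown attributes: {sorted(unknown)}")
    # user-supplied values override the defaults; everything is stored as float
    return {
        name: float(record.get(name, default))
        for name, default in EQUATION_DEFAULTS.items()
    }


# only the marginal is supplied, the other four attributes take their defaults
full = fill_equation_defaults({"marginal": 2})
```

In this sketch the integer 2 is also converted to 2.0, in line with the requirement that numeric record data be of type float.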

Directly Set Records

As with sets, parameters and variables, the primary advantage of the setRecords method is that Transfer will convert many different (and convenient) data types into the standard data format (a Pandas DataFrame). Users that require higher performance will want to directly pass the Container a reference to a valid Pandas DataFrame, thereby skipping some of these computational steps. This places more burden on the user to pass the data in a valid standard form, but it speeds up the process of setting records and it avoids making a copy of the data in memory. In this section we walk the user through an example of how to set records directly.

Example #1 - Correctly set records (directly)
import gams.transfer as gt
import pandas as pd
import numpy as np
df = pd.DataFrame(
    data=[
        ("h" + str(h), "m" + str(m), "s" + str(s))
        for h in range(8760)
        for m in range(60)
        for s in range(60)
    ],
    columns=["h_0", "m_1", "s_2"],
)
# it is necessary to specify all equation attributes if setting records directly
# NOTE: all numeric data must be type float
df["level"] = np.random.uniform(0, 100, len(df))
df["marginal"] = 0.0
df["lower"] = gt.SpecialValues.NEGINF
df["upper"] = gt.SpecialValues.POSINF
df["scale"] = 1.0
m = gt.Container()
hrs = gt.Set(m, "h", records=df["h_0"].unique())
mins = gt.Set(m, "m", records=df["m_1"].unique())
secs = gt.Set(m, "s", records=df["s_2"].unique())
df["h_0"] = df["h_0"].astype(hrs.records["uni_0"].dtype)
df["m_1"] = df["m_1"].astype(mins.records["uni_0"].dtype)
df["s_2"] = df["s_2"].astype(secs.records["uni_0"].dtype)
a = gt.Equation(m, "a", "eq", domain=[hrs, mins, secs])
# set records
a.records = df
In [1]: a.isValid()
Out[1]: True
Attention
All numeric data in the records will need to be type float in order to maintain a valid symbol.

In this example we create a large equation (31,536,000 records and 8880 unique domain elements) and assign its records directly with a.records. Transfer requires that all domain columns be of a categorical data type; furthermore, this categorical must be ordered. The records setter function does very little work other than checking if the object being set is a DataFrame. This places more responsibility on the user to create a DataFrame that complies with the standard format. In Example #1 we take care to properly reference the categorical data types from the domain sets – and in the end a.isValid() = True. As with Sets and Parameters, users can use the .isValid(verbose=True) method to debug any structural issues.
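
The two structural requirements named above (ordered categorical domain columns and float attribute columns) can be checked with plain Pandas before the DataFrame is handed to the records setter. The following is a minimal sketch that does not use Transfer at all; the label list and column names are made up for illustration.

```python
import pandas as pd

# hypothetical UEL list for a single domain set (not taken from the example above)
uels = ["h0", "h1", "h2"]
domain_dtype = pd.CategoricalDtype(categories=uels, ordered=True)

df = pd.DataFrame({"h_0": ["h2", "h0", "h1"], "level": [1, 2, 3]})

# domain column: ordered categorical; attribute column: float
df["h_0"] = df["h_0"].astype(domain_dtype)
df["level"] = df["level"].astype(float)

is_ordered_categorical = (
    isinstance(df["h_0"].dtype, pd.CategoricalDtype) and df["h_0"].dtype.ordered
)
is_float = df["level"].dtype == float
```

Each label is stored as an integer code that refers to its position in the category list, which is why the order of the categories matters.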

Generate Equation Records

Generating the initial pandas.DataFrame object could be difficult for Equation symbols that have a large number of records and a small number of UELs – these higher dimensional symbols will benefit from the generateRecords convenience function. Internally, generateRecords computes the dense Cartesian product of all the domain sets that define a symbol (generateRecords will only work on symbols where <symbol>.domain_type == "regular").

Example #1 - Create a large (dense) 4D equation
import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=[f"i{i}" for i in range(50)])
j = gt.Set(m, "j", records=[f"j{i}" for i in range(50)])
k = gt.Set(m, "k", records=[f"k{i}" for i in range(50)])
l = gt.Set(m, "l", records=[f"l{i}" for i in range(50)])
# create and define the symbol `a` with `regular` domains
a = gt.Equation(m, "a", "eq", [i, j, k, l])
# generate the records
a.generateRecords()
In [1]: a.isValid()
Out[1]: True
In [2]: a.records
Out[2]:
i_0 j_1 k_2 l_3 level marginal lower upper scale
0 i0 j0 k0 l0 0.470248 0.0 -inf inf 1.0
1 i0 j0 k0 l1 0.924286 0.0 -inf inf 1.0
2 i0 j0 k0 l2 0.347550 0.0 -inf inf 1.0
3 i0 j0 k0 l3 0.937009 0.0 -inf inf 1.0
4 i0 j0 k0 l4 0.050716 0.0 -inf inf 1.0
... ... ... ... ... ... ... ... ... ...
6249995 i49 j49 k49 l45 0.385032 0.0 -inf inf 1.0
6249996 i49 j49 k49 l46 0.029305 0.0 -inf inf 1.0
6249997 i49 j49 k49 l47 0.440716 0.0 -inf inf 1.0
6249998 i49 j49 k49 l48 0.432931 0.0 -inf inf 1.0
6249999 i49 j49 k49 l49 0.157107 0.0 -inf inf 1.0
[6250000 rows x 9 columns]
Note
In Example #1 a large 4D equation was generated – by default, only the level values of these records are randomly drawn from the interval [0,1] (uniform distribution). Other equation attributes take the default record value.

As with Variables, it is possible to generate a sparse equation with the densities argument to generateRecords. We extend this example by passing our own custom func argument that will control the behavior of the value columns. The func argument accepts a dict of callables (i.e., references to functions).

Example #2 - Create a large (sparse) 4D equation with normally distributed values
import gams.transfer as gt
import numpy as np
# create a custom function to pass to `generateRecords`
def level_dist(size):
    return np.random.normal(loc=10.0, scale=2.3, size=size)
def marginal_dist(size):
    return np.random.normal(loc=0.5, scale=0.1, size=size)
m = gt.Container()
i = gt.Set(m, "i", records=[f"i{i}" for i in range(50)])
j = gt.Set(m, "j", records=[f"j{i}" for i in range(50)])
k = gt.Set(m, "k", records=[f"k{i}" for i in range(50)])
l = gt.Set(m, "l", records=[f"l{i}" for i in range(50)])
# create and define the symbol `a` with `regular` domains
a = gt.Equation(m, "a", "eq", [i, j, k, l])
# generate the records
a.generateRecords(densities=0.05, func={"level":level_dist, "marginal":marginal_dist})
In [1]: a.isValid()
Out[1]: True
In [2]: a.records
Out[2]:
i_0 j_1 k_2 l_3 level marginal lower upper scale
0 i0 j0 k0 l36 11.105235 0.468989 -inf inf 1.0
1 i0 j0 k0 l40 5.697361 0.478019 -inf inf 1.0
2 i0 j0 k1 l17 11.900784 0.473814 -inf inf 1.0
3 i0 j0 k1 l24 10.105931 0.456925 -inf inf 1.0
4 i0 j0 k1 l31 8.444142 0.490966 -inf inf 1.0
... ... ... ... ... ... ... ... ... ...
312495 i49 j49 k47 l17 11.523186 0.508001 -inf inf 1.0
312496 i49 j49 k47 l20 9.341183 0.739237 -inf inf 1.0
312497 i49 j49 k47 l26 10.705808 0.581103 -inf inf 1.0
312498 i49 j49 k47 l32 7.910963 0.479655 -inf inf 1.0
312499 i49 j49 k49 l8 11.800414 0.628040 -inf inf 1.0
[312500 rows x 9 columns]
In [3]: a.records["level"].mean()
Out[3]: 10.004072307451391
In [4]: a.records["level"].std()
Out[4]: 2.292569938350144
In [5]: a.records["marginal"].mean()
Out[5]: 0.49970172269778
In [6]: a.records["marginal"].std()
Out[6]: 0.09998772109802055
Note
The custom callable function reference must expose a size argument. It might be tedious to know the exact number of the records that will be generated, especially if a fractional density is specified; therefore, the generateRecords method will pass in the correct size automatically. Users are encouraged to use the Numpy suite of random distributions when generating samples – custom functions have the potential to be computationally burdensome if a symbol has a large number of records.
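
To make the interplay of the Cartesian product, the density and the size argument concrete, here is a rough pure-Python sketch. It is only an illustration under stated assumptions: generate_records_sketch and constant_level are hypothetical names, a single level column stands in for all attributes, and the way Transfer actually samples the sparse records is not described in this document.

```python
import itertools
import random


def generate_records_sketch(domains, density=1.0, func=None, seed=0):
    """Hypothetical stand-in for generateRecords (not Transfer's implementation)."""
    rng = random.Random(seed)
    dense = list(itertools.product(*domains))  # dense Cartesian product
    size = int(len(dense) * density)           # number of records to keep
    # sample positions without replacement, then restore the product order
    keep = sorted(rng.sample(range(len(dense)), size))
    if func is None:
        levels = [rng.random() for _ in range(size)]  # uniform on [0, 1)
    else:
        levels = list(func(size=size))  # the callable receives `size` automatically
    return [dense[pos] + (lvl,) for pos, lvl in zip(keep, levels)]


def constant_level(size):
    # custom callables must expose a `size` argument
    return [10.0] * size


i = [f"i{n}" for n in range(4)]
j = [f"j{n}" for n in range(4)]
records = generate_records_sketch([i, j], density=0.25, func=constant_level)
```

With two domains of four labels each and a density of 0.25, the sketch keeps 4 of the 16 possible records. The caller never has to compute that number, which is the point of passing size into the callable.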

Alias

There are two different ways to create a GAMS alias and add it to a Container.

  1. Use Alias constructor
  2. Use the Container method addAlias (which internally calls the Alias constructor)

Constructor

Constructor Arguments
Argument Type Description Required Default
container Container A reference to the Container object that the symbol is being added to Yes -
name str Name of symbol Yes -
alias_with Set object set object from which to create an alias Yes -
Example - Creating an alias from a set

Transfer only stores the reference to the parent set as part of the alias structure – most properties that are called from an alias object simply point to the properties of the parent set (with the exception of ref_container, name, and alias_with). It is possible to create an alias from another alias object. In this case a recursive search will be performed to find the root parent set – this is the set that will ultimately be stored as the alias_with property. We can see this behavior in the following example:

import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=["i" + str(i) for i in range(5)])
ip = gt.Alias(m, "ip", i)
ipp = gt.Alias(m, "ipp", ip)
In [1]: ip.alias_with.name
Out[1]: 'i'
In [2]: ipp.alias_with.name
Out[2]: 'i'
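
The root-finding behavior shown above can be sketched without Transfer. The classes SetSketch and AliasSketch and the helper find_root are hypothetical and only model the alias_with lookup.

```python
class SetSketch:
    def __init__(self, name):
        self.name = name


class AliasSketch:
    def __init__(self, name, alias_with):
        self.name = name
        # store the root parent set, even if another alias was passed in
        self.alias_with = find_root(alias_with)


def find_root(symbol):
    """Follow alias links recursively until a non-alias object is reached."""
    if isinstance(symbol, AliasSketch):
        return find_root(symbol.alias_with)
    return symbol


i = SetSketch("i")
ip = AliasSketch("ip", i)
ipp = AliasSketch("ipp", ip)  # alias of an alias still points at the set i
```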

Properties

Property Description Type Special Setter Behavior
alias_with aliased object Set -
description description of symbol str -
dimension dimension of symbol int setting is a shorthand notation to create ["*"] * n domains in symbol
domain_forwarding flag that forces set elements to be recursively included in all parent sets (i.e., implicit set growth) bool no effect after records have been set
domain list of domains given either as string (* for universe set) or as reference to the Set/Alias object list -
domain_labels column headings for the records DataFrame list of str -
domain_names string version of domain names list of str -
domain_type none, relaxed or regular depending on state of domain links str -
is_singleton if symbol is a singleton set bool -
modified flag that identifies if the Alias has been modified bool -
name name of symbol str sets the GAMS name of the symbol
number_records number of symbol records (i.e., returns len(self.records) if not None) int -
records the main symbol records pandas.DataFrame responsive to domain_forwarding state
ref_container reference to the Container that the symbol belongs to Container -
summary output a dict of only the metadata dict -

Methods

Method Description Arguments/Defaults Returns
equals Used to compare the symbol to another symbol. If check_uels=True then check both used and unused UELs and confirm same order, otherwise only check used UELs in data and do not check UEL order. If check_element_text=True then check that all set elements have the same descriptive element text, otherwise skip. If check_meta_data=True then check that symbol name and description are the same, otherwise skip. rtol (relative tolerance) and atol (absolute tolerance) are ignored for set symbols. If verbose=True will return an exception from the asserter describing the nature of the difference. columns (ignored)
check_uels=True (bool)
check_element_text=True (bool)
check_meta_data=True (bool)
rtol=0.0 (ignored)
atol=0.0 (ignored)
verbose=False (bool)
bool
pivot Convenience function to pivot records into a new shape (only symbols with >1D can be pivoted). If index is None then it is set to dimensions [0..dimension-1]. If columns is None then it is set to the last dimension. The argument value is ignored for aliases. Missing values in the pivot will take the value provided by fill_value index=None (str, list, None)
columns=None (str, list, None)
fill_value=None (int, float, str)
pd.DataFrame
addUELs adds UELs to the parent set dimensions. If dimensions is None then add UELs to all dimensions. ** All trailing whitespace is trimmed ** uels (str, list)
dimensions=None (int, list, None)
None
getUELs gets UELs from the parent set dimensions. If dimensions is None then get UELs from all dimensions (maintains order). The argument codes accepts a list of str UELs and will return the corresponding int; must specify a single dimension if passing codes. Returns only UELs in the data if ignore_unused=True, otherwise return all UELs. dimensions=None (int, list, None)
codes=None (int, list, None)
ignore_unused=False (bool)
list
setUELs set the UELs for parent set dimensions. If dimensions is None then set UELs for all dimensions. If rename=True, then the old UEL names will be renamed with the new UEL names. ** All trailing whitespace is trimmed ** uels (str, list)
dimensions=None (int, list, None)
rename=False (bool)
None
removeUELs removes UELs that appear in the parent set dimensions. If uels is None then remove all unused UELs (categories). If dimensions is None then operate on all dimensions. uels=None (str, list, None)
dimensions=None (int, list, None)
bool
renameUELs renames UELs (case-sensitive) that appear in the parent set dimensions. If dimensions is None then operate on all dimensions of the symbol. If allow_merge=True, the categorical object will be re-created to offer additional data flexibility. ** All trailing whitespace is trimmed ** uels (str, list, dict)
dimensions (int, list, None)
allow_merge=False (bool)
None
reorderUELs reorders the UELs in the parent set dimensions. If dimensions is None then reorder UELs in all dimensions of the parent set. uels (str, list, dict)
dimensions (int, list, None)
None
hasDomainViolations returns True if there are domain violations in the records of the parent set, returns False if not. - bool
countDomainViolations returns the count of how many records in the parent set contain at least one domain violation - int
dropDomainViolations drop records from the parent set that contain a domain violation - None
findDomainViolations get a view of the records DataFrame that contain any domain violations - pandas.DataFrame
getDomainViolations returns a list of DomainViolation objects if any (None otherwise) - list or None
hasDuplicateRecords returns True if there are (case insensitive) duplicate records in the parent set, returns False if not. - bool
countDuplicateRecords returns the count of how many (case insensitive) duplicate records exist in the parent set - int
dropDuplicateRecords drop records with (case insensitive) duplicate domains from the parent set – keep argument can take values of "first" (keeps the first instance of a duplicate record), "last" (keeps the last instance of a record), or False (drops all duplicates including the first and last) keep="first" None
findDuplicateRecords get a view of the records DataFrame from the parent set that contain any (case insensitive) duplicate domains – keep argument can take values of "first" (finds all duplicates while keeping the first instance as unique), "last" (finds all duplicates while keeping the last instance as unique), or False (finds all duplicates) keep="first" pandas.DataFrame
getCardinality get the full Cartesian product of the domain - int or None
getSparsity get the sparsity of the symbol w.r.t the cardinality - float or None
isValid checks if the symbol is in a valid format, throw exceptions if verbose=True, re-check a symbol if force=True verbose=False
force=True
bool
setRecords main convenience method to set standard pandas.DataFrame formatted records records (many types) None
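
The three values of the keep argument described for dropDuplicateRecords can be illustrated with a short pure-Python sketch. The function drop_duplicates_sketch is hypothetical, operates on a plain list of label tuples, and uses str.casefold as an assumed stand-in for the case-insensitive comparison.

```python
def drop_duplicates_sketch(records, keep="first"):
    """Drop case-insensitive duplicates from a list of domain tuples."""

    def key(rec):
        return tuple(label.casefold() for label in rec)

    if keep is False:
        # drop every record whose key occurs more than once
        counts = {}
        for rec in records:
            counts[key(rec)] = counts.get(key(rec), 0) + 1
        return [rec for rec in records if counts[key(rec)] == 1]

    # for "last", scan from the end and flip the result back afterwards
    ordered = records if keep == "first" else list(reversed(records))
    seen, kept = set(), []
    for rec in ordered:
        if key(rec) not in seen:
            seen.add(key(rec))
            kept.append(rec)
    return kept if keep == "first" else list(reversed(kept))


records = [("i1",), ("I1",), ("i2",)]
keep_first = drop_duplicates_sketch(records, keep="first")
keep_last = drop_duplicates_sketch(records, keep="last")
keep_none = drop_duplicates_sketch(records, keep=False)
```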

Adding Alias Records

The linked structure of Aliases offers some unique opportunities to access some of the setter functionality of the parent set. Specifically, Transfer allows the user to change the domain, description, dimension, and records of the underlying parent set as a shorthand notation. We can see this behavior if we look at a modified Example #1 from Adding Set Records.

Example - Creating set records through an alias link
import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i")
ip = gt.Alias(m, "ip", i)
ip.description = "adding new descriptive set text"
ip.domain = ["*", "*"]
ip.setRecords([("i" + str(i), "j" + str(j)) for i in range(3) for j in range(3)])
In [1]: i.description
Out[1]: 'adding new descriptive set text'
In [2]: i.domain
Out[2]: ['*', '*']
In [3]: i.records
Out[3]:
uni_0 uni_1 element_text
0 i0 j0
1 i0 j1
2 i0 j2
3 i1 j0
4 i1 j1
5 i1 j2
6 i2 j0
7 i2 j1
8 i2 j2
Note
An alias .isValid()=True when the underlying parent set is also valid – if the parent set is removed from the Container the alias will no longer be valid.
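
One way to picture the pass-through setters is a property that forwards reads and writes to the parent object. The sketch below is an assumption about the general pattern, not a description of Transfer's internals; ParentSetSketch and AliasLinkSketch are hypothetical classes and only the description property is modeled.

```python
class ParentSetSketch:
    def __init__(self, name, description=""):
        self.name = name
        self.description = description


class AliasLinkSketch:
    def __init__(self, name, alias_with):
        self.name = name              # stored on the alias itself
        self.alias_with = alias_with  # only a reference to the parent is kept

    @property
    def description(self):
        return self.alias_with.description

    @description.setter
    def description(self, text):
        # writing through the alias changes the parent set
        self.alias_with.description = text


i = ParentSetSketch("i")
ip = AliasLinkSketch("ip", i)
ip.description = "adding new descriptive set text"
```

After the assignment, i.description holds the new text while ip.name is still the alias's own name.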

UniverseAlias

There are two different ways to create a GAMS UniverseAlias (an alias to the universe) and add it to a Container.

  1. Use UniverseAlias constructor
  2. Use the Container method addUniverseAlias (which internally calls the UniverseAlias constructor)

Constructor

Constructor Arguments
Argument Type Description Required Default
container Container A reference to the Container object that the symbol is being added to Yes -
name str Name of symbol Yes -
Example - Creating an alias to the universe

In GAMS it is possible to create aliases to the universe (i.e., the entire list of UELs) with the syntax:

set i / i1, i2 /;
alias(h,*);
set j / j1, j2 /;

In this small example, h would be associated with all four UELs (i1, i2, j1 and j2) even though set j was defined after the alias declaration. Transfer mimics this behavior with the UniverseAlias class. Internally, the records attribute will always call the <Container>.getUELs() and build the Pandas DataFrame on the fly. The UniverseAlias class is fundamentally different from the Alias class because it does not point to a parent set at all; it is not possible to perform operations (like setRecords or findDomainViolations) on the parent set through a UniverseAlias (because there is no parent set). This means that a UniverseAlias can be created by only defining the symbol name. We can see this behavior in the following example:

import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=["i1", "i2"])
h = gt.UniverseAlias(m, "h")
j = gt.Set(m, "j", records=["j1", "j2"])
# -- alternative syntax --
# m = gt.Container()
# m.addSet("i", records=["i1", "i2"])
# m.addUniverseAlias("h")
# m.addSet("j", records=["j1", "j2"])
In [1]: m.data
Out[1]: {'i': <src.gamstransfer.Set object at 0x7fc5e0cbbf70>, 'h': <src.gamstransfer.UniverseAlias object at 0x7fc5e0cbb760>, 'j': <src.gamstransfer.Set object at 0x7fc5c060d630>}
In [2]: h.records
Out[2]:
0 i1
1 i2
2 j1
3 j2
Note
Unlike other sets, the universe does not hold on to set element_text, thus the returned DataFrame for the UniverseAlias will only have 1 column.
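
The on-the-fly construction of the records can be mimicked with a computed property. In this sketch ContainerSketch and UniverseAliasSketch are hypothetical classes, sets are reduced to plain lists of labels, and the universe is assumed to be the ordered, de-duplicated union of those labels.

```python
class ContainerSketch:
    def __init__(self):
        self.data = {}  # symbol name -> list of labels (or a universe alias)

    def add_set(self, name, labels):
        self.data[name] = list(labels)

    def get_uels(self):
        # ordered, de-duplicated union of all labels currently in the container
        uels = {}
        for symbol in self.data.values():
            if isinstance(symbol, list):
                uels.update(dict.fromkeys(symbol))
        return list(uels)


class UniverseAliasSketch:
    def __init__(self, container, name):
        self.container = container
        self.name = name
        container.data[name] = self

    @property
    def records(self):
        # rebuilt on every access, so later additions are always visible
        return self.container.get_uels()


m = ContainerSketch()
m.add_set("i", ["i1", "i2"])
h = UniverseAliasSketch(m, "h")
m.add_set("j", ["j1", "j2"])  # added after h was created
universe = h.records
```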

Properties

Property Description Type Special Setter Behavior
alias_with always * str -
description always Aliased with * str -
dimension always 1 int -
domain always ["*"] list of str -
domain_labels always ["*"] list of str -
domain_names always ["*"] list of str -
domain_type always none str -
is_singleton always False bool -
modified flag that identifies if the UniverseAlias has been modified bool -
name name of symbol str sets the GAMS name of the symbol
number_records number of symbol records (i.e., returns len(records) if not None) int -
records the main symbol records pandas.DataFrame -
ref_container reference to the Container that the symbol belongs to Container -
summary output a dict of only the metadata dict -

Methods

Method Description Arguments/Defaults Returns
equals Used to compare the symbol to another symbol. If check_uels=True then check both used and unused UELs and confirm same order, otherwise only check used UELs in data and do not check UEL order. If check_element_text=True then check that all set elements have the same descriptive element text, otherwise skip. If check_meta_data=True then check that symbol name and description are the same, otherwise skip. rtol (relative tolerance) and atol (absolute tolerance) are ignored for set symbols. If verbose=True will return an exception from the asserter describing the nature of the difference. columns (ignored)
check_uels=True (bool)
check_element_text=True (bool)
check_meta_data=True (bool)
rtol=0.0 (ignored)
atol=0.0 (ignored)
verbose=False (bool)
bool
pivot Convenience function to pivot records into a new shape (only symbols with >1D can be pivoted). If index is None then it is set to dimensions [0..dimension-1]. If columns is None then it is set to the last dimension. The argument value is ignored for aliases. Missing values in the pivot will take the value provided by fill_value index=None (str, list, None)
columns=None (str, list, None)
fill_value=None (int, float, str)
pd.DataFrame
getUELs gets UELs from the Container. Returns only UELs in the data if ignore_unused=True, otherwise return all UELs. ignore_unused=False (bool) list
getCardinality returns len(records) - int or None
getSparsity always 0.0 - float
isValid checks if the symbol is in a valid format, throw exceptions if verbose=True, re-check a symbol if force=True verbose=False
force=True
bool

DomainViolation

DomainViolation objects are convenient containers that store information about the location of domain violations in a symbol. These objects are computed dynamically with the getDomainViolations method and should not be instantiated by the user (they are read-only, to the extent that this is possible in Python). However, the user may be interested in some of the information that they contain.

Constructor

Constructor Arguments/Properties
Argument Type Description Required Default
symbol _Symbol A reference to the _Symbol object that has a domain violation Yes -
dimension int An index to the dimension of the symbol where the domain violation exists Yes -
domain Set, Alias or UniverseAlias A reference to the symbol domain that is the source of the domain violation Yes -
violations list A list of all the domain elements that are causing violations Yes -
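
A frozen dataclass is one common way to get an object that is read-only "to the extent that this is possible in Python". The sketch below is illustrative only: DomainViolationSketch and get_domain_violations_sketch are hypothetical, symbols and domains are represented by their names, and labels are compared exactly (how Transfer treats letter case here is not covered in this section).

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class DomainViolationSketch:
    symbol: str
    dimension: int
    domain: str
    violations: tuple


def get_domain_violations_sketch(symbol_name, records, domains):
    """records: list of label tuples; domains: one (name, labels) pair per dimension."""
    found = []
    for dim, (domain_name, labels) in enumerate(domains):
        allowed = set(labels)
        bad = []
        for rec in records:
            if rec[dim] not in allowed and rec[dim] not in bad:
                bad.append(rec[dim])
        if bad:
            found.append(DomainViolationSketch(symbol_name, dim, domain_name, tuple(bad)))
    return found or None  # a list of objects if any, None otherwise


records = [("i0", "j0"), ("i9", "j0"), ("i0", "j7")]
domains = [("i", ["i0", "i1"]), ("j", ["j0"])]
dvs = get_domain_violations_sketch("a", records, domains)
```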

ConstContainer (Rapid Read)

In the Container section we describe how to use the main object class of Transfer – the Container. Many users of Transfer will rely on the Container for building their data pipeline; however, some users will only be interested in post-processing data from a GAMS model run. This one-directional flow of data means that these users do not need some of the advanced Container features such as domain linking, matrix generation, domain checking, etc. The ConstContainer (i.e., a Constant Container) object class is a data-focused read-only object that will provide a snapshot of the data target being read – the ConstContainer can be created by reading a GDX file or a GamsDatabase/GMD object (an in-memory representation of data used e.g. in embedded Python code).

The ConstContainer shares many of the same methods and attributes that are in the Container class, which makes moving between the ConstContainer and the Container very simple. There are some important differences though:

  1. The ConstContainer does not link any symbol data
  2. The ConstContainer can only read from one source at a time – every new call of .read() will clear the data dictionary
  3. The ConstContainer constructor will not read in any symbol records – this enables users to browse an unknown data source quickly (similar behavior to gdxdump).
  4. The ConstContainer does not have a .write() method – a ConstContainer can be passed to the constructor of a Container which will enable data writing (however a copy of the data will be generated).
  5. The user will never need to instantiate a symbol object and add it to the ConstContainer – the ConstContainer will internally generate its own set of (simplified) symbol classes and hold them in the .data attribute.

All of these differences were inspired by users that want to read the data as fast as possible and probe unknown data files without worrying about memory issues – ConstContainer provides users with a high level view of the data very quickly.
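
Differences 2 and 3 in the list above (each read clears the data dictionary, and the constructor reads metadata only) can be modeled in a few lines. This is a loose sketch under clear assumptions: SnapshotSketch is a hypothetical class, and a plain dict stands in for the GDX file or database being read.

```python
class SnapshotSketch:
    """Hypothetical read-only snapshot; `source` is a dict standing in for a GDX file."""

    def __init__(self, source=None):
        self.data = {}
        if source is not None:
            self.read(source, records=False)  # constructor loads metadata only

    def read(self, source, symbols="all", records=True):
        self.data = {}  # every read starts from an empty data dictionary
        if symbols == "all":
            names = list(source)
        elif isinstance(symbols, str):
            names = [symbols]
        else:
            names = list(symbols)
        for name in names:
            entry = {"number_records": len(source[name])}
            entry["records"] = list(source[name]) if records else None
            self.data[name] = entry


gdx_like = {"i": ["i1", "i2"], "a": [("i1", 1.0)]}
h = SnapshotSketch(gdx_like)
browsed = {name: sym["number_records"] for name, sym in h.data.items()}
metadata_only = all(sym["records"] is None for sym in h.data.values())
h.read(gdx_like, symbols=["a"])  # replaces the earlier snapshot
```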

ConstContainer constructor

Creating a ConstContainer is a simple matter of initializing an object. For example:

import gams.transfer as gt
h = gt.ConstContainer("out.gdx")
Note
This new ConstContainer object, here called h, will load the symbol metadata from out.gdx but it will not load any of the records. To load records, users must use the .read() method.

The ConstContainer constructor arguments are:

Argument Type Description Required Default
load_from str or GamsDatabase/GMD Object Points to the source of the data being read into the ConstContainer No None
system_directory str Absolute path to GAMS system_directory No Attempts to find the GAMS installation by creating a GamsWorkspace object and loading the system_directory attribute.

The ConstContainer contains many of the same methods that are in the Container class, specifically:

ConstContainer Methods
Method Description Arguments/Defaults Returns
describeAliases create a summary table with descriptive statistics for Aliases symbols=None (None, str, list) - if None, assumes all aliases pandas.DataFrame
describeParameters create a summary table with descriptive statistics for Parameters symbols=None (None, str, list) - if None, assumes all parameters pandas.DataFrame
describeEquations create a summary table with descriptive statistics for Equations symbols=None (None, str, list) - if None, assumes all equations pandas.DataFrame
describeSets create a summary table with descriptive statistics for Sets symbols=None (None, str, list) - if None, assumes all sets pandas.DataFrame
describeVariables create a summary table with descriptive statistics for Variables symbols=None (None, str, list) - if None, assumes all variables pandas.DataFrame
listAliases list all aliases - list
listEquations list all equations types=None (list of equation types) - if None, assumes all types list
listParameters list all parameters - list
listSets list all sets - list
listSymbols list all symbols - list
listVariables list all variables types=None (list of variable types) - if None, assumes all types list
read main method to read load_from, can be provided with a list of symbols to read in subsets, records controls if symbol records are loaded or just metadata load_from (str,GMD Object Handle,GamsDatabase Object)
symbols="all" (str, list)
records=True (bool)
None

The structure of the DataFrames that are returned from the describe* methods mirrors that in the Container; the user should reference Describing Data for detailed descriptions of the columns.

ConstContainer Symbol Objects

The ConstContainer uses a simplified symbol class structure to hold symbol specific information. The user will never need to directly instantiate these symbol classes (called _ConstSet, _ConstParameter, _ConstVariable, _ConstEquation, _ConstAlias and _ConstUniverseAlias); the class names use the leading _ symbol to reinforce that these classes are private (and the user should not need to create these objects directly). This class structure is used to provide the feel of a read-only object.

While users do not need to instantiate any of the _Const* symbol objects directly, they are available for users to probe. Many of the same Container symbol methods that generate summary statistics exist for the ConstContainer symbols. Specifically:

_ConstSet Properties
Property Description Type
description description of symbol str
dimension dimension of symbol int
domain_labels column headings for the records DataFrame list of str
domain_names string version of domain names list of str
domain_type none, relaxed or regular depending on state of domain links str
is_singleton if symbol is a singleton set bool
name name of symbol str
number_records number of symbol records (i.e., returns len(self.records) if not None) int
records the main symbol records pandas.DataFrame
summary output a dict of only the metadata dict
_ConstSet Methods
Method Description Arguments/Defaults Returns
pivot Convenience function to pivot records into a new shape (only symbols with >1D can be pivoted). If index is None then it is set to dimensions [0..dimension-1]. If columns is None then it is set to the last dimension. The argument value is ignored for sets. Missing values in the pivot will take the value provided by fill_value index=None (str, list, None)
columns=None (str, list, None)
fill_value=None (int, float, str)
pd.DataFrame
getCardinality get the full Cartesian product of the domain - int or None
getSparsity get the sparsity of the symbol w.r.t the cardinality - float or None
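
As a worked example of the last two rows, take the sparse 4D equation generated earlier: four domain sets of 50 elements each and 312,500 records. The sketch below assumes that sparsity means the share of the full Cartesian product that holds no record; the function names are hypothetical and this definition is an assumption, not taken from the table.

```python
import math


def get_cardinality_sketch(domain_sizes):
    # full Cartesian product of the domain: product of the set sizes
    return math.prod(domain_sizes)


def get_sparsity_sketch(number_records, domain_sizes):
    # assumed definition: share of the Cartesian product that holds no record
    return 1 - number_records / get_cardinality_sketch(domain_sizes)


cardinality = get_cardinality_sketch([50, 50, 50, 50])
sparsity = get_sparsity_sketch(312_500, [50, 50, 50, 50])
```

Under this assumption the cardinality is 6,250,000 and a density of 0.05 corresponds to a sparsity of about 0.95.
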
_ConstParameter Properties
Property Description Type
description description of symbol str
dimension dimension of symbol int
domain_labels column headings for the records DataFrame list of str
domain_names string version of domain names list of str
domain_type none, relaxed or regular depending on state of domain links str
is_scalar True if the len(self.domain) = 0 bool
name name of symbol str
number_records number of symbol records (i.e., returns len(self.records) if not None) int
records the main symbol records pandas.DataFrame
summary output a dict of only the metadata dict
_ConstParameter Methods
Method Description Arguments/Defaults Returns
pivot Convenience function to pivot records into a new shape (only symbols with >1D can be pivoted). If index is None then it is set to dimensions [0..dimension-1]. If columns is None then it is set to the last dimension. The argument value is ignored for parameters. Missing values in the pivot will take the value provided by fill_value index=None (str, list, None)
columns=None (str, list, None)
fill_value=None (int, float, str)
pd.DataFrame
getCardinality get the full Cartesian product of the domain - int or None
getSparsity get the sparsity of the symbol w.r.t the cardinality - float or None
countEps total number of SpecialValues.EPS across all columns - int or None
countNA total number of SpecialValues.NA across all columns - int or None
countNegInf total number of SpecialValues.NEGINF across all columns - int or None
countPosInf total number of SpecialValues.POSINF across all columns - int or None
countUndef total number of SpecialValues.UNDEF across all columns - int or None
findEps find positions of SpecialValues.EPS in value column - pandas.DataFrame or None
findNA find positions of SpecialValues.NA in value column - pandas.DataFrame or None
findNegInf find positions of SpecialValues.NEGINF in value column - pandas.DataFrame or None
findPosInf find positions of SpecialValues.POSINF in value column - pandas.DataFrame or None
findUndef find positions of SpecialValues.UNDEF in value column - pandas.DataFrame or None
getMaxValue get the maximum value across all columns - float or None
getMinValue get the minimum value across all columns - float or None
getMeanValue get the mean value across all columns - float or None
getMaxAbsValue get the maximum absolute value across all columns - float or None
whereMax find the domain entry of records with a maximum value (return first instance only) - list of str or None
whereMaxAbs find the domain entry of records with a maximum absolute value (return first instance only) - list of str or None
whereMin find the domain entry of records with a minimum value (return first instance only) - list of str or None
_ConstVariable Properties
Property Description Type
description description of symbol str
dimension dimension of symbol int
domain_labels column headings for the records DataFrame list of str
domain_names string version of domain names list of str
domain_type none, relaxed or regular depending on state of domain links str
name name of symbol str
number_records number of symbol records (i.e., returns len(self.records) if not None) int
records the main symbol records pandas.DataFrame
summary output a dict of only the metadata dict
type str type of variable str
_ConstVariable Methods
Method Description Arguments/Defaults Returns
pivot Convenience function to pivot records into a new shape (only symbols with >1D can be pivoted). If index is None then it is set to dimensions [0..dimension-1]. If columns is None then it is set to the last dimension. If value is None then the level values will be pivoted. Missing values in the pivot will take the value provided by fill_value index=None (str, list, None)
columns=None (str, list, None)
value (str)
fill_value=None (int, float, str)
pd.DataFrame
getCardinality get the full Cartesian product of the domain - int or None
getSparsity get the sparsity of the symbol w.r.t the cardinality - float or None
countEps total number of SpecialValues.EPS across all columns columns="level" (str, list) int or None
countNA total number of SpecialValues.NA across all columns columns="level" (str, list) int or None
countNegInf total number of SpecialValues.NEGINF across all columns columns="level" (str, list) int or None
countPosInf total number of SpecialValues.POSINF across all columns columns="level" (str, list) int or None
countUndef total number of SpecialValues.UNDEF across all columns columns="level" (str, list) int or None
findEps find positions of SpecialValues.EPS in column column="level" (str) pandas.DataFrame or None
findNA find positions of SpecialValues.NA in column column="level" (str) pandas.DataFrame or None
findNegInf find positions of SpecialValues.NEGINF in column column="level" (str) pandas.DataFrame or None
findPosInf find positions of SpecialValues.POSINF in column column="level" (str) pandas.DataFrame or None
findUndef find positions of SpecialValues.UNDEF in column column="level" (str) pandas.DataFrame or None
getMaxValue get the maximum value across all columns columns="level" (str, list) float or None
getMinValue get the minimum value across all columns columns="level" (str, list) float or None
getMeanValue get the mean value across all columns columns="level" (str, list) float or None
getMaxAbsValue get the maximum absolute value across all columns columns="level" (str, list) float or None
whereMax find the domain entry of records with a maximum value (return first instance only) column="level" (str) list of str or None
whereMaxAbs find the domain entry of records with a maximum absolute value (return first instance only) column="level" (str) list of str or None
whereMin find the domain entry of records with a minimum value (return first instance only) column="level" (str) list of str or None
_ConstEquation Properties
Property Description Type
description description of symbol str
dimension dimension of symbol int
domain_labels column headings for the records DataFrame list of str
domain_names string version of domain names list of str
domain_type none, relaxed or regular depending on state of domain links str
name name of symbol str
number_records number of symbol records (i.e., returns len(self.records) if not None) int
records the main symbol records pandas.DataFrame
summary output a dict of only the metadata dict
type str type of equation str
_ConstEquation Methods
Method Description Arguments/Defaults Returns
pivot Convenience function to pivot records into a new shape (only symbols with >1D can be pivoted). If index is None then it is set to dimensions [0..dimension-1]. If columns is None then it is set to the last dimension. If value is None then the level values will be pivoted. Missing values in the pivot will take the value provided by fill_value index=None (str, list, None)
columns=None (str, list, None)
value (str)
fill_value=None (int, float, str)
pd.DataFrame
getCardinality get the full Cartesian product of the domain - int or None
getSparsity get the sparsity of the symbol w.r.t the cardinality - float or None
countEps total number of SpecialValues.EPS across all columns columns="level" (str, list) int or None
countNA total number of SpecialValues.NA across all columns columns="level" (str, list) int or None
countNegInf total number of SpecialValues.NEGINF across all columns columns="level" (str, list) int or None
countPosInf total number of SpecialValues.POSINF across all columns columns="level" (str, list) int or None
countUndef total number of SpecialValues.UNDEF across all columns columns="level" (str, list) int or None
findEps find positions of SpecialValues.EPS in column column="level" (str) pandas.DataFrame or None
findNA find positions of SpecialValues.NA in column column="level" (str) pandas.DataFrame or None
findNegInf find positions of SpecialValues.NEGINF in column column="level" (str) pandas.DataFrame or None
findPosInf find positions of SpecialValues.POSINF in column column="level" (str) pandas.DataFrame or None
findUndef find positions of SpecialValues.UNDEF in column column="level" (str) pandas.DataFrame or None
getMaxValue get the maximum value across all columns columns="level" (str, list) float or None
getMinValue get the minimum value across all columns columns="level" (str, list) float or None
getMeanValue get the mean value across all columns columns="level" (str, list) float or None
getMaxAbsValue get the maximum absolute value across all columns columns="level" (str, list) float or None
whereMax find the domain entry of records with a maximum value (return first instance only) column="level" (str) list of str or None
whereMaxAbs find the domain entry of records with a maximum absolute value (return first instance only) column="level" (str) list of str or None
whereMin find the domain entry of records with a minimum value (return first instance only) column="level" (str) list of str or None
_ConstAlias Properties
Property Description Type
alias_with parent set str
description description from the parent set str
dimension dimension of parent set int
domain_labels column headings for the records DataFrame list of str
domain_names string version of domain names list of str
domain_type none, relaxed or regular depending on state of domain links str
is_singleton True if the symbol is a singleton set bool
name name of symbol str
number_records number of symbol records from the parent set (i.e., returns len(self.records) if not None) int
records the main symbol records from the parent set pandas.DataFrame
summary output a dict of only the metadata from the parent set dict
_ConstAlias Methods
Method Description Arguments/Defaults Returns
pivot Convenience function to pivot records into a new shape (only symbols with >1D can be pivoted). If index is None then it is set to dimensions [0..dimension-1]. If columns is None then it is set to the last dimension. The argument value is ignored for aliases. Missing values in the pivot will take the value provided by fill_value index=None (str, list, None)
columns=None (str, list, None)
value (str)
fill_value=None (int, float, str)
pd.DataFrame
getCardinality get the full Cartesian product of the domain - int or None
getSparsity get the sparsity of the symbol w.r.t the cardinality - float or None
_ConstUniverseAlias Properties
Property Description Type
alias_with parent set str
description description from the parent set str
dimension dimension of parent set int
domain_labels column headings for the records DataFrame list of str
domain_names string version of domain names list of str
domain_type none, relaxed or regular depending on state of domain links str
is_singleton True if the symbol is a singleton set bool
name name of symbol str
number_records number of symbol records from the parent set (i.e., returns len(self.records) if not None) int
records the main symbol records from the parent set pandas.DataFrame
summary output a dict of only the metadata from the parent set dict
_ConstUniverseAlias Methods
Method Description Arguments/Defaults Returns
getCardinality get the full Cartesian product of the domain (by definition, this value is len(records)). - int
getSparsity get the sparsity of the symbol. By definition this value is always 0.0 for _ConstUniverseAlias. - 0.0
import gams.transfer as gt
h = gt.ConstContainer("trnsport.gdx")
In [1]: h.data
Out[1]:
{'i': <src.gamstransfer._ConstSet at 0x7fba484bf5b0>,
'j': <src.gamstransfer._ConstSet at 0x7fba484bffd0>,
'a': <src.gamstransfer._ConstParameter at 0x7fba484bf880>,
'b': <src.gamstransfer._ConstParameter at 0x7fba48b0f460>,
'd': <src.gamstransfer._ConstParameter at 0x7fba48b0ff40>,
'f': <src.gamstransfer._ConstParameter at 0x7fba48b0fa00>,
'c': <src.gamstransfer._ConstParameter at 0x7fba48b0f160>,
'x': <src.gamstransfer._ConstVariable at 0x7fba48b0f7c0>,
'z': <src.gamstransfer._ConstVariable at 0x7fba48b0f3a0>,
'cost': <src.gamstransfer._ConstEquation at 0x7fba48b0f7f0>,
'supply': <src.gamstransfer._ConstEquation at 0x7fba48b0fd30>,
'demand': <src.gamstransfer._ConstEquation at 0x7fba48b0fc70>}
In [2]: h.describeParameters()
Out[2]:
name is_scalar domain domain_type dim num_recs sparsity min_value mean_value max_value where_min where_max count_eps count_na count_undef
0 a False [i] regular 1 2 0.0 None None None None None None None None
1 b False [j] regular 1 3 0.0 None None None None None None None None
2 c False [i, j] regular 2 6 0.0 None None None None None None None None
3 d False [i, j] regular 2 6 0.0 None None None None None None None None
4 f True [] none 0 1 0.0 None None None None None None None None

Note that in this example we make use of the convenience notation contained in the constructor to read in only the metadata of the trnsport.gdx file. This allows users to quickly explore the symbols contained in a file (or in-memory object). It also explains why there are many None values in the output of the .describeParameters() method: no records have been read, so no statistics can be computed.

Example (reading all data w/ ConstContainer.read() method)
import gams.transfer as gt
h = gt.ConstContainer()
h.read("trnsport.gdx")
In [1]: h.data
Out[1]:
{'i': <src.gamstransfer._ConstSet at 0x7fba484bf5b0>,
'j': <src.gamstransfer._ConstSet at 0x7fba484bffd0>,
'a': <src.gamstransfer._ConstParameter at 0x7fba484bf880>,
'b': <src.gamstransfer._ConstParameter at 0x7fba48b0f460>,
'd': <src.gamstransfer._ConstParameter at 0x7fba48b0ff40>,
'f': <src.gamstransfer._ConstParameter at 0x7fba48b0fa00>,
'c': <src.gamstransfer._ConstParameter at 0x7fba48b0f160>,
'x': <src.gamstransfer._ConstVariable at 0x7fba48b0f7c0>,
'z': <src.gamstransfer._ConstVariable at 0x7fba48b0f3a0>,
'cost': <src.gamstransfer._ConstEquation at 0x7fba48b0f7f0>,
'supply': <src.gamstransfer._ConstEquation at 0x7fba48b0fd30>,
'demand': <src.gamstransfer._ConstEquation at 0x7fba48b0fc70>}
In [2]: h.describeParameters()
Out[2]:
name is_scalar domain domain_type dim num_recs sparsity min_value mean_value max_value where_min where_max count_eps count_na count_undef
0 a False [i] regular 1 2 0.0 350.000 475.000 600.000 [seattle] [san-diego] 0 0 0
1 b False [j] regular 1 3 0.0 275.000 300.000 325.000 [topeka] [new-york] 0 0 0
2 c False [i, j] regular 2 6 0.0 0.126 0.176 0.225 [san-diego, topeka] [seattle, new-york] 0 0 0
3 d False [i, j] regular 2 6 0.0 1.400 1.950 2.500 [san-diego, topeka] [seattle, new-york] 0 0 0
4 f True [] none 0 1 0.0 90.000 90.000 90.000 None None 0 0 0

In this example we make use of the .read() method to retrieve both the metadata and the numerical records for all symbols in the GDX file – the .describeParameters() method will now populate the DataFrame with additional summary statistics.

Additional Topics

Validating Data

Transfer requires that the records for all symbols exist in a standard format (Standard Data Formats) in order for them to be understood by the Container. It is certainly possible that the data could end up in a state that is inconsistent with the standard format (especially if setting symbol attributes directly). Transfer includes the .isValid() method in order to determine if a symbol is structurally valid – this method returns a bool. This method does not guarantee that a symbol will be successfully written to either GDX or GMD; other data errors (duplicate records, long UEL names, or domain violations) that are not tested in .isValid() could still exist.

For example, we create two valid sets and then check them with .isValid() to be sure.

Note
It is possible to run .isValid() on both the Container as well as the symbol object – when called on the Container, .isValid() returns False if any symbol in the Container is invalid.
Example (valid data)
import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=["seattle", "san-diego", "washington_dc"])
j = gt.Set(m, "j", i, records=["san-diego", "washington_dc"])
In [1]: i.isValid()
Out[1]: True
In [2]: j.isValid()
Out[2]: True
In [3]: m.isValid()
Out[3]: True

The .isValid() method checks:

  1. If the symbol belongs to a Container
  2. If all domain set symbols exist in the Container
  3. If all domain set symbol objects are valid
  4. If records are a DataFrame (or None)
  5. If the shape of the records is congruent with the dimensionality of the symbol
  6. If records column headings are in standard format
  7. If all domain columns are type category
  8. If all domain categories are type str
  9. If all data columns are type float

Comparing Symbols

Sparse GAMS data is inherently unordered. The concept of order in GAMS is governed by the order of the UELs in the universe set, not the order of the records. This differs from the sparse data structures that we use in Transfer (Pandas DataFrames) because each record (i.e., DataFrame row) has an index (typically 0..n) and can be sorted by this index. Said a slightly different way, two GDX files will be equivalent if their universe order is the same and the records are the same; when creating the GDX file, however, it is of no consequence what order the records are written in. Therefore, in order to calculate an equality between two symbols in Transfer we must perform a merge operation on the symbol domain labels – an operation that could be computationally expensive for large symbols.

Attention
The nature of symbol equality in Transfer means that a potentially expensive merge operation is performed. We do not recommend that the equals method be used inside loops or when speed is critical. It is, however, very useful for data debugging.

A quick example shows the syntax of equals:

m = gt.Container()
i = gt.Set(m, "i", records=[f"i{i}" for i in range(5)], description="set i")
j = gt.Set(m, "j", records=[f"i{i}" for i in range(5)], description="set j")
In [1]: i.equals(j)
Out[1]: False

By default, equals takes the strictest view of symbol "equality" – everything must be the same. In this case, the symbol names and descriptions differ between the two sets i and j. We can relax the view of equality with a combination of argument flags. Comparing the two symbols again, but ignoring the meta data (i.e., ignoring the symbol name, description and type (if a Variable or Equation)):

In [1]: i.equals(j, check_meta_data=False)
Out[1]: True

It is also possible to ignore the set element text in equals:

m = gt.Container()
i = gt.Set(m, "i", records=[(f"i{i}", "arlington") for i in range(5)])
j = gt.Set(m, "j", records=[f"i{i}" for i in range(5)])
In [1]: i.records
Out[1]:
uni_0 element_text
0 i0 arlington
1 i1 arlington
2 i2 arlington
3 i3 arlington
4 i4 arlington
In [2]: j.records
Out[2]:
uni_0 element_text
0 i0
1 i1
2 i2
3 i3
4 i4
In [3]: i.equals(j, check_meta_data=False, check_element_text=False)
Out[3]: True

The check_uels argument will ensure that the symbol "universe" is the same (in order and content) between two symbols, as illustrated in the following example:

m = gt.Container()
i = gt.Set(m, "i", records=["i1", "i2", "i3"])
ip = gt.Set(m, "ip", records=["i3", "i2", "i1"])

Clearly, the two sets i and ip have the same records, but the UEL order is different. If check_uels=True the resulting symbols will not be considered equal – turning this flag off results in equality.

In [1]: i.getUELs()
Out[1]: ['i1', 'i2', 'i3']
In [2]: ip.getUELs()
Out[2]: ['i3', 'i2', 'i1']
In [3]: i.equals(ip, check_meta_data=False)
Out[3]: False
In [4]: i.equals(ip, check_meta_data=False, check_uels=False)
Out[4]: True
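
The difference between record order and UEL order can be sketched in plain Python. This is only an illustration of the idea, not how equals is implemented; the helper records_match and its arguments are hypothetical, and the sketch assumes that neither symbol contains duplicate records.

```python
def records_match(recs_a, recs_b, uels_a, uels_b, check_uels=True):
    """Compare two toy symbols independent of row order (illustrative only)."""
    # Row order carries no meaning, so compare the records as sets of tuples.
    same_records = set(recs_a) == set(recs_b)
    # The UEL lists are compared in order, but only if the caller asks for it.
    same_uels = (list(uels_a) == list(uels_b)) if check_uels else True
    return same_records and same_uels


i_recs = [("i1",), ("i2",), ("i3",)]
ip_recs = [("i3",), ("i2",), ("i1",)]
i_uels = ["i1", "i2", "i3"]
ip_uels = ["i3", "i2", "i1"]

strict = records_match(i_recs, ip_recs, i_uels, ip_uels)
relaxed = records_match(i_recs, ip_recs, i_uels, ip_uels, check_uels=False)
print(strict, relaxed)
```

As in the example above, the two toy symbols differ only in UEL order, so they match once the UEL check is switched off.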

Numerical comparisons are enabled for Parameters, Variables and Equations – equality can be flexibly defined through the equals method arguments. Again, the strictest view of equality is taken as the default behavior of equals (no numerical tolerances, some limitations exist – see: numpy.isclose for more details).

m = gt.Container()
i = gt.Set(m, "i", records=["i1", "i2", "i3"])
a = gt.Parameter(m, "a", i, records=[("i1", 1), ("i2", 2), ("i3", 3)])
ap = gt.Parameter(m, "ap", i, records=[("i1", 1 + 1e-9), ("i2", 2), ("i3", 3)])
In [1]: a.equals(ap, check_meta_data=False)
Out[1]: False
In [2]: a.equals(ap, check_meta_data=False, atol=1e-8)
Out[2]: True
Attention
The numerical comparison is handled by numpy.isclose; more details can be found in the Numpy documentation.
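
For orientation, the test that numpy.isclose documents is |a - b| <= atol + rtol * |b|. The following pure-Python sketch applies that formula to the two values from the example above. The helper is_close is hypothetical, and its zero defaults are an assumption chosen to mirror the strict default behavior described here (numpy.isclose itself defaults to rtol=1e-05 and atol=1e-08).

```python
def is_close(a, b, rtol=0.0, atol=0.0):
    """Tolerance test in the form documented for numpy.isclose (illustrative)."""
    return abs(a - b) <= atol + rtol * abs(b)


# No tolerances: a difference of about 1e-9 is enough to make the values unequal.
exact = is_close(1.0, 1.0 + 1e-9)
# An absolute tolerance of 1e-8 absorbs the difference.
tolerant = is_close(1.0, 1.0 + 1e-9, atol=1e-8)
print(exact, tolerant)
```

Because the relative term is scaled by |b|, the test is not symmetric in a and b when rtol is nonzero.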

In the case of variables and equations, it is possible for the user to confine the numerical comparison to certain attributes (level, marginal, lower, upper and scale) by specifying the columns argument as the following example illustrates:

m = gt.Container()
a = gt.Variable(m, "a", "free", records=100)
ap = gt.Variable(m, "ap", "free", records=101)
In [1]: a.records
Out[1]:
level marginal lower upper scale
0 100.0 0.0 -inf inf 1.0
In [2]: ap.records
Out[2]:
level marginal lower upper scale
0 101.0 0.0 -inf inf 1.0
In [3]: a.equals(ap, check_meta_data=False)
Out[3]: False
In [4]: a.equals(ap, check_meta_data=False, columns="level")
Out[4]: False
In [5]: a.equals(ap, check_meta_data=False, columns="marginal")
Out[5]: True

Domain Forwarding

GAMS includes the ability to define sets directly from data using the implicit set notation (see: Implicit Set Definition (or: Domain Defining Symbol Declarations)). This notation has an analogue in Transfer called domain_forwarding.

Note
It is possible to recursively update a subset tree in Transfer.

Domain forwarding is available as an argument to all symbol object constructors; the user would simply need to pass domain_forwarding=True.

In this example we have raw data in the dist DataFrame and we want to send the domain information into the i and j sets – we take care to pass the set objects as the domain for parameter c.

import gams.transfer as gt
import pandas as pd
m = gt.Container()
i = gt.Set(m, "i")
j = gt.Set(m, "j")
dist = pd.DataFrame(
[
("seattle", "new-york", 2.5),
("seattle", "chicago", 1.7),
("seattle", "topeka", 1.8),
("san-diego", "new-york", 2.5),
("san-diego", "chicago", 1.8),
("san-diego", "topeka", 1.4),
],
columns=["from", "to", "thousand_miles"],
)
c = gt.Parameter(m, "c", [i, j], records=dist, domain_forwarding=True)
In [1]: i.records
Out[1]:
uni_0 element_text
0 seattle
1 san-diego
In [2]: j.records
Out[2]:
uni_0 element_text
0 new-york
1 chicago
2 topeka
In [3]: c.records
Out[3]:
i_0 j_1 value
0 seattle new-york 2.5
1 seattle chicago 1.7
2 seattle topeka 1.8
3 san-diego new-york 2.5
4 san-diego chicago 1.8
5 san-diego topeka 1.4
Note
The element order in the sets i and j mirrors that in the raw data.

In this example we show that domain forwarding will also work recursively to update the entire set lineage – the domain forwarding occurs at the creation of every symbol object. The correct order of elements in set i is [z, a, b, c] because the records from j are forwarded first, and then the records from k are propagated through (back to i).

import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i")
j = gt.Set(m, "j", i, records=["z"], domain_forwarding=True)
k = gt.Set(m, "k", j, records=["a", "b", "c"], domain_forwarding=True)
In [1]: i.records
Out[1]:
uni_0 element_text
0 z
1 a
2 b
3 c
In [2]: j.records
Out[2]:
i_0 element_text
0 z
1 a
2 b
3 c
In [3]: k.records
Out[3]:
j_0 element_text
0 a
1 b
2 c
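
The propagation order can be traced with a small stand-in class. This is a sketch under stated assumptions, not Transfer's implementation: ToySet is hypothetical, it supports only one parent domain, and it assumes that a set passes forwarded labels further up only if its own domain_forwarding flag is set.

```python
class ToySet:
    """Toy stand-in for a one-dimensional set with an optional parent (hypothetical)."""

    def __init__(self, name, domain=None, records=(), domain_forwarding=False):
        self.name = name
        self.domain = domain
        self.domain_forwarding = domain_forwarding
        self.records = []
        self.add(records)

    def add(self, labels):
        # Append unseen labels, preserving the order in which they arrive.
        for label in labels:
            if label not in self.records:
                self.records.append(label)
        # Forward the same labels to the parent set, which may forward them again.
        if self.domain_forwarding and self.domain is not None:
            self.domain.add(labels)


i = ToySet("i")
j = ToySet("j", i, ["z"], domain_forwarding=True)
k = ToySet("k", j, ["a", "b", "c"], domain_forwarding=True)
print(i.records, j.records, k.records)
```

Creating j pushes z into i first; creating k then pushes a, b and c into j and on into i, which reproduces the element order shown above.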

Domain Violations

Domain violations occur when domain labels appear in symbol data but they do not appear in the parent set which the symbol is defined over – attempting to execute a GAMS model when there are domain violations will lead to compilation errors. Domain violations are found dynamically with the <Symbol>.findDomainViolations() method.

Note
The findDomainViolations method can be computationally expensive – UELs in GAMS are case preserving but compared case-insensitively (just like symbol names); additionally, GAMS ignores all trailing white space in UELs (leading white space is considered significant). As a result, Transfer must lowercase all UELs and then strip any trailing white space before doing the set comparison to locate (and create) any DomainViolation objects. findDomainViolations should not be used in a loop (nor should any of its related methods: hasDomainViolations, countDomainViolations, getDomainViolations, or dropDomainViolations).

In the following example we intentionally create data with domain violations in the a parameter:

m = gt.Container()
i = gt.Set(m, "i", records=["a", "b", "c"])
a = gt.Parameter(m, "a", i, records=[("aa", 1), ("c", 2)])
In [1]: a.findDomainViolations()
Out[1]:
i_0 value
0 aa 1.0
In [2]: a.hasDomainViolations()
Out[2]: True
In [3]: a.countDomainViolations()
Out[3]: 1
In [4]: a.getDomainViolations()
Out[4]: [<src.gamstransfer.DomainViolation at 0x7fb6b83d9630>]
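
The comparison described in the note above can be sketched in plain Python for a one-dimensional symbol. The helpers normalize and find_domain_violations are hypothetical and only illustrate the lowercase-and-strip rule; they are not part of Transfer.

```python
def normalize(label):
    """Lowercase and strip trailing (not leading) white space, as described above."""
    return label.lower().rstrip()


def find_domain_violations(records, parent_labels):
    """Return the records whose first label is missing from the parent set."""
    allowed = {normalize(label) for label in parent_labels}
    return [rec for rec in records if normalize(rec[0]) not in allowed]


parent = ["a", "b", "c"]
records = [("aa", 1.0), ("c", 2.0), ("C ", 3.0)]

violations = find_domain_violations(records, parent)
print(violations)
```

Only the aa record is reported; the label "C " with a trailing blank normalizes to c and therefore does not count as a violation in this sketch.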

Dynamically locating domain violations allows Transfer to return a view of the underlying pandas dataframe with the problematic domain labels still intact – at this point the user is free to correct issues in the UELs with any of the *UELs methods or by simply dropping any domain violations from the dataframe completely (the dropDomainViolations method is a convenience function for this operation).

Attention
It is not possible to create a GDX file if symbols have domain violations.
Unused UELs will not result in domain violations.

Attempting to write this container to a GDX file will result in an exception.

m = gt.Container()
i = gt.Set(m, "i", records=["a", "b", "c"])
a = gt.Parameter(m, "a", i, records=[("aa", 1), ("c", 2)])
m.write("out.gdx")
Exception: Encountered data errors with symbol `a`. Possible causes are from duplicate records and/or domain violations.
Use 'hasDuplicateRecords', 'findDuplicateRecords', 'dropDuplicateRecords', and/or 'countDuplicateRecords' to find/resolve duplicate records.
Use 'hasDomainViolations', 'findDomainViolations', 'dropDomainViolations', and/or 'countDomainViolations' to find/resolve domain violations.
GDX file was not created successfully.

Duplicate Records

Duplicate records can easily appear in large datasets – locating and fixing these records is straightforward with Transfer. Transfer includes find*, has*, count* and drop* methods for duplicate records, just as it has for domain violations.

Note
The findDuplicateRecords method can be computationally expensive – UELs in GAMS are case preserving but compared case-insensitively (just like symbol names); additionally, GAMS ignores all trailing white space in UELs (leading white space is considered significant). As a result, Transfer must lowercase all UELs and then strip any trailing white space before doing the set comparison to locate duplicate records. findDuplicateRecords should not be used in a loop (nor should any of its related methods: hasDuplicateRecords, countDuplicateRecords, or dropDuplicateRecords).

Dynamically locating duplicate records allows Transfer to return a view of the underlying pandas dataframe with the problematic domain labels still intact – at this point the user is free to correct issues in the UELs with any of the *UELs methods or by simply dropping any duplicate records from the dataframe completely (the dropDuplicateRecords method is a convenience function for this operation).

m = gt.Container()
a = gt.Parameter(
m,
"a",
["*"],
records=[("i" + str(i), float(i)) for i in range(4)]
+ [("j" + str(i), i) for i in range(4)]
+ [("I" + str(i), i) for i in range(4)],
)
Note
The user can decide which duplicate records they would like to keep with keep="first" (default), keep="last", or keep=False (which returns all duplicate records).
In [1]: a.records
Out[1]:
uni_0 value
0 i0 0.0
1 i1 1.0
2 i2 2.0
3 i3 3.0
4 j0 0.0
5 j1 1.0
6 j2 2.0
7 j3 3.0
8 I0 0.0
9 I1 1.0
10 I2 2.0
11 I3 3.0
In [2]: a.findDuplicateRecords()
Out[2]:
uni_0 value
8 I0 0.0
9 I1 1.0
10 I2 2.0
11 I3 3.0
In [3]: a.findDuplicateRecords(keep="last")
Out[3]:
uni_0 value
0 i0 0.0
1 i1 1.0
2 i2 2.0
3 i3 3.0
In [4]: a.findDuplicateRecords(keep=False)
Out[4]:
uni_0 value
0 i0 0.0
1 i1 1.0
2 i2 2.0
3 i3 3.0
8 I0 0.0
9 I1 1.0
10 I2 2.0
11 I3 3.0
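
The effect of the keep argument can be reproduced with a short pure-Python sketch that works on the labels only. The helper find_duplicates is hypothetical; it returns the row positions that would be reported as duplicates after the labels are lowercased and stripped of trailing white space.

```python
def find_duplicates(labels, keep="first"):
    """Return row positions flagged as duplicates (illustrative only)."""
    groups = {}
    for pos, label in enumerate(labels):
        # Labels that differ only in case or trailing blanks fall into one group.
        groups.setdefault(label.lower().rstrip(), []).append(pos)

    flagged = []
    for positions in groups.values():
        if len(positions) < 2:
            continue  # unique label, nothing to report
        if keep == "first":
            flagged.extend(positions[1:])   # the first occurrence is kept
        elif keep == "last":
            flagged.extend(positions[:-1])  # the last occurrence is kept
        else:
            flagged.extend(positions)       # keep=False: report every occurrence
    return sorted(flagged)


labels = [f"i{n}" for n in range(4)] + [f"j{n}" for n in range(4)] + [f"I{n}" for n in range(4)]

first = find_duplicates(labels)
last = find_duplicates(labels, keep="last")
every = find_duplicates(labels, keep=False)
print(first, last, every)
```

The flagged positions correspond to the row indices in the three findDuplicateRecords outputs above.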
Attention
It is not possible to create a GDX file if symbols have duplicate records.

Attempting to write this container to a GDX file will result in an exception.

m = gt.Container()
a = gt.Parameter(
m,
"a",
["*"],
records=[("i" + str(i), float(i)) for i in range(4)]
+ [("j" + str(i), i) for i in range(4)]
+ [("I" + str(i), i) for i in range(4)],
)
m.write("out.gdx")
Exception: Encountered data errors with symbol `a`. Possible causes are from duplicate records and/or domain violations.
Use 'hasDuplicateRecords', 'findDuplicateRecords', 'dropDuplicateRecords', and/or 'countDuplicateRecords' to find/resolve duplicate records.
Use 'hasDomainViolations', 'findDomainViolations', 'dropDomainViolations', and/or 'countDomainViolations' to find/resolve domain violations.
GDX file was not created successfully.

Pivoting Data

It might be convenient to pivot data into a multi-dimensional data structure rather than maintaining the flat structure in records. A convenience method called pivot is provided for all symbol classes and will return a pivoted pandas.DataFrame. Pivoting is only available for symbols with more than one dimension.

Example #1 - Pivot a 2D Set
import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=[f"i{i}" for i in range(5)])
j = gt.Set(m, "j", records=[f"j{i}" for i in range(5)])
ij = gt.Set(m, "ij", [i, j])
ij.generateRecords(densities=0.25, seed=123)
In [1]: ij.pivot()
Out[1]:
j0 j1 j3 j4
i0 True True False False
i1 True False False False
i2 False False True True
i4 False True False False
Example #2 - Pivot a 3D Set
import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=[f"i{i}" for i in range(5)])
j = gt.Set(m, "j", records=[f"j{i}" for i in range(5)])
iji = gt.Set(m, "iji", [i, j, i])
iji.generateRecords(densities=0.25, seed=123)
In [1]: iji.pivot()
Out[1]:
i0 i1 i2 i3 i4
i0 j0 False True True False False
j1 True False False False False
j3 False False False True False
j4 False False True False False
i1 j0 True True False True False
j1 True True False False True
j2 False True False False False
j4 False False False True False
i2 j0 True False False False False
j1 False False True False True
j3 True False False False False
i3 j2 False True True False True
j3 False True False False False
j4 True False True True True
i4 j0 False True False True False
j1 False False False True False
j3 False False False True False
j4 False False False True True
In [2]: iji.pivot(fill_value="")
Out[2]:
i0 i1 i2 i3 i4
i0 j0 True True
j1 True
j3 True
j4 True
i1 j0 True True True
j1 True True True
j2 True
j4 True
i2 j0 True
j1 True True
j3 True
i3 j2 True True True
j3 True
j4 True True True True
i4 j0 True True
j1 True
j3 True
j4 True True
Note
When pivoting symbols with >2 dimensions, all dimensions except the last will be set to the index and the last dimension will be pivoted into the columns. This behavior can be customized with the index and columns arguments.
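
To make the default layout concrete, the following pure-Python sketch pivots a flat list of records into a nested dictionary. It is an illustration only: the helper pivot_records is hypothetical, the record values are made up, and the real pivot method returns a pandas.DataFrame.

```python
def pivot_records(records, fill_value=0.0):
    """Pivot (label_0, ..., label_n, value) tuples: all but the last label index the rows."""
    rows, cols, cells = [], [], {}
    for *domain, value in records:
        row_key, col_key = tuple(domain[:-1]), domain[-1]
        if row_key not in rows:
            rows.append(row_key)
        if col_key not in cols:
            cols.append(col_key)
        cells[(row_key, col_key)] = value
    # Combinations without a record receive the fill value.
    return {r: {c: cells.get((r, c), fill_value) for c in cols} for r in rows}


flat = [
    ("i0", "j1", "i1", 0.5),
    ("i0", "j2", "i1", 0.25),
    ("i0", "j2", "i3", 0.75),
]

table = pivot_records(flat)
print(table)
```

Here the first two labels form the row key and the third label becomes the column, so the missing (i0, j1, i3) combination is filled with 0.0.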
Example #3 - Pivot a 3D Parameter w/ a fill_value
import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=[f"i{i}" for i in range(5)])
j = gt.Set(m, "j", records=[f"j{i}" for i in range(5)])
iji = gt.Parameter(m, "iji", [i, j, i])
iji.generateRecords(densities=0.05, seed=123)
In [1]: iji.pivot(fill_value="NONE")
Out[1]:
i1 i2 i3 i4
i0 j1 0.682352 NONE NONE NONE
j2 0.053821 NONE 0.22036 NONE
i1 j1 NONE NONE NONE 0.184372
i2 j0 NONE 0.175906 NONE NONE
i3 j4 NONE NONE 0.812095 NONE
In [2]: iji.pivot(fill_value=0)
Out[2]:
i1 i2 i3 i4
i0 j1 0.682352 0.000000 0.000000 0.000000
j2 0.053821 0.000000 0.220360 0.000000
i1 j1 0.000000 0.000000 0.000000 0.184372
i2 j0 0.000000 0.175906 0.000000 0.000000
i3 j4 0.000000 0.000000 0.812095 0.000000
In [3]: iji.pivot(fill_value=gt.SpecialValues.EPS)
Out[3]:
i1 i2 i3 i4
i0 j1 0.682352 -0.000000 -0.000000 -0.000000
j2 0.053821 -0.000000 0.220360 -0.000000
i1 j1 -0.000000 -0.000000 -0.000000 0.184372
i2 j0 -0.000000 0.175906 -0.000000 -0.000000
i3 j4 -0.000000 -0.000000 0.812095 -0.000000
Example #4 - Pivot (only the marginal values) of a 3D Variable
import gams.transfer as gt
import numpy as np
# NOTE: custom functions should expose a 'seed' argument
def marginal_values(seed, size):
    rng = np.random.default_rng(seed)
    return rng.normal(5, 1.2, size=size)
m = gt.Container()
i = gt.Set(m, "i", records=[f"i{i}" for i in range(5)])
j = gt.Set(m, "j", records=[f"j{i}" for i in range(5)])
iji = gt.Variable(m, "iji", "free", [i, j, i])
iji.generateRecords(densities=0.05, func={"marginal": marginal_values}, seed=123)
In [1]: iji.records
Out[1]:
i_0 j_1 i_2 level marginal lower upper scale
0 i0 j1 i1 0.0 3.813054 -inf inf 1.0
1 i0 j2 i1 0.0 4.558656 -inf inf 1.0
2 i0 j2 i3 0.0 6.545510 -inf inf 1.0
3 i1 j1 i4 0.0 5.232769 -inf inf 1.0
4 i2 j0 i2 0.0 6.104277 -inf inf 1.0
5 i3 j4 i3 0.0 5.692525 -inf inf 1.0
In [2]: iji.pivot(value="marginal")
Out[2]:
i1 i3 i4 i2
i0 j1 3.813054 0.000000 0.000000 0.000000
j2 4.558656 6.545510 0.000000 0.000000
i1 j1 0.000000 0.000000 5.232769 0.000000
i2 j0 0.000000 0.000000 0.000000 6.104277
i3 j4 0.000000 5.692525 0.000000 0.000000

Describing Data

The methods describeSets, describeParameters, describeVariables, and describeEquations allow the user to get a summary view of key data statistics. The returned DataFrame aggregates the output for a number of other methods (depending on symbol type). A description of each Container method is provided in the following subsections:

describeSets

Argument Type Description Required Default
symbols list, str, NoneType A list of sets in the Container to include in the output. describeSets will include aliases if they are explicitly passed by the user. No None (if None specified, will assume all sets – not aliases)

Returns: pandas.DataFrame

The following table includes a short description of the column headings in the return.

Property / Statistic Description
name name of the symbol
is_singleton bool if the set/alias is a singleton set (or an alias of a singleton set)
alias_with [OPTIONAL, only present if the user passes an alias name as part of symbols] name of the parent set (for alias only), None otherwise
domain domain labels for the symbol
domain_type none, relaxed or regular depending on the symbol state
dim dimension
num_recs number of records in the symbol
cardinality Cartesian product of the domain information
sparsity 1 - num_recs/cardinality
Example #1
import gams.transfer as gt
m = gt.Container("trnsport.gdx")
In [1]: m.describeSets()
Out[1]:
name is_singleton domain domain_type dim num_recs cardinality sparsity
0 i False [*] none 1 2 None None
1 j False [*] none 1 3 None None
Example #2 – with aliases
import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=["i" + str(i) for i in range(1, 10)])
j = gt.Set(m, "j", records=["j" + str(i) for i in range(1, 10)])
ip = gt.Alias(m, "ip", i)
jp = gt.Alias(m, "jp", j)
In [1]: m.describeSets()
Out[1]:
name is_singleton domain domain_type dim num_recs cardinality sparsity
0 i False [*] none 1 9 None None
1 j False [*] none 1 9 None None
In [2]: m.describeSets(m.listSets() + m.listAliases())
Out[2]:
name is_singleton is_alias alias_with domain domain_type dim num_recs cardinality sparsity
0 i False False None [*] none 1 9 None None
1 ip False True i [*] none 1 9 None None
2 j False False None [*] none 1 9 None None
3 jp False True j [*] none 1 9 None None

describeParameters

Argument Type Description Required Default
symbols list, str, NoneType A list of parameters in the Container to include in the output No None (if None specified, will assume all parameters)

Returns: pandas.DataFrame

The following table includes a short description of the column headings in the return.

Property / Statistic Description
name name of the symbol
is_scalar bool if the symbol is a scalar (i.e., dimension = 0)
domain domain labels for the symbol
domain_type none, relaxed or regular depending on the symbol state
dim dimension
num_recs number of records in the symbol
min_value min value in data
mean_value mean value in data
max_value max value in data
where_min domain of min value (if multiple, returns only first occurrence)
where_max domain of max value (if multiple, returns only first occurrence)
count_eps number of SpecialValues.EPS in data
count_na number of SpecialValues.NA in data
count_undef number of SpecialValues.UNDEF in data
cardinality Cartesian product of the domain information
sparsity 1 - num_recs/cardinality
Example
import gams.transfer as gt
m = gt.Container("trnsport.gdx")
In [1]: m.describeParameters()
Out[1]:
name is_scalar domain domain_type dim num_recs min_value mean_value max_value where_min where_max count_eps count_na count_undef cardinality sparsity
0 a False [i] regular 1 2 350.000 475.000 600.000 [seattle] [san-diego] 0 0 0 2 0.0
1 b False [j] regular 1 3 275.000 300.000 325.000 [topeka] [new-york] 0 0 0 3 0.0
2 c False [i, j] regular 2 6 0.126 0.176 0.225 [san-diego, topeka] [seattle, new-york] 0 0 0 6 0.0
3 d False [i, j] regular 2 6 1.400 1.950 2.500 [san-diego, topeka] [seattle, new-york] 0 0 0 6 0.0
4 f True [] none 0 1 90.000 90.000 90.000 None None 0 0 0 None None
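
The cardinality and sparsity columns can be checked by hand. The following is a minimal sketch in plain Python, assuming the domain sizes are already known; the helper name sparsity is hypothetical and not part of Transfer.

```python
from math import prod


def sparsity(num_recs, domain_sizes):
    """Return (cardinality, sparsity) for a symbol.

    domain_sizes holds the number of elements in each domain set;
    an empty list stands for a scalar, which has no cardinality.
    """
    if not domain_sizes:
        return None, None
    cardinality = prod(domain_sizes)
    return cardinality, 1 - num_recs / cardinality


# Parameter c(i, j) from trnsport: 2 elements in i, 3 elements in j, 6 records
card_c, sparsity_c = sparsity(6, [2, 3])
print(card_c, sparsity_c)  # 6 0.0
```

A fully populated symbol therefore has a sparsity of 0.0, and a scalar such as f reports None for both columns.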

describeVariables

Argument Type Description Required Default
symbols list, str, NoneType A list of variables in the Container to include in the output No None (if None specified, will assume all variables)

Returns: pandas.DataFrame

The following table includes a short description of the column headings in the return.

Property / Statistic Description
name name of the symbol
type type of variable (i.e., binary, integer, positive, negative, free, sos1, sos2, semicont, semiint)
domain domain labels for the symbol
domain_type none, relaxed or regular depending on the symbol state
dim dimension
num_recs number of records in the symbol
cardinality Cartesian product of the domain information
sparsity 1 - num_recs/cardinality
min_level min value in the level
mean_level mean value in the level
max_level max value in the level
where_max_abs_level domain of max(abs(level)) in data
count_eps_level number of SpecialValues.EPS in level
min_marginal min value in the marginal
mean_marginal mean value in the marginal
max_marginal max value in the marginal
where_max_abs_marginal domain of max(abs(marginal)) in data
count_eps_marginal number of SpecialValues.EPS in marginal
Example
import gams.transfer as gt
m = gt.Container("trnsport.gdx")
In [1]: m.describeVariables()
Out[1]:
name type domain domain_type dim num_recs cardinality sparsity min_level mean_level max_level where_max_abs_level count_eps_level min_marginal mean_marginal max_marginal where_max_abs_marginal count_eps_marginal
0 x positive [i, j] regular 2 6 6 0.0 0.000 150.000 300.000 [seattle, chicago] 0 0.0 0.008 0.036 [seattle, topeka] 0
1 z free [] none 0 1 None None 153.675 153.675 153.675 None 0 0.0 0.000 0.000 None 0

describeEquations

Argument Type Description Required Default
symbols list, str, NoneType A list of equations in the Container to include in the output No None (if None specified, will assume all equations)

Returns: pandas.DataFrame

The following table includes a short description of the column headings in the return.

Property / Statistic Description
name name of the symbol
type type of equation (e.g., eq, geq, leq)
domain domain labels for the symbol
domain_type none, relaxed or regular depending on the symbol state
dim dimension
num_recs number of records in the symbol
cardinality Cartesian product of the domain information
sparsity 1 - num_recs/cardinality
min_level min value in the level
mean_level mean value in the level
max_level max value in the level
where_max_abs_level domain of max(abs(level)) in data
count_eps_level number of SpecialValues.EPS in level
min_marginal min value in the marginal
mean_marginal mean value in the marginal
max_marginal max value in the marginal
where_max_abs_marginal domain of max(abs(marginal)) in data
count_eps_marginal number of SpecialValues.EPS in marginal
Example
import gams.transfer as gt
m = gt.Container("trnsport.gdx")
In [1]: m.describeEquations()
Out[1]:
name type domain domain_type dim num_recs cardinality sparsity min_level mean_level max_level where_max_abs_level count_eps_level min_marginal mean_marginal max_marginal where_max_abs_marginal count_eps_marginal
0 cost eq [] none 0 1 None None -0.0 0.0 -0.0 None 1 1.000 1.000 1.000 None 0
1 demand geq [j] regular 1 3 3 0.0 275.0 300.0 325.0 [new-york] 0 0.126 0.168 0.225 [new-york] 0
2 supply leq [i] regular 1 2 2 0.0 350.0 450.0 550.0 [san-diego] 0 0.000 0.000 0.000 [seattle] 1

describeAliases

Argument Type Description Required Default
symbols list, str, NoneType A list of alias (only) symbols in the Container to include in the output No None (if None specified, will assume all aliases – not sets)

Returns: pandas.DataFrame

The following table includes a short description of the column headings in the return. All data is referenced from the parent set that the alias is created from.

Property / Statistic Description
name name of the symbol
is_singleton bool if the set/alias is a singleton set (or an alias of a singleton set)
alias_with name of the parent set (for alias only), None otherwise
domain domain labels for the symbol
domain_type none, relaxed or regular depending on the symbol state
dim dimension
num_recs number of records in the symbol
cardinality Cartesian product of the domain information
sparsity 1 - num_recs/cardinality
Example
import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=["i" + str(i) for i in range(5)])
j = gt.Set(m, "j", records=["j" + str(j) for j in range(10)])
ip = gt.Alias(m, "ip", i)
ipp = gt.Alias(m, "ipp", ip)
jp = gt.Alias(m, "jp", j)
In [1]: m.describeAliases()
Out[1]:
name alias_with is_singleton domain domain_type dim num_recs cardinality sparsity
0 ip i False [*] none 1 5 None None
1 ipp i False [*] none 1 5 None None
2 jp j False [*] none 1 10 None None

Matrix Generation

Transfer stores data in a "flat" format, that is, one record entry per DataFrame row. However, it is often necessary to convert this data format into a matrix format – Transfer enables users to do this with relative ease using the toDense and the toSparseCoo symbol methods. The toDense method will return a dense N-dimensional numpy array with each dimension corresponding to the GAMS symbol dimension; it is possible to output an array up to 20 dimensions (a GAMS limit). The toSparseCoo method will return the data in a sparse scipy COOrdinate format, which can then be efficiently converted into other sparse matrix formats.

Attention
Neither the toDense nor the toSparseCoo method transforms the underlying DataFrame in any way; they only return the transformed data.
Note
toSparseCoo will only convert 2-dimensional data to the scipy COOrdinate format. A user interested in sparse data for an N-dimensional symbol will need to decide how to reshape the dense array in order to generate the 2D sparse format.
Attention
In order to use the toSparseCoo method the user will need to install the scipy package. Scipy is not provided with GMSPython.

Both the toDense and toSparseCoo methods leverage the indexing that comes along with using categorical data types to store domain information. This means that linking symbols together (by passing symbol objects as domain information) impacts the size of the matrix. This is best demonstrated by a few examples.

Example (1D data w/o domain linking (i.e., a relaxed domain))
import gams.transfer as gt
m = gt.Container()
a = gt.Parameter(m, "a", "i", records=[("a", 1), ("c", 3)])
In [1]: a.records
Out[1]:
i_0 value
0 a 1.0
1 c 3.0
In [2]: a.toDense()
Out[2]: array([1., 3.])
In [3]: a.toSparseCoo()
Out[3]:
<1x2 sparse matrix of type '<class 'numpy.float64'>'
with 2 stored elements in COOrdinate format>

Note that the parameter a is not linked to another symbol, so when converting to a matrix, the indexing is referenced to the data structure in a.records. Defining a sparse parameter a over a set i allows us to extract information from the i domain and construct a very different dense matrix, as the following example shows:

Example (1D data w/ domain linking (i.e., a regular domain))
import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=["a", "b", "c", "d"])
a = gt.Parameter(m, "a", i, records=[("a", 1), ("c", 3)])
In [1]: i.records
Out[1]:
uni_0 element_text
0 a
1 b
2 c
3 d
In [2]: a.records
Out[2]:
i_0 value
0 a 1.0
1 c 3.0
In [3]: a.toDense()
Out[3]: array([1., 0., 3., 0.])
In [4]: a.toSparseCoo()
Out[4]:
<1x4 sparse matrix of type '<class 'numpy.float64'>'
with 2 stored elements in COOrdinate format>
Example (2D data w/ domain linking)
import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=["a", "b", "c", "d"])
a = gt.Parameter(m, "a", [i, i], records=[("a", "a", 1), ("c", "c", 3)])
In [1]: i.records
Out[1]:
uni_0 element_text
0 a
1 b
2 c
3 d
In [2]: a.records
Out[2]:
i_0 i_1 value
0 a a 1.0
1 c c 3.0
In [3]: a.toDense()
Out[3]:
array([[1., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 3., 0.],
[0., 0., 0., 0.]])
In [4]: a.toSparseCoo()
Out[4]:
<4x4 sparse matrix of type '<class 'numpy.float64'>'
with 2 stored elements in COOrdinate format>
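
The effect of domain linking on the matrix size can be reproduced outside of Transfer. The sketch below uses only pandas and numpy and assumes that the categories of the domain column are taken from the linked set; it illustrates the idea and is not the code that toDense runs internally.

```python
import numpy as np
import pandas as pd

# Elements of the domain set i and the sparse records of a parameter over i
i_elements = ["a", "b", "c", "d"]
records = pd.DataFrame({"i_0": ["a", "c"], "value": [1.0, 3.0]})

# Linking to the domain: categories come from the set, not from the records
records["i_0"] = pd.Categorical(records["i_0"], categories=i_elements, ordered=True)

# The category codes are the positions of each record in the dense array
dense = np.zeros(len(i_elements))
dense[records["i_0"].cat.codes.to_numpy()] = records["value"].to_numpy()
print(dense)  # [1. 0. 3. 0.]
```

If the categories were built from the records alone, only two categories would exist and the dense array would have length 2, which matches the relaxed-domain example.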

The Universe Set

A Unique Element List (UEL) (aka the "universe" or "universe set") is an (i,s) pair where i is an identification number for a string s. GAMS uses UELs to efficiently store domain entries of a record by storing the UEL ID i of a domain entry instead of the actual string s. This avoids storing the same string multiple times. The concept of UELs also exists in Python/Pandas and is called a "categorical array". Transfer leverages these types in order to efficiently store strings and enable domain checking within the Python environment.

Each domain column in a DataFrame can be assigned a unique categorical type; the effect is that each symbol maintains its own list of UELs per dimension. It is possible to convert a categorical column to its ID number representation by using the categorical accessor x.records[<domain_column_label>].cat.codes. This type of data manipulation is not necessary within Transfer, but it can be handy when debugging data.

Pandas offers the possibility to create categorical column types that are ordered or not; Transfer relies exclusively on ordered categorical data types (in order for a symbol to be valid it must have only ordered categories). By using ordered categories, Transfer will order the UEL such that elements appear in the order in which they appeared in the data (which is how GAMS defines the UEL). Transfer allows the user to reorder the UELs with the uel_priority argument in the .write() method.

Transfer does not actually keep track of the UEL separately from other symbols in the Container; it is created internally by the .write() method and is based on the order in which data is added to the container. The user can access the current state of the UEL with the .getUELs() container method. For example, we create a two-dimensional set:

import gams.transfer as gt
m = gt.Container()
j = gt.Set(m, "j", ["*", "*"], records=[("i" + str(n), "j" + str(n)) for n in range(2)])
In [1]: j.records
Out[1]:
uni_0 uni_1 element_text
0 i0 j0
1 i1 j1
In [2]: m.getUELs()
Out[2]: ['i0', 'i1', 'j0', 'j1']
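
The ordering in this output can be mimicked in plain Python. The sketch below assumes that labels are collected dimension by dimension and that only the first appearance of each label is kept; this reproduces the result shown above but is not claimed to be how Transfer builds the list internally.

```python
# Records of the 2D set j from the example above, one tuple per row
rows = [("i0", "j0"), ("i1", "j1")]

# zip(*rows) turns the rows into one tuple of labels per dimension
labels_by_dimension = zip(*rows)

# dict.fromkeys preserves insertion order and drops repeated labels
universe = list(
    dict.fromkeys(label for column in labels_by_dimension for label in column)
)
print(universe)  # ['i0', 'i1', 'j0', 'j1']
```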

Pandas also includes a number of methods that allow categories to be renamed, appended, etc. These methods may be useful for advanced users, but most users will probably find that modifying the original data structures and resetting the symbol records provides a simpler solution. The design of Transfer should enable the user to quickly move data back and forth, without worrying about the deeper mechanics of categorical data.

Customize the Universe Set

The concept of a universe set is fundamental to GAMS and has consequences in many areas of GAMS programming including the order of loop execution. For example:

set final_model_year / 2030 /;
set t "all model years" / 2022*2030 /;

singleton set my(t) "model solve year";


loop(t,
  my(t) = yes;
  display my;
  );

The loop will execute model solve year 2030 first because the UEL 2030 was defined in the set final_model_year before it was used again in the definition of set t. This could lead to some surprising behavior if model time periods are linked together. To combat this behavior, many GAMS users would create a dummy set (perhaps the first line of their model file) that contained all the UELs that had a significant order. Transfer allows for full control (renaming as well as ordering) over the universe set through the *UELs methods, briefly described here:

Quick summary table of UELs functions

Method Brief Description
getUELs Gets the UELs over either a symbol dimension, the entire symbol or the entire container. Unused UELs do not show up in symbol data but will show up in the GAMS UEL list.
addUELs Adds UELs to a symbol dimension(s). This function does not have a container level implementation.
removeUELs Removes UELs from a symbol dimension, the entire symbol, the entire container (or just a subset of symbols). If a used UEL is removed the DataFrame record will show a NaN.
renameUELs Renames UELs in a symbol dimension, the entire symbol, the entire container (or just a subset of symbols). Very handy for harmonizing UEL labeling of data that might have originated from different sources.
reorderUELs Reorders UELs in a symbol dimension(s). This function does not have a container level implementation.
setUELs Sets UELs for a symbol dimension(s). Equivalent results could be obtained with a combination of renameUELs and reorderUELs, but this one call may have some performance advantage.

These tools are extremely useful when data is arriving at a model from a variety of data sources. We will describe each of these functions in detail and provide examples in the following sections.
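
The loop-order effect described at the start of this section follows from the first-appearance rule for UELs. The following plain-Python sketch models that rule with ordinary lists; it is an illustration of the ordering logic only and does not call GAMS or Transfer.

```python
# Labels in the order in which GAMS first encounters them
final_model_year = ["2030"]
t = [str(year) for year in range(2022, 2031)]

# The universe keeps only the first appearance of each label
universe = list(dict.fromkeys(final_model_year + t))

# A loop over t visits its members in universe order, not declaration order
members_of_t = set(t)
loop_order = [label for label in universe if label in members_of_t]
print(loop_order[0])  # 2030
```

Declaring t before final_model_year (or controlling the UEL order explicitly) puts 2030 back at the end of the loop.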

Attention
GAMS is insensitive to trailing whitespace; the *UELs methods will automatically trim any trailing whitespace when creating the new UELs.

getUELs Examples

getUELs is a method of all GAMS symbol classes as well as the Container class. This allows the user to retrieve (ordered) UELs from the entire container or just a specific symbol dimension. For example:

m = gt.Container()
i = gt.Set(m, "i", records=["i1", "i2", "i3"])
j = gt.Set(m, "j", i, records=["j1", "j2", "j3"])
a = gt.Parameter(m, "a", [i, j], records=[(f"i{i}", f"j{i}", i) for i in range(4)])
In [1]: i.getUELs()
Out[1]: ['i1', 'i2', 'i3']
In [2]: m.getUELs()
Out[2]: ['i1', 'i2', 'i3', 'j1', 'j2', 'j3', 'i0', 'j0']
In [3]: m.getUELs("j")
Out[3]: ['j1', 'j2', 'j3']

addUELs Examples

addUELs is a method of all GAMS symbol classes. This method allows the user to add new UEL labels to a specific dimension of a symbol – the user can add UELs that do not exist in the symbol records. For example:

m = gt.Container()
i = gt.Set(m, "i", records=["i1", "i2", "i3"])
j = gt.Set(m, "j", i, records=["j1", "j2", "j3"])
a = gt.Parameter(m, "a", [i, j], records=[(f"i{i}", f"j{i}", i) for i in range(1,4)])
i.addUELs("ham")
a.addUELs("and", 0)
a.addUELs("cheese", 1)
In [1]: i.getUELs()
Out[1]: ['i1', 'i2', 'i3', 'ham']
In [2]: a.getUELs()
Out[2]: ['i1', 'i2', 'i3', 'and', 'j1', 'j2', 'j3', 'cheese']

In this example we have added three new (unused) UELs: ham, and, cheese. These three UELs will now appear in the GAMS universe set (accessible with m.getUELs()). The addition of unused UELs does not impact the validity of the symbols (i.e., unused UELs will not trigger domain violations).

removeUELs Examples

removeUELs is a method of all GAMS symbol classes as well as the Container class. As a result, this method allows the user to clean up unwanted or simply unused UELs in a symbol dimension(s), over several symbols, or over the entire container. The previous example added three unused UELs (ham, and, cheese), but now we want to remove these UELs in order to clean up the GAMS universe set. We can accomplish this several ways:

m = gt.Container()
i = gt.Set(m, "i", records=["i1", "i2", "i3"])
j = gt.Set(m, "j", i, records=["j1", "j2", "j3"])
a = gt.Parameter(m, "a", [i, j], records=[(f"i{i}", f"j{i}", i) for i in range(1,4)])
i.addUELs("ham")
a.addUELs("and", 0)
a.addUELs("cheese", 1)
# remove symbol UELs explicitly by dimension
i.removeUELs("ham", 0)
a.removeUELs("and", 0)
a.removeUELs("cheese", 1)
# remove symbol UELs for the entire symbol
i.removeUELs("ham")
a.removeUELs(["and", "cheese"])
# remove ONLY unused UELs from each symbol, independently
i.removeUELs()
a.removeUELs()
# remove ONLY unused UELs from the entire container (all symbols)
m.removeUELs()

In all cases the resulting universe set will be:

In [1]: m.getUELs()
Out[1]: ['i1', 'i2', 'i3', 'j1', 'j2', 'j3']

If a user removes a UEL that appears in data, that data will be lost permanently. The domain label will be transformed into a NaN, as seen in this example:

m = gt.Container()
i = gt.Set(m, "i", records=["i1", "i2", "i3"])
j = gt.Set(m, "j", i, records=["j1", "j2", "j3"])
a = gt.Parameter(m, "a", [i, j], records=[(f"i{i}", f"j{i}", i) for i in range(1,4)])
m.removeUELs("i1")
In [1]: i.records
Out[1]:
uni_0 element_text
0 NaN
1 i2
2 i3
In [2]: a.records
Out[2]:
i_0 j_1 value
0 NaN j1 1.0
1 i2 j2 2.0
2 i3 j3 3.0
Attention
A container cannot be written if there are NaN entries in any of the domain columns (in any symbol) – an Exception is raised if there are missing domain labels.
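
Before writing, it can be useful to check for missing domain labels directly on the DataFrame. The sketch below uses only pandas; the helper name missing_domain_labels is hypothetical and not part of Transfer, and the categorical column is built by hand to mimic the effect of removing a used UEL.

```python
import pandas as pd


def missing_domain_labels(records, domain_columns):
    """Return the domain columns that contain at least one missing label."""
    return [col for col in domain_columns if records[col].isna().any()]


# Dropping a category that is in use turns the affected rows into NaN
col = pd.Series(["i1", "i2", "i3"], dtype="category").cat.remove_categories(["i1"])
records = pd.DataFrame({"i_0": col, "value": [1.0, 2.0, 3.0]})

bad_columns = missing_domain_labels(records, ["i_0"])
print(bad_columns)  # ['i_0']
```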

renameUELs Examples

renameUELs is a method of all GAMS symbol classes as well as the Container class. This method allows the user to rename UELs in a symbol dimension(s), over several symbols, or over the entire container. This particular method is very handy when attempting to harmonize labeling schemes between data structures that originated from different sources. For example:

m = gt.Container()
a = gt.Parameter(
m,
"a",
["*", "*"],
records=[("WI", "IL", 10), ("IL", "IN", 12.5), ("WI", "IN", 8.7)],
description="shipment quantities",
)
b = gt.Parameter(
m,
"b",
["*"],
records=[("wisconsin", 1.2), ("illinois", 1.7), ("indiana", 1.2)],
description="multipliers",
)

...results in the following records:

In [1]: a.records
Out[1]:
uni_0 uni_1 value
0 WI IL 10.0
1 IL IN 12.5
2 WI IN 8.7
In [2]: b.records
Out[2]:
uni_0 value
0 wisconsin 1.2
1 illinois 1.7
2 indiana 1.2

However, two different data sources were used to generate the parameters a and b – one data source used the uppercase postal abbreviation of the state name and the other source used a lowercase full state name as the unique identifier. With the following syntax the user would be able to harmonize to a mixed case postal code labeling scheme (without losing any of the original UEL ordering).

m.renameUELs(
{
"WI": "Wi",
"IL": "Il",
"IN": "In",
"wisconsin": "Wi",
"illinois": "Il",
"indiana": "In",
}
)

...results in the following records (and the universe set):

In [1]: a.records
Out[1]:
uni_0 uni_1 value
0 Wi Il 10.0
1 Il In 12.5
2 Wi In 8.7
In [2]: b.records
Out[2]:
uni_0 value
0 Wi 1.2
1 Il 1.7
2 In 1.2

The universe set will now be:

In [1]: m.getUELs()
Out[1]: ['Wi', 'Il', 'In']

It is possible that some data needs to be cleaned and multiple UELs need to be mapped to a single label (within a single dimension). This is not allowed under default behavior because Transfer assumes that the provided UELs are truly unique (logically and lexicographically) – however, it might be necessary to recreate the underlying categorical object to combine n (previously unique) UELs into one to establish the necessary logical set links. For example:

m = gt.Container()
a = gt.Parameter(
m,
"a",
["*", "*"],
records=[("WISCONSIN", "iowa", 10), ("WI", "illinois", 12)],
)
In [1]: a.records
Out[1]:
uni_0 uni_1 value
0 WISCONSIN iowa 10.0
1 WI illinois 12.0

The records are unique for a, but logically, there might be a need to rename WI to WISCONSIN.

In [1]: a.renameUELs({"WI": "WISCONSIN"})
Out[1]: Exception: Could not rename UELs (categories) in `a` dimension `0`. Reason: Categorical categories must be unique

In order to achieve the desired behavior it is necessary to pass allow_merge=True to renameUELs:

In [1]: a.renameUELs({"WI": "WISCONSIN"}, allow_merge=True)
In [2]: a.records
Out[2]:
uni_0 uni_1 value
0 WISCONSIN iowa 10.0
1 WISCONSIN illinois 12.0
In [3]: a.getUELs()
Out[3]: ['WISCONSIN', 'iowa', 'illinois']
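
Conceptually, a merging rename maps every label through the rename dictionary and then collapses duplicates. The plain-Python sketch below shows that idea for the labels of dimension 0; merge_rename is a hypothetical helper and says nothing about how allow_merge=True is implemented inside Transfer.

```python
def merge_rename(uels, mapping):
    """Rename labels and collapse duplicates, keeping first-appearance order."""
    renamed = (mapping.get(label, label) for label in uels)
    return list(dict.fromkeys(renamed))


# UELs of dimension 0 before the rename
dim0_uels = ["WISCONSIN", "WI"]
merged = merge_rename(dim0_uels, {"WI": "WISCONSIN"})
print(merged)  # ['WISCONSIN']
```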

reorderUELs Examples

reorderUELs is a method of all GAMS symbol classes. This method allows the user to reorder UELs of a specific symbol dimension – reorderUELs will not allow any new UELs to be created, nor can any be removed. For example:

m = gt.Container()
i = gt.Set(m, "i", records=["i1", "i2", "i3"])
j = gt.Set(m, "j", i, records=["j1", "j2", "j3"])
a = gt.Parameter(m, "a", [i, j], records=[(f"i{i}", f"j{i}", i) for i in range(1,4)])
In [1]: i.getUELs()
Out[1]: ['i1', 'i2', 'i3']
In [2]: m.getUELs()
Out[2]: ['i1', 'i2', 'i3', 'j1', 'j2', 'j3']

But perhaps we want to reorder the UELs i1, i2, i3 to i3, i2, i1.

In [1]: i.reorderUELs(['i3', 'i2', 'i1'])
In [2]: i.getUELs()
Out[2]: ['i3', 'i2', 'i1']
In [3]: i.records
Out[3]:
uni_0 element_text
0 i1
1 i2
2 i3
Note
This example does not change the indexing scheme of the Pandas DataFrame at all, it only changes the underlying integer numbering scheme for the categories. We can see this by looking at the Pandas codes:
In [1]: i.records["uni_0"].cat.codes
Out[1]:
0 2
1 1
2 0
dtype: int8
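
The same behavior can be observed with a bare pandas categorical, independent of Transfer. This sketch uses the pandas reorder_categories accessor method to show that reordering categories changes the integer codes while leaving the visible labels and the row order untouched.

```python
import pandas as pd

labels = pd.Series(
    pd.Categorical(["i1", "i2", "i3"], categories=["i1", "i2", "i3"], ordered=True)
)
codes_before = labels.cat.codes.tolist()

# Reverse the category order; the row values themselves are not touched
reordered = labels.cat.reorder_categories(["i3", "i2", "i1"], ordered=True)
codes_after = reordered.cat.codes.tolist()

print(reordered.tolist())         # ['i1', 'i2', 'i3']
print(codes_before, codes_after)  # [0, 1, 2] [2, 1, 0]
```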

setUELs Examples

setUELs is a method of all GAMS symbol classes. This method allows the user to create new UELs, rename UELs, and reorder UELs all in one method. For example:

m = gt.Container()
i = gt.Set(m, "i", records=["i1", "i2", "i3"])

A user could accomplish a UEL reorder operation with setUELs:

In [1]: i.setUELs(["i3", "i2", "i1"])
In [2]: i.getUELs()
Out[2]: ['i3', 'i2', 'i1']
In [3]: i.records
Out[3]:
uni_0 element_text
0 i1
1 i2
2 i3

A user could accomplish a UEL reorder + add UELs operation with setUELs:

In [1]: i.setUELs(["i3", "i2", "i1", "j1", "j2"])
In [2]: i.getUELs()
Out[2]: ['i3', 'i2', 'i1', 'j1', 'j2']
In [3]: i.records
Out[3]:
uni_0 element_text
0 i1
1 i2
2 i3
In [4]: i.records["uni_0"].cat.codes
Out[4]:
0 2
1 1
2 0
dtype: int8

A user could accomplish a UEL reorder + add + rename with setUELs:

In [1]: i.setUELs(["j3", "j2", "j1", "ham", "cheese"], rename=True)
In [2]: i.getUELs()
Out[2]: ['j3', 'j2', 'j1', 'ham', 'cheese']
In [3]: i.records
Out[3]:
uni_0 element_text
0 j3
1 j2
2 j1
In [4]: i.records["uni_0"].cat.codes
Out[4]:
0 0
1 1
2 2
dtype: int8
Note
This example does not change the indexing scheme of the Pandas DataFrame at all, but the rename=True flag means that the records will get updated just as if a renameUELs call had been made.

If a user wanted to set new UELs on top of this data, without renaming, they would need to be careful to include the current UELs in the UELs being set. It is possible to lose these labels if they are not included (which will prevent the data from being written to GDX/GMD).

m = gt.Container()
i = gt.Set(m, "i", records=["i1", "i2", "i3"])
i.setUELs(["j1", "i2", "j3", "ham", "cheese"])
In [1]: i.getUELs()
Out[1]: ['j1', 'i2', 'j3', 'ham', 'cheese']
In [2]: i.records
Out[2]:
uni_0 element_text
0 NaN
1 i2
2 NaN
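
A simple guard against this kind of data loss is to compare the labels in use with the new UEL list before calling setUELs without rename=True. The helper labels_lost below is a hypothetical plain-Python sketch, not a Transfer function.

```python
def labels_lost(used_labels, new_uels):
    """Return the labels that appear in records but are absent from new_uels."""
    allowed = set(new_uels)
    return [label for label in used_labels if label not in allowed]


lost = labels_lost(["i1", "i2", "i3"], ["j1", "i2", "j3", "ham", "cheese"])
print(lost)  # ['i1', 'i3']
```

An empty result means that every label currently in the records survives the call.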

Reordering Symbols

The order of the Container file requires the symbols to be sorted such that, for example, a Set used as domain of another symbol appears before that symbol. The Container will try to establish a valid ordering when writing the data. This type of situation could be encountered if the user is adding and removing many symbols (and perhaps rewriting symbols with the same name) – users should attempt to only add symbols to a Container once, and care must be taken when creating symbol names. The method reorderSymbols attempts to fix symbol ordering problems. The following example shows how this can occur:

Example Symbol reordering
import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=["i" + str(i) for i in range(5)])
j = gt.Set(m, "j", i, records=["i" + str(i) for i in range(3)])
In [1]: m.data
Out[1]:
{'i': <src.gamstransfer.Set at 0x7f90c068a8e0>,
'j': <src.gamstransfer.Set at 0x7f908084ceb0>}
# now we remove the set i and recreate the data
m.removeSymbols("i")
i = gt.Set(m, "i", records=["i" + str(i) for i in range(5)])

The symbols are now out of order in .data and must be reordered:

In [1]: m.data
Out[1]:
{'j': <src.gamstransfer.Set at 0x7f90c068a8e0>,
'i': <src.gamstransfer.Set at 0x7f908084ceb0>}
# calling reorderSymbols() will order the dictionary properly, but the domain reference in j is now broken
m.reorderSymbols()
# fix the domain reference in the set j
j.domain = i
In [1]: m.isValid()
Out[1]: True
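
The ordering requirement is a dependency ordering: every symbol must come after the symbols it uses as domains. The sketch below expresses this with the standard-library graphlib module (Python 3.9 or newer) on a hypothetical set of symbols; it shows the principle and is not the algorithm used by reorderSymbols.

```python
from graphlib import TopologicalSorter

# Map each symbol name to the symbols it uses as domains (hypothetical data)
domains = {"j": {"i"}, "i": set(), "a": {"i", "j"}}

# static_order() yields every symbol after all of the symbols it depends on
order = list(TopologicalSorter(domains).static_order())
print(order)  # ['i', 'j', 'a']
```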

Rename Symbols

It is possible to rename a symbol even after it has been added to a Container. There are two methods that can be used to achieve the desired outcome:

  • using the container method renameSymbol
  • directly changing the name symbol property

We create a Container with two sets:

import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=["seattle", "san-diego"])
j = gt.Set(m, "j", records=["new-york", "chicago", "topeka"])
Example #1 - Change the name of a symbol with the container method
In [1]: m.renameSymbol("i","h")
In [2]: m.data
Out[2]:
{'h': <src.gamstransfer.Set at 0x7f9fc01fc070>,
'j': <src.gamstransfer.Set at 0x7f9f9080a220>}
Example #2 - Change the name of a symbol with the .name attribute
In [1]: i.name = "h"
In [2]: m.data
Out[2]:
{'h': <src.gamstransfer.Set at 0x7f9fc01fc070>,
'j': <src.gamstransfer.Set at 0x7f9f9080a220>}
Note
Note that the renamed symbols maintain the original symbol order; this will prevent unnecessary reordering operations later in the workflow.
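
Keeping the position of a renamed key is not automatic with a Python dictionary: deleting a key and adding a new one would move the entry to the end. The plain-Python sketch below shows one way to rename a key in place; rename_key is a hypothetical helper and the string values merely stand in for symbol objects.

```python
def rename_key(data, old, new):
    """Return a new dict with one key renamed and the original order kept."""
    return {(new if name == old else name): symbol for name, symbol in data.items()}


data = {"i": "<Set i>", "j": "<Set j>"}
renamed = rename_key(data, "i", "h")
print(list(renamed))  # ['h', 'j']
```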

Removing Symbols

Removing symbols from a container is easy when using the removeSymbols container method; this method accepts either a str or a list of str.

Attention
Once a symbol has been removed, it is possible to have hanging references as domain links in other symbols. The user will need to repair these other symbols with the proper domain links in order to avoid validity errors.

GAMS Special Values

The GAMS system contains five special values: UNDEF (undefined), NA (not available), EPS (epsilon), +INF (positive infinity), -INF (negative infinity). These special values must be mapped to their Python equivalents. Transfer uses the following convention to generate the 1:1 mapping:

  • +INF is mapped to float("inf")
  • -INF is mapped to float("-inf")
  • EPS is mapped to -0.0 (mathematically identical to zero)
  • NA is mapped to a special NaN
  • UNDEF is mapped to float("nan")

Transfer syntax is designed to quickly get data into a form that is usable in further analyses or visualization; this mapping also highlights the preference for data that is of type float, which offers performance benefits within Pandas/NumPy. The user does not need to remember these constants as they are provided within the class SpecialValues as SpecialValues.POSINF, SpecialValues.NEGINF, SpecialValues.EPS, SpecialValues.NA, and SpecialValues.UNDEF. The SpecialValues class also contains methods to test for these special values. Some examples are shown below; here we begin to introduce some of the Transfer syntax.

Example (special values in a parameter)
import gams.transfer as gt
m = gt.Container()
x = gt.Parameter(
m,
"x",
["*"],
records=[
("i1", 1),
("i2", gt.SpecialValues.POSINF),
("i3", gt.SpecialValues.NEGINF),
("i4", gt.SpecialValues.EPS),
("i5", gt.SpecialValues.NA),
("i6", gt.SpecialValues.UNDEF),
],
description="special values",
)

The resulting DataFrame for x looks like:

In [1]: x.records
Out[1]:
uni_0 value
0 i1 1.0
1 i2 inf
2 i3 -inf
3 i4 -0.0
4 i5 NaN
5 i6 NaN

The user can now easily test for specific special values in the value column of the DataFrame (returns a boolean array):

In [1]: gt.SpecialValues.isNA(x.records["value"])
Out[1]: array([False, False, False, False, True, False])

Other data structures can be passed into these methods as long as these structures can be converted into a numpy array with dtype=float. It follows that:

In [1]: gt.SpecialValues.isEps(gt.SpecialValues.EPS)
Out[1]: True
In [2]: gt.SpecialValues.isPosInf(gt.SpecialValues.POSINF)
Out[2]: True
In [3]: gt.SpecialValues.isNegInf(gt.SpecialValues.NEGINF)
Out[3]: True
In [4]: gt.SpecialValues.isNA(gt.SpecialValues.NA)
Out[4]: True
In [5]: gt.SpecialValues.isUndef(gt.SpecialValues.UNDEF)
Out[5]: True
In [6]: gt.SpecialValues.isUndef(gt.SpecialValues.NA)
Out[6]: False
In [7]: gt.SpecialValues.isNA(gt.SpecialValues.UNDEF)
Out[7]: False
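
Dedicated test methods are needed because ordinary comparisons cannot identify these values. The plain-Python sketch below shows why: negative zero compares equal to zero, and NaN never compares equal to anything. The helper is_eps is hypothetical and only illustrates a sign-bit test; it is not the implementation of SpecialValues.isEps.

```python
import math


def is_eps(x):
    """True only for negative zero, the float that stands for EPS here."""
    return x == 0.0 and math.copysign(1.0, x) < 0


print(is_eps(-0.0), is_eps(0.0))  # True False
print(-0.0 == 0.0)                # True, so == cannot detect EPS

nan = float("nan")
print(nan == nan)                 # False, so == cannot detect NA or UNDEF
```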

Pandas DataFrames allow data columns to exist with mixed type (dtype=object) – Transfer leverages this convenience feature to enable users to import string representations of EPS, NA, and UNDEF. Transfer is tolerant of any mixed-case special value string representation. Python offers additional flexibility when representing negative/positive infinity. Any string x where float(x) == float("inf") evaluates to True can be used to represent positive infinity. Similarly, any string x where float(x) == float("-inf") evaluates to True can be used to represent negative infinity. Allowed values include inf, +inf, INFINITY, +INFINITY, -inf, -INFINITY and all mixed-case equivalents.

Example (special values defined by strings)
import gams.transfer as gt
m = gt.Container()
x = gt.Parameter(
m,
"x",
["*"],
records=[
("i1", 1),
("i2", "+inf"),
("i3", "-infinity"),
("i4", "eps"),
("i5", "na"),
("i6", "undef"),
],
description="special values",
)

These special strings will be immediately mapped to their float equivalents from the SpecialValues class in order to ensure that all data entries are float types.
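
The string handling described in this section can be approximated with the built-in float constructor, which already accepts mixed-case inf and infinity spellings. The helper parse_value is a hypothetical, simplified sketch: unlike Transfer, it does not distinguish NA from UNDEF and maps both to an ordinary NaN.

```python
import math


def parse_value(text):
    """Convert a number or special-value string to float (simplified sketch)."""
    key = str(text).strip().lower()
    if key == "eps":
        return -0.0
    if key in ("na", "undef"):
        # Simplification: this sketch does not distinguish NA from UNDEF
        return float("nan")
    return float(key)  # handles numbers, inf, +inf, -infinity, ...


parsed = [parse_value(v) for v in (1, "+inf", "-INFINITY", "Eps", "na")]
print(parsed)  # [1.0, inf, -inf, -0.0, nan]
```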

Standard Data Formats

This section is meant to introduce the standard format that Transfer expects for symbol records. It has already been mentioned that we store data as a Pandas DataFrame, but there is an assumed structure to the column headings and column types that will be important to understand. Transfer includes convenience functions in order to ease the burden of converting data from a user-centric format to one that is understood by Transfer. However, advanced users will want to convert their data first and add it directly to the Container to avoid making extra copies of (potentially large) data sets.

Set Records Standard Format

All set records (including singleton sets) are stored as a Pandas DataFrame with n number of columns, where n is the dimensionality of the symbol + 1. The first n-1 columns include the domain elements while the last column includes the set element explanatory text. Records are organized such that there is one record per row.

The names of the domain columns follow a pattern of <set_name>_<index_position>; a symbol dimension that is referenced to the universe is labeled uni_<index_position>. The explanatory text column is called element_text and must take the last position in the DataFrame.
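
The naming pattern can be illustrated with a few lines of plain Python. This is a sketch of the convention as described in this section; the helper name domain_labels is hypothetical and is not a Transfer function.

```python
def domain_labels(domain_names):
    """Build standard-format domain column labels from a list of domain names.

    Hypothetical helper: "*" stands for the universe and is labeled "uni",
    any other name is used as-is, and the index position is appended.
    """
    labels = []
    for position, name in enumerate(domain_names):
        prefix = "uni" if name == "*" else name
        labels.append(f"{prefix}_{position}")
    return labels


# Column layout for a two-dimensional set defined over [i, "*"].
set_columns = domain_labels(["i", "*"]) + ["element_text"]
```

For a set defined over [i, "*"], this yields the columns i_0, uni_1, and element_text, which matches the layout of j.records in the examples of this section.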

All domain columns must be a categorical data type and the element_text column must be an object type. Pandas allows the categories (basically the unique elements of a column) to be various data types as well; however, Transfer requires that all categories are of type str. All rows in the element_text column must also be of type str.

Some examples:

import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=["seattle", "san-diego"])
j = gt.Set(m, "j", [i, "*"], records=[("seattle", "new-york"), ("san-diego", "st-louis")])
k = gt.Set(m, "k", [i], is_singleton=True, records=["seattle"])
In [1]: i.records
Out[1]:
uni_0 element_text
0 seattle
1 san-diego
In [2]: j.records
Out[2]:
i_0 uni_1 element_text
0 seattle new-york
1 san-diego st-louis
In [3]: k.records
Out[3]:
i_0 element_text
0 seattle

Parameter Records Standard Format

All parameter records (including scalars) are stored as a Pandas DataFrame with n number of columns, where n is the dimensionality of the symbol + 1. The first n-1 columns include the domain elements while the last column includes the numerical value of the records. Records are organized such that there is one record per row. Scalar parameters have zero dimension, therefore they only have one column and one row.

The names of the domain columns follow a pattern of <set_name>_<index_position>; a symbol dimension that is referenced to the universe is labeled uni_<index_position>. The value column is called value and must take the last position in the DataFrame.

All domain columns must be a categorical data type and the value column must be a float type. Pandas allows the categories (basically the unique elements of a column) to be various data types as well; however, Transfer requires that all categories are of type str.

Some examples:

import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=["seattle", "san-diego"])
a = gt.Parameter(m, "a", ["*"], records=[("seattle", 50), ("san-diego", 100)])
b = gt.Parameter(
m,
"b",
[i, "*"],
records=[("seattle", "new-york", 32.2), ("san-diego", "st-louis", 123)],
)
c = gt.Parameter(m, "c", records=90)
In [1]: a.records
Out[1]:
uni_0 value
0 seattle 50.0
1 san-diego 100.0
In [2]: b.records
Out[2]:
i_0 uni_1 value
0 seattle new-york 32.2
1 san-diego st-louis 123.0
In [3]: c.records
Out[3]:
value
0 90.0
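
For users who prefer to prepare their data before handing it to Transfer, the layout described above can be assembled with Pandas alone. The following is a sketch under stated assumptions: the raw column names city and capacity are hypothetical, and the final step of attaching the DataFrame to a symbol is intentionally left out.

```python
import pandas as pd

# Hypothetical user data with arbitrary column names and integer values.
raw = pd.DataFrame({"city": ["seattle", "san-diego"], "capacity": [50, 100]})

# Domain column: categorical with str categories, kept in order of first appearance.
labels = raw["city"].astype(str)
domain = pd.Categorical(labels, categories=list(dict.fromkeys(labels)))

# Value column: float type, placed in the last position.
standard = pd.DataFrame({"uni_0": domain, "value": raw["capacity"].astype(float)})
```

Building the categorical explicitly keeps the category order equal to the data order; letting Pandas infer the categories would sort them alphabetically instead.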

Variable/Equation Records Standard Format

Variables and equations share the same standard data format. All records (including scalar variables/equations) are stored as a Pandas DataFrame with n number of columns, where n is the dimensionality of the symbol + 5. The first n-5 columns include the domain elements while the last five columns include the numerical values for different attributes of the records. Records are organized such that there is one record per row. Scalar variables/equations have zero dimension, therefore they have five columns and one row.

The names of the domain columns follow a pattern of <set_name>_<index_position>; a symbol dimension that is referenced to the universe is labeled uni_<index_position>. The attribute columns are called level, marginal, lower, upper, and scale. These attribute columns must appear in this order. Attributes that are not supplied by the user will be assigned the default GAMS values for that variable/equation type. It is also possible to pass no attributes at all, in which case Transfer simply assigns default values to all attributes.

All domain columns must be a categorical data type and all the attribute columns must be a float type. Pandas allows the categories (basically the unique elements of a column) to be various data types as well; however, Transfer requires that all categories are of type str.
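
The default-filling behavior can be pictured with a small sketch in plain Python. It is not Transfer's implementation: the helper name fill_attributes is hypothetical, and the defaults table covers only the free and positive variable types, whose bounds and scale appear in the outputs of this document.

```python
# Attribute columns in the order required by the standard format.
ATTRIBUTE_ORDER = ["level", "marginal", "lower", "upper", "scale"]

# Defaults for two variable types only (assumed here for illustration).
DEFAULTS = {
    "free": {
        "level": 0.0,
        "marginal": 0.0,
        "lower": float("-inf"),
        "upper": float("inf"),
        "scale": 1.0,
    },
    "positive": {
        "level": 0.0,
        "marginal": 0.0,
        "lower": 0.0,
        "upper": float("inf"),
        "scale": 1.0,
    },
}


def fill_attributes(var_type, supplied):
    """Return attribute values in standard order, using defaults where missing."""
    defaults = DEFAULTS[var_type]
    return [float(supplied.get(name, defaults[name])) for name in ATTRIBUTE_ORDER]


# Only the level is supplied; the other four attributes take their defaults.
row = fill_attributes("free", {"level": 50})
```

With only a level supplied, the resulting row for a free variable is 50.0, 0.0, -inf, inf, 1.0, which is the pattern visible in the a.records output of the example that follows.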

Some examples:

import gams.transfer as gt
import pandas as pd
m = gt.Container()
i = gt.Set(m, "i", records=["seattle", "san-diego"])
a = gt.Variable(
m,
"a",
"free",
domain=[i],
records=pd.DataFrame(
[("seattle", 50), ("san-diego", 100)], columns=["city", "level"]
),
)
In [1]: a.records
Out[1]:
i_0 level marginal lower upper scale
0 seattle 50.0 0.0 -inf inf 1.0
1 san-diego 100.0 0.0 -inf inf 1.0

GDX Read/Write

Up until now, we have been focused on using Transfer to create symbols in an empty Container using the symbol constructors (or their corresponding container methods). These tools will enable users to ingest data from many different formats and add them to a Container – however, it is also possible to read in symbol data directly from GDX files using the read container method. In the following sections, we will discuss this method in detail as well as the write method, which allows users to write out to new GDX files.

Read GDX

There are two main ways to read in GDX-based data.

  • Pass the file path directly to the Container constructor (will read all symbols and records)
  • Pass the file path directly to the read method (default read all symbols, but can read partial files)

The first option here is provided for convenience and will, internally, call the read method. This method will read in all symbols as well as their records. This is the easiest and fastest way to get data out of a GDX file and into your Python environment. For the following examples we leverage the GDX output generated from the `trnsport.gms` model file.

Example (reading full data w/ Container constructor)
import gams.transfer as gt
m = gt.Container("trnsport.gdx")
In [1]: m.data
Out[1]:
{'i': <src.gamstransfer.Set at 0x7fdd21858d60>,
'j': <src.gamstransfer.Set at 0x7fdd21858dc0>,
'a': <src.gamstransfer.Parameter at 0x7fdd21858df0>,
'b': <src.gamstransfer.Parameter at 0x7fdd21858d90>,
'd': <src.gamstransfer.Parameter at 0x7fdd21858e80>,
'f': <src.gamstransfer.Parameter at 0x7fdd21858eb0>,
'c': <src.gamstransfer.Parameter at 0x7fdd21858ee0>,
'x': <src.gamstransfer.Variable at 0x7fdd21858f10>,
'z': <src.gamstransfer.Variable at 0x7fdd21858e50>,
'cost': <src.gamstransfer.Equation at 0x7fdd21858f70>,
'supply': <src.gamstransfer.Equation at 0x7fdd21858fa0>,
'demand': <src.gamstransfer.Equation at 0x7fdd21858fd0>}
In [2]: m.describeParameters()
Out[2]:
name is_scalar domain domain_type dim num_recs min_value mean_value max_value where_min where_max count_eps count_na count_undef cardinality sparsity
0 a False [i] regular 1 2 350.000 475.000 600.000 [seattle] [san-diego] 0 0 0 2 0.0
1 b False [j] regular 1 3 275.000 300.000 325.000 [topeka] [new-york] 0 0 0 3 0.0
2 c False [i, j] regular 2 6 0.126 0.176 0.225 [san-diego, topeka] [seattle, new-york] 0 0 0 6 0.0
3 d False [i, j] regular 2 6 1.400 1.950 2.500 [san-diego, topeka] [seattle, new-york] 0 0 0 6 0.0
4 f True [] none 0 1 90.000 90.000 90.000 None None 0 0 0 None None

A user could also read in data with the read method as shown in the following example.

Example (reading full data w/ read method)
import gams.transfer as gt
m = gt.Container()
m.read("trnsport.gdx")
In [1]: m.data
Out[1]:
{'i': <src.gamstransfer.Set at 0x7fdd21858d60>,
'j': <src.gamstransfer.Set at 0x7fdd21858dc0>,
'a': <src.gamstransfer.Parameter at 0x7fdd21858df0>,
'b': <src.gamstransfer.Parameter at 0x7fdd21858d90>,
'd': <src.gamstransfer.Parameter at 0x7fdd21858e80>,
'f': <src.gamstransfer.Parameter at 0x7fdd21858eb0>,
'c': <src.gamstransfer.Parameter at 0x7fdd21858ee0>,
'x': <src.gamstransfer.Variable at 0x7fdd21858f10>,
'z': <src.gamstransfer.Variable at 0x7fdd21858e50>,
'cost': <src.gamstransfer.Equation at 0x7fdd21858f70>,
'supply': <src.gamstransfer.Equation at 0x7fdd21858fa0>,
'demand': <src.gamstransfer.Equation at 0x7fdd21858fd0>}

It is also possible to read in a partial GDX file with the read method, as shown in the following example:

m = gt.Container()
m.read("trnsport.gdx", "x")
In [1]: m.data
Out[1]: {'x': <src.gamstransfer.Variable at 0x7fa728a2d9d0>}
In [2]: m.data["x"].records
Out[2]:
i_0 j_1 level marginal lower upper scale
0 seattle new-york 50.0 0.000 0.0 inf 1.0
1 seattle chicago 300.0 0.000 0.0 inf 1.0
2 seattle topeka 0.0 0.036 0.0 inf 1.0
3 san-diego new-york 275.0 0.000 0.0 inf 1.0
4 san-diego chicago 0.0 0.009 0.0 inf 1.0
5 san-diego topeka 275.0 0.000 0.0 inf 1.0

This syntax assumes that the user will always want to read in both the metadata as well as the actual data records, but it is possible to skip the reading of the records by passing the argument records=False.

m = gt.Container()
m.read("trnsport.gdx", "x", records=False)
In [1]: m.data
Out[1]: {'x': <src.gamstransfer.Variable at 0x7fa728a37220>}
In [2]: m["x"].summary
Out[2]:
{'name': 'x',
'type': 'positive',
'domain_objects': ['i', 'j'],
'domain_names': ['i', 'j'],
'dimension': 2,
'description': 'shipment quantities in cases',
'number_records': None,
'domain_type': 'relaxed'}
In [3]: type(m["x"].records)
Out[3]: <class 'NoneType'>
Attention
The read method attempts to link the domain objects together (in order to have a "regular" domain_type) but if domain sets are not part of the read operation there is no choice but to default to a "relaxed" domain_type. This can be seen in the last example where we only read in the variable x and not the domain sets (i and j) that the variable is defined over. All the data will be available to the user, but domain checking is no longer possible. The symbol x will remain with "relaxed" domain type even if the user were to read in sets i and j in a second read call.

Write GDX

A user can write data to a GDX file by simply passing a file path (as a string). The write method will then create the GDX and write all data in the Container.

Example
m.write("path/to/file.gdx")
Example (write a compressed GDX file)
m.write("path/to/file.gdx", compress=True)

By default, all symbols in the Container will be written, however it is possible to write a subset of the symbols to a GDX file with the symbols argument. If a domain set is not included in the symbols list then the symbol will automatically be relaxed (but will retain the domain set's name as a string label – it does not get relaxed to *). This behavior can be seen in the following example.

import gams.transfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=["i1", "i2"])
a = gt.Parameter(
m,
"a",
[i, i],
records=[("i1", "i1", 10), ("i2", "i2", 12)],
)
m.write("out.gdx", "a")
# create a new container and read in the GDX
m2 = gt.Container("out.gdx")
# look at all the data
In [1]: m2.data
Out[1]: {'a': <src.gamstransfer.Parameter at 0x7f7d90080400>}
# notice that `a` has a relaxed domain type now
In [2]: m2["a"].domain_type
Out[2]: 'relaxed'
# `a` retains the labels from the original domain sets
In [3]: m2["a"].domain
Out[3]: ['i', 'i']
# The original container `m` retains its original state before writing
In [4]: m["a"].domain
Out[4]:
[<src.gamstransfer.Set at 0x7f7d90047df0>,
<src.gamstransfer.Set at 0x7f7d90047df0>]

In line 4 we can see that the auto-relaxation of the domain for a is only temporary for writing (in this case, from Container object m) and will be restored so as not to disturb the Container state.

Advanced users might want to specify an order to their UEL list (i.e., the universe set); recall that the UEL ordering follows that dictated by the data. As a convenience, it is possible to prepend the UEL list with a user specified order using the uel_priority argument.

Example (change the order of the UEL)
m = gt.Container()
i = gt.Set(m, "i", records=["a", "b", "c"])
m.write("foo.gdx", uel_priority=["a", "c"])

The original UEL order for this GDX file would have been ["a", "b", "c"], but this example reorders the UEL with uel_priority – the positions of b and c have been swapped. This can be verified with the gdxdump utility (using the uelTable argument):

gdxdump foo.gdx ueltable=foo

Set foo /
  'a' ,
  'c' ,
  'b' /;
$onEmpty

Set i(*) /
'a',
'c',
'b' /;

$offEmpty
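
The reordering shown above amounts to prepending the priority labels and keeping the remaining labels in their data-driven order. A minimal sketch of that rule in plain Python (the function name apply_uel_priority is hypothetical and only mimics the observable result):

```python
def apply_uel_priority(data_order, uel_priority):
    """Prepend priority labels, then keep the remaining labels in data order."""
    # dict.fromkeys removes repeated labels while preserving their order.
    prioritized = list(dict.fromkeys(uel_priority))
    remaining = [label for label in data_order if label not in prioritized]
    return prioritized + remaining


new_order = apply_uel_priority(["a", "b", "c"], ["a", "c"])
```

For the example above, the data order a, b, c combined with the priority list a, c results in a, c, b, which agrees with the gdxdump listing.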

GamsDatabase Read/Write

We have discussed how to create symbols in an empty Container and we have discussed how to exchange data with GDX files, however it is also possible to read and write data directly in memory by interacting with a GamsDatabase/GMD object – this allows Transfer to be used to read/write data within an Embedded Python Code environment or in combination with the Python OO API. There are some important differences when compared to data exchange with GDX since we are working with data representations in memory.

Read GamsDatabases

Just as with a GDX, there are two main ways to read in data that is in a GamsDatabase/GMD object.

  • Pass the GamsDatabase/GMD object directly to the Container constructor (will read all symbols and records)
  • Pass the GamsDatabase/GMD object directly to the read method (default read all symbols, but can read a subset of symbols)

The first option here is provided for convenience and will, internally, call the read method. This method will read in all symbols as well as their records. This is the easiest and fastest way to get data out of a GamsDatabase/GMD object and into your Python environment. While it is possible to generate a custom GamsDatabase/GMD object from scratch (using the gmdcc API), most users will be interacting with a GamsDatabase/GMD object that has already been instantiated internally when they are using Embedded Python Code or the GamsDatabase class in the Python OO API. Our examples will show how to access the GamsDatabase/GMD object – we leverage some of the data from the `trnsport.gms` model file.

Example (reading full data w/ Container constructor)
m = gt.Container(gams.db)
Note
Embedded Python Code users will want to pass the GamsDatabase object that is part of the GAMS Database object – this will always be referenced as gams.db regardless of the model file.

The following example uses embedded Python code to create a new Container, read in all symbols, and display some summary statistics as part of the gams log output.

Set
   i 'canning plants' / seattle,  san-diego /
   j 'markets'        / new-york, chicago, topeka /;

Parameter
   a(i) 'capacity of plant i in cases'
        / seattle    350
          san-diego  600 /

   b(j) 'demand at market j in cases'
        / new-york   325
          chicago    300
          topeka     275 /;

Table d(i,j) 'distance in thousands of miles'
              new-york  chicago  topeka
   seattle         2.5      1.7     1.8
   san-diego       2.5      1.8     1.4;

$onembeddedCode Python:
import gams.transfer as gt

m = gt.Container(gams.db)
print(m.describeSets())

print(m.describeParameters())

$offEmbeddedCode

The GAMS log output will then look like this (the extra print calls are just providing nice spacing for this example):

GAMS 38.1.0   Copyright (C) 1987-2022 GAMS Development. All rights reserved
--- Starting compilation
--- matrix.gms(29) 3 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
  name  is_singleton domain domain_type  dim  num_recs cardinality sparsity
0    i         False    [*]        none    1         2        None     None
1    j         False    [*]        none    1         3        None     None
  name  is_scalar  domain domain_type  dim  num_recs  min_value  mean_value  max_value            where_min            where_max  count_eps  count_na  count_undef  cardinality  sparsity
0    a      False     [i]     regular    1         2      350.0      475.00      600.0            [seattle]          [san-diego]          0         0            0            2       0.0
1    b      False     [j]     regular    1         3      275.0      300.00      325.0             [topeka]           [new-york]          0         0            0            3       0.0
2    d      False  [i, j]     regular    2         6        1.4        1.95        2.5  [san-diego, topeka]  [seattle, new-york]          0         0            0            6       0.0

[3 rows x 16 columns]

--- Starting execution - empty program
*** Status: Normal completion

A user could also read in a subset of the data located in the GamsDatabase object with the read method as shown in the following example. Here we only read in the sets i and j; as a result, the .describeParameters() method will return None.

Example (reading subset of full data w/ read method)
Set
   i 'canning plants' / seattle,  san-diego /
   j 'markets'        / new-york, chicago, topeka /;

Parameter
   a(i) 'capacity of plant i in cases'
        / seattle    350
          san-diego  600 /

   b(j) 'demand at market j in cases'
        / new-york   325
          chicago    300
          topeka     275 /;

Table d(i,j) 'distance in thousands of miles'
              new-york  chicago  topeka
   seattle         2.5      1.7     1.8
   san-diego       2.5      1.8     1.4;

$onembeddedCode Python:
import gams.transfer as gt

m = gt.Container()
m.read(gams.db, symbols=["i","j"])
gams.printLog("")
print(m.describeSets())
print(m.describeParameters())

$offEmbeddedCode
GAMS 38.1.0   Copyright (C) 1987-2022 GAMS Development. All rights reserved
--- Starting compilation
--- matrix.gms(29) 3 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
---   name  is_singleton domain domain_type  dim  num_recs cardinality sparsity
0    i         False    [*]        none    1         2        None     None
1    j         False    [*]        none    1         3        None     None
None

--- Starting execution - empty program
*** Status: Normal completion

All the typical functionality of the Container exists when working with GamsDatabase/GMD objects. This means that domain linking, matrix conversion, and other more advanced options are available to the user at either compilation time or execution time (depending on the Embedded Code syntax being used, see: Syntax). The next example generates a 1000x1000 matrix and then takes its inverse using the Numpy linalg package.

Example (Matrix Generation and Inversion)
set i / i1*i1000 /;
alias(i,j);

parameter a(i,j);
a(i,j) = 1 / (ord(i)+ord(j) - 1);
a(i,i) = 1;


embeddedCode Python:
import gams.transfer as gt
import numpy as np
import time

gams.printLog("")
s = time.time()
m = gt.Container(gams.db)
gams.printLog(f"read data: {round(time.time() - s, 3)} sec")

s = time.time()
A = m["a"].toDense()
gams.printLog(f"create matrix A: {round(time.time() - s, 3)} sec")

s = time.time()
invA = np.linalg.inv(A)
gams.printLog(f"generate inv(A): {round(time.time() - s, 3)} sec")

endEmbeddedCode
Note
In this example, the assignment of the a parameter is done during execution time so we must use the execution time syntax for embedded code in order to get the numerical records properly.
GAMS 38.1.0   Copyright (C) 1987-2022 GAMS Development. All rights reserved
--- Starting compilation
--- test.gms(27) 3 Mb
--- Starting execution: elapsed 0:00:00.003
--- test.gms(9) 36 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
---
--- read data: 1.1 sec
--- create matrix A: 0.02 sec
--- generate inv(A): 0.031 sec
*** Status: Normal completion

We will extend this example in the next section to write the inverse of matrix A back into a GAMS parameter.

Write to GamsDatabases

A user can write to a GamsDatabase/GMD object with the .write() method just as they would write a GDX file – however, there are some important differences. When a user writes a GDX file, the entire GDX file represents a complete data environment (all domains have been resolved, etc.); thus, Transfer does not need to worry about merge/replace operations. When a user writes to an in-memory data representation with GamsDatabase/GMD, however, symbol records can be merged or replaced. We show a few examples to illustrate this behavior.

Example (Populating a set in GAMS)
* note that we need to declare the set i over "*" in order to provide hints about the symbol dimensionality
set i(*);

$onembeddedCode Python:
import gams.transfer as gt

m = gt.Container()
i = gt.Set(m, "i", records=["i"+str(i) for i in range(10)])
m.write(gams.db)

$offEmbeddedCode i


embeddedCode Python:
import gams.transfer as gt

m = gt.Container(gams.db)
gams.printLog("")
print(m["i"].records)

endEmbeddedCode
Note
In general, it is possible to use Transfer to create new symbols in a GamsDatabase and GMD object (and not necessarily merge symbols) but embedded code best practices necessitate the declaration of any GAMS symbols on the GAMS side first, then the records can be filled with Transfer.

If we break down this example we can see that the set i is declared within GAMS (with no records) and then the records for i are set by writing a Container to the gams.db GamsDatabase object (we do this at compile time). The second embedded Python code block runs at execution time and is simply there to read all the records on the set i – printing the sets this way adds the output to the .log file (we could also use the more common display i; operation in GAMS to display the set elements in the LST file).

GAMS 38.1.0   Copyright (C) 1987-2022 GAMS Development. All rights reserved
--- Starting compilation
--- test.gms(10) 2 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
--- test.gms(20) 3 Mb
--- Starting execution: elapsed 0:00:01.464
--- test.gms(13) 4 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
---   uni_0 element_text
0    i0
1    i1
2    i2
3    i3
4    i4
5    i5
6    i6
7    i7
8    i8
9    i9

*** Status: Normal completion
Example (Merging set records)
set i / i1, i2 /;

$onmulti
$onembeddedCode Python:
import gams.transfer as gt

m = gt.Container()
i = gt.Set(m, "i", records=["i"+str(i) for i in range(10)])
m.write(gams.db, merge_symbols="i")

$offEmbeddedCode i
$offmulti

embeddedCode Python:
import gams.transfer as gt

m = gt.Container(gams.db)
gams.printLog("")
print(m["i"].records)

endEmbeddedCode

In this example we need to make use of $onMulti/$offMulti in order to merge new set elements into the set i (the same would be true if we were merging other symbol types) – any symbol that already has records defined (in GAMS) and is being added to with Python (and Transfer) must be wrapped with $onMulti/$offMulti. As with the previous example, the second embedded Python code block runs at execution time and is simply there to read all the records on the set i. Note that the UEL order will be different in this case (i1 and i2 come before i0).

GAMS 38.1.0   Copyright (C) 1987-2022 GAMS Development. All rights reserved
--- Starting compilation
--- test.gms(11) 3 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
--- test.gms(21) 3 Mb
--- Starting execution: elapsed 0:00:01.535
--- test.gms(14) 4 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
---   uni_0 element_text
0    i1
1    i2
2    i0
3    i3
4    i4
5    i5
6    i6
7    i7
8    i8
9    i9

*** Status: Normal completion
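
The ordering seen in this log is what an order-preserving union produces: elements that already exist keep their positions and new elements are appended. A minimal sketch in plain Python (the function name merge_elements is hypothetical; it only mimics the observable result and says nothing about how GAMS performs the merge internally):

```python
def merge_elements(existing, incoming):
    """Order-preserving union: existing elements first, then unseen incoming ones."""
    # dict.fromkeys keeps the first occurrence of each label and drops repeats.
    return list(dict.fromkeys(list(existing) + list(incoming)))


existing = ["i1", "i2"]                  # elements declared in GAMS
incoming = [f"i{n}" for n in range(10)]  # elements written from the Container
merged = merge_elements(existing, incoming)
```

The merged list starts with i1 and i2, followed by i0 and then i3 through i9, as in the log above.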
Example (Replacing set records)
set i / x1, x2 /;

$onmultiR
$onembeddedCode Python:
import gams.transfer as gt

m = gt.Container()
i = gt.Set(m, "i", records=["i"+str(i) for i in range(10)])
m.write(gams.db)

$offEmbeddedCode i
$offmulti

embeddedCode Python:
import gams.transfer as gt

m = gt.Container(gams.db)
gams.printLog("")
print(m["i"].records)

endEmbeddedCode

In this example we want to replace the x1 and x2 set elements and build up a totally new element list with set elements from the Container. Instead of $onMulti/$offMulti we must use $onMultiR/$offMulti to ensure that the replacement happens in GAMS; we also need to remove the set i from the merge_symbols argument.

Attention
If the user seeks to replace all records in a symbol they must use the $onMultiR syntax. It is not sufficient to simply remove them from the merge_symbols argument in Transfer. If the user mistakenly uses $onMulti the symbols will end up merging without total replacement.
GAMS 38.1.0   Copyright (C) 1987-2022 GAMS Development. All rights reserved
--- Starting compilation
--- test.gms(11) 3 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
--- test.gms(21) 3 Mb
--- Starting execution: elapsed 0:00:01.482
--- test.gms(14) 4 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
---   uni_0 element_text
0    i0
1    i1
2    i2
3    i3
4    i4
5    i5
6    i6
7    i7
8    i8
9    i9

*** Status: Normal completion
Example (Merging parameter records)
set i;
parameter a(i<) /
i1 1.23
i2 5
/;

$onmulti
$onembeddedCode Python:
import gams.transfer as gt

m = gt.Container()
i = gt.Set(m, "i", records=["i"+str(i) for i in range(10)])
a = gt.Parameter(m, "a", domain=i, records=[("i"+str(i),i) for i in range(10)])
m.write(gams.db, merge_symbols="a")

$offEmbeddedCode i, a
$offmulti

embeddedCode Python:
import gams.transfer as gt

m = gt.Container(gams.db)
gams.printLog("")
print(m["a"].records)
endEmbeddedCode

In this example we also need to make use of $onMulti/$offMulti in order to merge new set elements into the set i; however, the set i also needs to contain the elements that are defined in the parameter – here we make use of the < operator, which adds the set elements from a(i) into the set i.

Note
It would also be possible to run this example by explicitly defining the set i /i1, i2/; before the parameter declaration.
Attention
Transfer will overwrite all duplicate records when merging. The original values of a("i1") and a("i2") have been replaced with their new values when writing the Container in this example (see output below).
GAMS 38.1.0   Copyright (C) 1987-2022 GAMS Development. All rights reserved
--- Starting compilation
--- test.gms(16) 3 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
--- test.gms(25) 3 Mb
--- Starting execution: elapsed 0:00:01.467
--- test.gms(19) 4 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
---   i_0  value
0  i1    1.0
1  i2    2.0
2  i3    3.0
3  i4    4.0
4  i5    5.0
5  i6    6.0
6  i7    7.0
7  i8    8.0
8  i9    9.0

*** Status: Normal completion
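
The overwrite behavior can be reproduced with a dictionary merge in plain Python. This is a sketch that mirrors the log above and is not Transfer's merge logic; in particular, dropping records with a value of zero is an assumption made here to explain why no i0 record appears in the output.

```python
# Records declared in GAMS and records written from the Container.
existing = {"i1": 1.23, "i2": 5.0}
incoming = {f"i{n}": float(n) for n in range(10)}

# Incoming values win for duplicate keys; existing keys keep their position.
merged = {**existing, **incoming}

# Assumption made to mirror the log above: records with value 0 are not kept.
merged = {key: value for key, value in merged.items() if value != 0.0}
```

After the merge, a("i1") and a("i2") carry the values 1.0 and 2.0 from the Container rather than the original 1.23 and 5.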
Example (Advanced Matrix Generation and Inversion w/ Write Operation)
set i / i1*i1000 /;
alias(i,j);

parameter a(i,j);
a(i,j) = 1 / (ord(i)+ord(j) - 1);
a(i,i) = 1;

parameter inv_a(i,j);
parameter ident(i,j);

embeddedCode Python:
import gams.transfer as gt
import numpy as np
import time

gams.printLog("")
gams.printLog("")

s = time.time()
m = gt.Container(gams.db)
gams.printLog(f"read data: {round(time.time() - s, 3)} sec")

s = time.time()
A = m["a"].toDense()
gams.printLog(f"create matrix A: {round(time.time() - s, 3)} sec")

s = time.time()
invA = np.linalg.inv(A)
gams.printLog(f"calculate inv(A): {round(time.time() - s, 3)} sec")

s = time.time()
m["inv_a"].setRecords(invA)
gams.printLog(f"convert matrix to records for inv(A): {round(time.time() - s, 3)} sec")

s = time.time()
I = np.dot(A,invA)
tol = 1e-9
I[np.where((I<tol) & (I>-tol))] = 0
gams.printLog(f"calculate A*invA + small number cleanup: {round(time.time() - s, 3)} sec")

s = time.time()
m["ident"].setRecords(I)
gams.printLog(f"convert matrix to records for I: {round(time.time() - s, 3)} sec")

s = time.time()
m.write(gams.db, ["inv_a","ident"])
gams.printLog(f"write to GamsDatabase: {round(time.time() - s, 3)} sec")

gams.printLog("")
endEmbeddedCode inv_a, ident

display ident;

In this example we extend the example shown in Read GamsDatabases to read data from GAMS, calculate a matrix inversion, do the matrix multiplication, and then write both the A^-1 and A*A^-1 (i.e., the identity matrix) back to GAMS for display in the LST file. This data round trip highlights the benefits of using a Transfer Container (and the linked symbol structure) as the mechanism to move data – converting back and forth from a records format to a matrix format can be cumbersome, but here, Transfer takes care of all the indexing for the user.
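
To give a feeling for the indexing work involved, the following plain Python sketch converts between a records format and a dense matrix using the order of the domain labels. The function names to_dense and to_records are hypothetical stand-ins; they are not how .toDense() or .setRecords() are implemented.

```python
def to_dense(records, row_labels, col_labels):
    """Place (row, col, value) records into a dense list-of-lists matrix."""
    row_pos = {label: n for n, label in enumerate(row_labels)}
    col_pos = {label: n for n, label in enumerate(col_labels)}
    dense = [[0.0] * len(col_labels) for _ in row_labels]
    for row, col, value in records:
        dense[row_pos[row]][col_pos[col]] = float(value)
    return dense


def to_records(dense, row_labels, col_labels):
    """Inverse operation: keep only the nonzero entries as records."""
    return [
        (row, col, dense[r][c])
        for r, row in enumerate(row_labels)
        for c, col in enumerate(col_labels)
        if dense[r][c] != 0.0
    ]


plants = ["seattle", "san-diego"]
markets = ["new-york", "chicago", "topeka"]
records = [("seattle", "new-york", 2.5), ("san-diego", "topeka", 1.4)]

dense = to_dense(records, plants, markets)
round_trip = to_records(dense, plants, markets)
```

The shape of the matrix comes entirely from the domain label lists, which is why the domain sets must be available before a dense matrix can be generated.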

The first few lines of GAMS code generate a 1000x1000 A matrix as a parameter (at execution time). We then define two more parameters that we will fill with results of the embedded Python code – specifically, we want to fill a parameter with the matrix A^-1 and we want to verify that another parameter (ident) contains the identity matrix (i.e., I). Stepping through the code:

  1. We start the embedded Python code section (execution time) by importing both Transfer and NumPy and by reading all the symbols that currently exist in the GamsDatabase. We must read in all this information in order to get the domain set information – Transfer needs these domain sets in order to generate matrices with the proper size.
  2. Generate the matrix A by calling .toDense() on the symbol object in the Container.
  3. Take the inverse of A with np.linalg.inv().
  4. The Parameter symbol for inv_a already exists in the Container, but it does not have any records (i.e., m["inv_a"].records is None will evaluate to True). We use .setRecords() to convert the invA back into a records format.
  5. We continue the computations by performing the matrix multiplication using np.dot() – we must clean up a lot of small numbers in I.
  6. The Parameter symbol for ident already exists in the Container, but it does not have any records. We use .setRecords() to convert I back into a records format.
  7. Since we are calculating these parameter values at execution time, it is not possible to modify the domain set information (or even merge/replace it). Therefore we only want to write the parameter values to GAMS. We achieve this by writing a subset of the Container symbols out with the m.write(gams.db, ["inv_a","ident"]) call. This partial write preserves symbol validity in the Container and it does not violate other GAMS requirements.
  8. Finally, we can verify that the (albeit large) identity matrix exists in the LST file (or in another GDX file).
Note
It was not possible to just use np.round because small negative numbers that round to -0.0 will be interpreted by Transfer as the GAMS EPS special value.
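
The difference between rounding and thresholding can be seen on a small instance. The sketch below uses a 3x3 version of the matrix from this example (the size is chosen only to keep the illustration short) and checks the sign bit with np.signbit.

```python
import numpy as np

# Small 3x3 instance of the matrix used above:
# a(i,j) = 1 / (ord(i) + ord(j) - 1), with a(i,i) = 1.
n = 3
idx = np.arange(1, n + 1)
A = 1.0 / (idx[:, None] + idx[None, :] - 1)
np.fill_diagonal(A, 1.0)

product = A @ np.linalg.inv(A)

# Rounding keeps the sign bit: a tiny negative value becomes -0.0.
rounded = np.round(np.array([-1e-12, 1e-12]), 9)

# Thresholding assigns a plain 0.0 to every entry inside the tolerance band.
tol = 1e-9
cleaned = product.copy()
cleaned[(cleaned < tol) & (cleaned > -tol)] = 0
```

np.signbit(rounded) reports True for the first entry, which is the -0.0 that would be read as EPS, whereas no entry of cleaned carries a negative sign bit.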

The output for this example is shown below:

GAMS 38.1.0   Copyright (C) 1987-2022 GAMS Development. All rights reserved
--- Starting compilation
--- matrix.gms(52) 3 Mb
--- Starting execution: elapsed 0:00:00.004
--- matrix.gms(11) 36 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
---
---
--- read data: 1.083 sec
--- create matrix A: 0.016 sec
--- calculate inv(A): 0.032 sec
--- convert matrix to records for inv(A): 0.176 sec
--- calculate A*invA + small number cleanup: 0.027 sec
--- convert matrix to records for I: 0.17 sec
--- write to GamsDatabase: 1.937 sec
---
--- matrix.gms(52) 68 Mb
*** Status: Normal completion