Table of Contents
- Install
- Design
- Quick Start
- Create a Container
- Create a Set
- Create a Parameter
- Create a Variable
- Create an Equation
- Create an Alias
- Validating Data
- Domain Forwarding
- Describing Data
- Matrix Generation
- The Universe Set
- Reordering Symbols
- Rename Symbols
- Removing Symbols
- Full Example
- Advanced Examples
- GAMS Special Values
- GAMS Transfer Standard Data Formats
- Data Exchange with GDX
- Data Exchange with GamsDatabase and GMD Objects (Embedded Python Code)
- ConstContainer (Rapid Read)
GAMS Transfer is a package to maintain GAMS data outside a GAMS script in a programming language like Python or Matlab. It allows the user to add GAMS symbols (Sets, Aliases, Parameters, Variables and Equations), to manipulate GAMS symbols, and to read/write symbols to different data endpoints. GAMS Transfer's main focus is the highly efficient transfer of data between GAMS and the target programming language, while keeping those operations as simple as possible for the user. To achieve this, symbol records (the actual and potentially large-scale data sets) are stored in native data structures of the corresponding programming language. The benefits of this approach are threefold: (1) the user is usually very familiar with these data structures, (2) these data structures come with a large toolbox for various data operations, and (3) optimized methods for reading from and writing to GAMS can transfer the data in bulk, which is the source of this package's high performance. This documentation describes, in detail, the use of GAMS Transfer within a Python environment.
Data within GAMS Transfer is stored as Pandas DataFrames. The flexible nature of Pandas DataFrames makes them ideal for storing and manipulating sparse data. Pandas includes advanced operations for indexing and slicing, reshaping, merging and even visualization.
Pandas also includes a number of advanced data I/O tools that allow users to generate DataFrames directly from CSV (.csv), JSON (.json), HTML (.html), Microsoft Excel (.xls, .xlsx), SQL, pickle (.pkl), SPSS (.sav, .zsav), SAS (.xpt, .sas7bdat), etc.
Centering GAMS Transfer around the Pandas DataFrame gives GAMS users (on a variety of platforms – MacOS, Windows, Linux) access to tools to move data back and forth between their favorite environments for use in their GAMS models.
The goal of this documentation is to introduce the user to GAMS Transfer and its functionality. This documentation is not designed to teach the user how to effectively manipulate Pandas DataFrames; users seeking a deeper understanding of Pandas are referred to the extensive documentation.
Install
The user must download and install the latest version of GAMS in order to install GAMS Transfer. GAMS Transfer is installed when the GAMS Python API is built and installed; the user is referred to the GAMS Python API installation instructions for details. GAMS Transfer and all GAMS Python API files are compatible with environment managers such as Anaconda.
Design
Storing, manipulating, and transforming sparse data requires that it lives within an environment, where the data can be linked together to enable various operations. In GAMS Transfer we refer to this "environment" as the Container; it is the main repository for storing and linking our sparse data. Symbols can be added to the Container from a variety of GAMS starting points, but they can also be generated directly within the Python environment using convenient function calls that are part of the GAMS Transfer package. A symbol can only belong to one container at a time.
The process of linking symbols together within a container was inspired by typical GAMS workflows but leverages aspects of object oriented programming to make linking data a natural process. Linking data enables data operations like implicit set growth, domain checking, data format transformations (to dense/sparse matrix formats), etc – all of these features are enabled by the use of ordered pandas.CategoricalDtype data types. All of these details will be discussed in the following sections.
Naming Conventions
Methods (functions that operate on an object) are all verbs (e.g., getMaxAbsValue(), getUniverseSet(), etc.) and use camel case for identification purposes. Methods are, by convention, tools that "do things"; that is, they involve some potentially expensive computations. Some GAMS Transfer methods accept arguments, while others are simply called using the () notation. Plural arguments (columns) hint that they can accept lists of inputs (e.g., a list of symbol names), while singular arguments (column) will only accept one input at a time.
Properties (inherent attributes of an object) are all nouns (e.g., name, number_records, etc.) and use snake case (lower case words separated by underscores) for identification purposes. Object properties (or "object attributes") are fundamental to the object and therefore they are not called like methods; they are simply accessed by other methods or user calls. By convention, properties only require trivial amounts of computation to access.
Classes – the basic structure of an object – are all singular nouns and use camel case (starting with a capital first letter) for identification purposes.
Quick Start
GDX Read
Reading in all symbols can be accomplished with one line of code (we reference data from the `trnsport.gms` example).
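A minimal sketch of such a one-line read, assuming the package is importable as `gams.transfer` (the import path may differ between releases) and that a GDX file generated from `trnsport.gms` exists at a path of your choosing. The helper names `read_all_symbols` and `records_of` are hypothetical and exist only to wrap the calls for illustration.

```python
def read_all_symbols(gdx_path):
    """Read every symbol of a GDX file into a new Container."""
    # Imported lazily so this sketch can be loaded without a GAMS installation.
    # The module path is an assumption and may differ between releases.
    import gams.transfer as gt

    # Passing a path to the constructor reads all symbols in one call.
    return gt.Container(gdx_path)


def records_of(container, symbol_name):
    """Return the records (a Pandas DataFrame) of one symbol in the Container."""
    return container.data[symbol_name].records
```

For example, `m = read_all_symbols("trnsport.gdx")` followed by `records_of(m, "d")` would return the records of the distance parameter, provided that hypothetical file exists.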
All symbol data is organized in the data attribute (a Python dict), so m.data[<symbol_name>].records returns the records of a symbol; records are stored as Pandas DataFrames.
Write Symbol to CSV
Writing symbol records to a CSV can also be accomplished with one line.
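Because records are plain Pandas DataFrames, the CSV export is the regular `DataFrame.to_csv` call. The sketch below uses a small stand-in DataFrame rather than a real symbol; its column labels (`i_0`, `j_1`, `value`) are assumptions made for illustration, and with a real Container the same call would be made on `m.data[<symbol_name>].records`.

```python
import io

import pandas as pd

# Stand-in for the records of a 2D parameter (hypothetical column labels)
records = pd.DataFrame(
    {
        "i_0": ["seattle", "seattle", "san-diego"],
        "j_1": ["new-york", "chicago", "topeka"],
        "value": [2.5, 1.7, 1.4],
    }
)

# The one-line export; an in-memory buffer is used here instead of a file path
buffer = io.StringIO()
records.to_csv(buffer, index=False)

csv_text = buffer.getvalue()
```

Passing a file name such as `"d.csv"` instead of the buffer would write the same content to disk.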
Write a New GDX
There are five symbol classes within GAMS Transfer: 1) Sets, 2) Parameters, 3) Variables, 4) Equations and 5) Aliases. For purposes of this quick start, we show how to recreate the distance data structure from the `trnsport.gms` model (the parameter d). This brief example shows how users can achieve "GAMS-like" functionality within a Python environment; GAMS Transfer leverages object-oriented programming to simplify syntax.
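A sketch of how the distance parameter could be recreated and written to a new GDX file. It assumes the package is importable as `gams.transfer` (the import path may differ between releases); the function name `write_distance_gdx`, the DataFrame column names and the output path are hypothetical choices made for illustration. The distance values are those of the `trnsport.gms` model.

```python
import pandas as pd

# Distances in thousands of miles between plants and markets
dist = pd.DataFrame(
    [
        ("seattle", "new-york", 2.5),
        ("seattle", "chicago", 1.7),
        ("seattle", "topeka", 1.8),
        ("san-diego", "new-york", 2.5),
        ("san-diego", "chicago", 1.8),
        ("san-diego", "topeka", 1.4),
    ],
    columns=["from", "to", "thousand_miles"],
)


def write_distance_gdx(gdx_path):
    """Build sets i and j plus parameter d, then write them to a GDX file."""
    # Lazy import: the module path is an assumption and needs a GAMS installation
    import gams.transfer as gt

    m = gt.Container()  # an empty Container, analogous to an empty GDX file

    # Every symbol receives the Container reference m as its first argument
    i = gt.Set(m, "i", records=dist["from"].unique(), description="supply")
    j = gt.Set(m, "j", records=dist["to"].unique(), description="markets")

    # Passing the Set objects as the domain links d to i and j
    gt.Parameter(
        m, "d", domain=[i, j], records=dist,
        description="distance in thousands of miles",
    )

    m.write(gdx_path)  # one-line bulk write
    return m
```

Calling `write_distance_gdx("out.gdx")` would then produce the file in the current working directory.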
This example shows a few fundamental features of GAMS Transfer:
- An empty Container is analogous to an empty GDX file
- Symbols will always be linked to a Container (notice that we always pass the Container reference m to the symbol constructor)
- Records can be added to a symbol with the setRecords() method or through the records constructor argument (internally calls setRecords()). GAMS Transfer will convert many common Python data structures into a standard format.
- Domain linking is possible by passing domain set objects to other symbols; this will also enable domain checking (violations will show up as NaN)
- Writing a GDX file can be accomplished in one line with the write() method.
This Quick Start introduced the reader to the GAMS Transfer syntax, but in the remaining sections we will present details about the core functionality and dig further into the syntax. Specifically, we will discuss:
- How to create a Container
- How to add symbols to a Container
- How to validate the data that is in the Container
- How to define domains implicitly with domain_forwarding
- How to describe the data that is in the Container
- How to transform data that is in the Container
- What the universe set is (UEL list)
- How to reorder symbols in the Container
- How to remove symbols from the Container
Create a Container
The main object class within GAMS Transfer is called Container. The Container is the vessel that allows symbols to be linked together (through their domain definitions); it enables implicit set definitions, it enables structural manipulations of the data (matrix generation), and it allows the user to perform different read/write operations.
Container constructor
Argument | Type | Description | Required | Default |
---|---|---|---|---|
load_from | str , GMD Object Handle, GamsDatabase Object, ConstContainer | Points to the source of the data being read into the Container | No | None |
system_directory | str | Absolute path to GAMS system directory | No | Attempts to find the GAMS installation by creating a GamsWorkspace object and loading the system_directory attribute. |
Creating a Container is a simple matter of initializing an object. For example:
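A minimal sketch of the initialization, assuming the package is importable as `gams.transfer` (the import path may differ between releases). The wrapper `make_container` is hypothetical; it only forwards the two constructor arguments listed in the table above.

```python
def make_container(load_from=None, system_directory=None):
    """Create a Container, optionally loading data and pinning the GAMS directory."""
    # Lazy import: the module path is an assumption and needs a GAMS installation
    import gams.transfer as gt

    # Forward only the arguments the caller actually supplied
    kwargs = {}
    if load_from is not None:
        kwargs["load_from"] = load_from
    if system_directory is not None:
        kwargs["system_directory"] = system_directory

    return gt.Container(**kwargs)
```

With no arguments, `m = make_container()` corresponds to creating an empty Container.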
This new Container object, here called m, contains a number of convenient properties and methods that allow the user to interact with the symbols that are in the Container. Some of these methods are used to filter out different types of symbols, other methods are used to numerically characterize the data within each symbol.
Container properties
Property | Description | Type | Special Setter Behavior |
---|---|---|---|
data | main dict that is used to store all symbol data | dict | - |
Symbols are organized in the Container under the data attribute. The dot notation (m.data) is used to access the underlying dict; symbols in this dict can then be retrieved with the standard bracket notation (m.data[<symbol_name>]).
Container methods
Method | Description | Arguments/Defaults | Returns |
---|---|---|---|
addAlias | Container method to add an Alias | name (str ) alias_with (Set ,Alias ) | Alias object |
addEquation | Container method to add an Equation | name (str ) type (str ) domain=[] (str ,list ) records=None (pandas.DataFrame ,numpy.ndarray ,None ) domain_forwarding=False (bool ) description="" (str ) | Equation object |
addParameter | Container method to add a Parameter | name (str ) domain=None (str ,list ,None ) records=None (pandas.DataFrame ,numpy.ndarray ,None ) domain_forwarding=False (bool ) description="" (str ) | Parameter object |
addSet | Container method to add a Set | name (str ) domain=None (str ,list ,None ) is_singleton=False (bool ) records=None (pandas.DataFrame ,numpy.ndarray ,None ) domain_forwarding=False (bool ) description="" (str ) | Set object |
addVariable | Container method to add a Variable | name (str ) type="free" (str ) domain=[] (str ,list ) records=None (pandas.DataFrame ,numpy.ndarray ,None ) domain_forwarding=False (bool ) description="" (str ) | Variable object |
describeAliases | create a summary table with descriptive statistics for Aliases | symbols=None (None ,str ,list ) - if None , assumes all aliases | pandas.DataFrame |
describeParameters | create a summary table with descriptive statistics for Parameters | symbols=None (None ,str ,list ) - if None , assumes all parameters | pandas.DataFrame |
describeEquations | create a summary table with descriptive statistics for Equations | symbols=None (None ,str ,list ) - if None , assumes all equations | pandas.DataFrame |
describeSets | create a summary table with descriptive statistics for Sets | symbols=None (None ,str ,list ) - if None , assumes all sets | pandas.DataFrame |
describeVariables | create a summary table with descriptive statistics for Variables | symbols=None (None ,str ,list ) - if None , assumes all variables | pandas.DataFrame |
getSymbols | returns a list of object references for symbols | symbols (str ,list ) | list |
getUniverseSet | provides a universe for all symbols , the symbols argument allows GAMS Transfer to create a partial universe if writing only a subset of symbols (currently only supported when writing to GamsDatabases or GMD Objects) | symbols=None (None ,str ,list ) | list |
isValid | True if all symbols in the Container are valid | - | bool |
listAliases | list all aliases (is_valid=None ), list all valid aliases (is_valid=True ), list all invalid aliases (is_valid=False ) in the container | is_valid=None (bool ,None ) | list |
listEquations | list all equations (is_valid=None ), list all valid equations (is_valid=True ), list all invalid equations (is_valid=False ) in the container | is_valid=None (bool ,None ) types=None (list of equation types) - if None , assumes all types | list |
listParameters | list all parameters (is_valid=None ), list all valid parameters (is_valid=True ), list all invalid parameters (is_valid=False ) in the container | is_valid=None (bool ,None ) | list |
listSets | list all sets (is_valid=None ), list all valid sets (is_valid=True ), list all invalid sets (is_valid=False ) in the container | is_valid=None (bool ,None ) | list |
listSymbols | list all symbols (is_valid=None ), list all valid symbols (is_valid=True ), list all invalid symbols (is_valid=False ) in the container | is_valid=None (bool ,None ) | list |
listVariables | list all variables (is_valid=None ), list all valid variables (is_valid=True ), list all invalid variables (is_valid=False ) in the container | is_valid=None (bool ,None ) types=None (list of variable types) - if None , assumes all types | list |
read | main method to read load_from , can be provided with a list of symbols to read in subsets, records controls if symbol records are loaded or just metadata | load_from (str ,GMD Object Handle,GamsDatabase Object,ConstContainer ) symbols="all" (str , list ) records=True (bool ) | None |
removeSymbols | symbols to remove from the Container, also sets the symbols ref_container to None | symbols (str ,list ) | None |
renameSymbol | rename a symbol in the Container | old_name (str ), new_name (str ) | None |
reorderSymbols | reorder symbols in order to avoid domain violations | - | None |
write | main bulk write method to a write_to target | write_to (str ,GamsDatabase ,GMD Object) write_symbols=None (None ,str ,list ) - if None , assumes all symbols compress=False (bool ) uel_priority=None (str ,list ) merge_symbols=None (None ,str ,list ) | None |
Create a Set
There are two different ways to create a GAMS set and add it to a Container.
- Use the Set constructor
- Use the Container method addSet (which internally calls the Set constructor)
Set Constructor
Argument | Type | Description | Required | Default |
---|---|---|---|---|
container | Container | A reference to the Container object that the symbol is being added to | Yes | - |
name | str | Name of symbol | Yes | - |
domain | list | List of domains given either as string ('*' for universe set) or as reference to a Set object | No | ["*"] |
is_singleton | bool | Indicates if set is a singleton set (True ) or not (False ) | No | False |
records | many | Symbol records | No | None |
domain_forwarding | bool | Flag that forces set elements to be recursively included in all parent sets (i.e., implicit set growth) | No | False |
description | str | Description of symbol | No | "" |
Set Properties
Property | Description | Type | Special Setter Behavior |
---|---|---|---|
description | description of symbol | str | - |
dimension | dimension of symbol | int | setting is a shorthand notation to create ["*"] * n domains in symbol |
domain_forwarding | flag that forces set elements to be recursively included in all parent sets (i.e., implicit set growth) | bool | no effect after records have been set |
domain_labels | column headings for the records DataFrame | list of str | - |
domain_names | string version of domain names | list of str | - |
domain_type | none , relaxed or regular depending on state of domain links | str | - |
is_singleton | bool if symbol is a singleton set | bool | - |
name | name of symbol | str | sets the GAMS name of the symbol |
number_records | number of symbol records (i.e., returns len(self.records) if not None ) | int | - |
records | the main symbol records | pandas.DataFrame | responsive to domain_forwarding state |
ref_container | reference to the Container that the symbol belongs to | Container | - |
summary | output a dict of only the metadata | dict | - |
Set Methods
Method | Description | Arguments/Defaults | Returns |
---|---|---|---|
getCardinality | get the full cartesian product of the domain | - | int |
getSparsity | get the sparsity of the symbol w.r.t the cardinality | - | float |
findDomainViolations | get the index of records that contain any domain violations | - | pandas.Index |
isValid | checks if the symbol is in a valid format, throw exceptions if verbose=True , recheck a symbol if force=True | verbose=False force=True | bool |
setRecords | main convenience method to set standard pandas.DataFrame formatted records | records (many types) | None |
Adding Set Records
Three possibilities exist to assign symbol records to a set (roughly ordered in complexity):
- Setting the argument records in the set constructor/container method (internally calls setRecords) - creates a data copy
- Using the symbol method setRecords - creates a data copy
- Setting the property records directly - does not create a data copy
If the data is in a convenient format, a user may want to pass the records directly within the set constructor. This is an optional keyword argument and internally the set constructor will simply call the setRecords method. The symbol method setRecords is a convenience method that transforms the given data into an approved Pandas DataFrame format (see GAMS Transfer Standard Data Formats). Many native Python data types can be easily transformed into DataFrames, so the setRecords method for Set objects will accept a number of different types for input. The setRecords method is called internally on any data structure that is passed through the records argument. We show a few examples of ways to create differently structured sets:
- Example #1 - Create a 1D set from a list
- Example #2 - Create a 1D set from a tuple
- Example #3 - Create a 2D set from a list of tuples
- Example #4 - Create a 1D set from a DataFrame slice + .unique()
- Note
- The .unique() method preserves the order of appearance, unlike set().
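The difference between `.unique()` and `set()` noted above can be seen with Pandas alone. The DataFrame below is a made-up stand-in for a reference table from which a 1D set would be sliced; its column names are assumptions for illustration.

```python
import pandas as pd

# Hypothetical reference data with repeated labels in the "from" column
ref = pd.DataFrame(
    {
        "from": ["seattle", "seattle", "san-diego", "seattle", "san-diego"],
        "to": ["new-york", "chicago", "new-york", "topeka", "topeka"],
    }
)

# .unique() keeps labels in order of first appearance
ordered_labels = list(ref["from"].unique())

# set() also removes duplicates, but its iteration order is not guaranteed
unordered_labels = set(ref["from"])
```

Here `ordered_labels` is `['seattle', 'san-diego']`, which makes the slice a predictable input for the `records` argument of a set.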
Set element text is very handy when labeling specific set elements within a set. A user can add a set element text directly with a set element. Note that it is not required to label all set elements, as can be seen in the following example.
- Example #5 - Add set element text
Directly Set Records
The primary advantage of the setRecords method is that GAMS Transfer will convert many different (and convenient) data types into the standard data format (a Pandas DataFrame). Users that require higher performance will want to directly pass the Container a reference to a valid Pandas DataFrame, thereby skipping some of these computational steps. This places more burden on the user to pass the data in a valid standard form, but it speeds the records setting process and it avoids making a copy of the data in memory. In this section we walk the user through an example of how to set records directly.
- Example #1 - Directly set records (1D set)
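A sketch of what directly setting the records of a 1D set might look like. The DataFrame preparation uses Pandas only; the GAMS Transfer part is wrapped in a hypothetical helper (`attach_to_new_set`) and assumes the package is importable as `gams.transfer`, which may differ between releases. The 24 labels are made-up sample data.

```python
import pandas as pd

# Build the records by hand in the layout of a 1D set over the universe:
# one domain column (uni_0) and one element text column
df_i = pd.DataFrame(
    [("h" + str(h), "") for h in range(24)],
    columns=["uni_0", "element_text"],
)

# Domain columns must be ordered categoricals, so create a custom dtype
dtype_i = pd.CategoricalDtype(categories=df_i["uni_0"].unique(), ordered=True)
df_i["uni_0"] = df_i["uni_0"].astype(dtype_i)


def attach_to_new_set(records):
    """Create set i in an empty Container and assign records without a copy."""
    # Lazy import: the module path is an assumption and needs a GAMS installation
    import gams.transfer as gt

    m = gt.Container()
    i = gt.Set(m, "i")       # no records passed to the constructor
    i.records = records      # the setter only checks that this is a DataFrame
    return i, i.isValid()    # validity has to be checked explicitly
```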
Stepping through this example we take the following steps:
- Create an empty Container
- Create a GAMS set i in the Container, but do not set the records
- Create a Pandas DataFrame (manually, in this example) taking care to follow the standard format
- The DataFrame has the right shape and column labels so we can proceed to set the records.
- We need to cast the uni_0 column as a categorical data type, so we create a custom ordered category type using pandas.CategoricalDtype
- Finally, we set the records directly by passing a reference to df_i into the symbol records attribute. The setter function of .records checks that a DataFrame is being set, but does not check validity. Thus, as a final step we call the .isValid() method to verify that the symbol is valid.
- Attention
- Users can debug their DataFrames by running <symbol_name>.isValid(verbose=True) to get feedback about their data.
- Example #2 - Directly set records (1D subset)
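The categorical handling for a subset can be sketched with Pandas alone. The element labels are made up, and the column label `i_0` is an assumption about how the standard format names a domain column linked to set i. The last row deliberately uses a label that is not in the parent set to show how such a record surfaces as NaN.

```python
import pandas as pd

# Elements of the parent set i (hypothetical sample data)
i_elements = ["a", "b", "c", "d"]

# The categorical dtype for j references the categories of the parent set i
parent_dtype = pd.CategoricalDtype(categories=i_elements, ordered=True)

# Records for the subset j; "z" is not an element of i
df_j = pd.DataFrame(
    {"i_0": ["b", "d", "z"], "element_text": ["", "", ""]}
)
df_j["i_0"] = df_j["i_0"].astype(parent_dtype)

# Labels outside the parent categories become NaN after the cast
violations = df_j.index[df_j["i_0"].isna()]
```

The resulting `df_j` could then be assigned to the `records` attribute of a set j whose domain is i, followed by a call to `isValid()`.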
This example is more subtle in that we want to create a set j that is a subset of i. We create the set i using the setRecords method but then set the records directly for j. There are two important details to note: 1) the column labels in df_j now reflect the standard format for a symbol with a domain set (as opposed to the universe) and 2) we create the categorical dtype by referencing the parent set (i) for the categories (instead of referencing itself).
Create a Parameter
There are two different ways to create a GAMS parameter and add it to a Container.
- Use the Parameter constructor
- Use the Container method addParameter (which internally calls the Parameter constructor)
Parameter Constructor
Argument | Type | Description | Required | Default |
---|---|---|---|---|
container | Container | A reference to the Container object that the symbol is being added to | Yes | - |
name | str | Name of symbol | Yes | - |
domain | list | List of domains given either as string ('*' for universe set) or as reference to a Set object, an empty domain list will create a scalar parameter | No | [] |
records | many | Symbol records | No | None |
domain_forwarding | bool | Flag that forces set elements to be recursively included in all parent sets (i.e., implicit set growth) | No | False |
description | str | Description of symbol | No | "" |
Parameter Properties
Property | Description | Type | Special Setter Behavior |
---|---|---|---|
description | description of symbol | str | - |
dimension | dimension of symbol | int | setting is a shorthand notation to create ["*"] * n domains in symbol |
domain_forwarding | flag that forces set elements to be recursively included in all parent sets (i.e., implicit set growth) | bool | no effect after records have been set |
domain_labels | column headings for the records DataFrame | list of str | - |
domain_names | string version of domain names | list of str | - |
domain_type | none , relaxed or regular depending on state of domain links | str | - |
is_scalar | True if the len(self.domain) = 0 | bool | - |
name | name of symbol | str | sets the GAMS name of the symbol |
number_records | number of symbol records (i.e., returns len(self.records) if not None ) | int | - |
records | the main symbol records | pandas.DataFrame | responsive to domain_forwarding state |
ref_container | reference to the Container that the symbol belongs to | Container | - |
shape | a tuple describing the array dimensions if records were converted with .toDense() | tuple | - |
summary | output a dict of only the metadata | dict | - |
Parameter Methods
Method | Description | Arguments/Defaults | Returns |
---|---|---|---|
countEps | total number of SpecialValues.EPS in value column | - | int |
countNA | total number of SpecialValues.NA in value column | - | int |
countNegInf | total number of SpecialValues.NEGINF in value column | - | int |
countPosInf | total number of SpecialValues.POSINF in value column | - | int |
countUndef | total number of SpecialValues.UNDEF in value column | - | int |
findDomainViolations | get the index of records that contain any domain violations | - | pandas.Index |
findEps | find index positions of SpecialValues.EPS in value column | - | pandas.Index |
findNA | find index positions of SpecialValues.NA in value column | - | pandas.Index |
findNegInf | find index positions of SpecialValues.NEGINF in value column | - | pandas.Index |
findPosInf | find index positions of SpecialValues.POSINF in value column | - | pandas.Index |
findUndef | find index positions of SpecialValues.UNDEF in value column | - | pandas.Index |
getCardinality | get the full cartesian product of the domain | - | int |
getSparsity | get the sparsity of the symbol w.r.t the cardinality | - | float |
getMaxValue | get the maximum value in value column | - | float |
getMinValue | get the minimum value in value column | - | float |
getMeanValue | get the mean value in value column | - | float |
getMaxAbsValue | get the maximum absolute value in value column | - | float |
isValid | checks if the symbol is in a valid format, throw exceptions if verbose=True , recheck a symbol if force=True | verbose=False force=True | bool |
setRecords | main convenience method to set standard pandas.DataFrame records | records (many types) | None |
toDense | convert symbol to a dense numpy.array format | - | numpy.array |
toSparseCoo | convert symbol to a sparse COOrdinate numpy.array format | - | sparse matrix format |
whereMax | find the domain entry of records with a maximum value (return first instance only) | - | list of str |
whereMaxAbs | find the domain entry of records with a maximum absolute value (return first instance only) | - | list of str |
whereMin | find the domain entry of records with a minimum value (return first instance only) | - | list of str |
Adding Parameter Records
Three possibilities exist to assign symbol records to a parameter (roughly ordered in complexity):
- Setting the argument records in the parameter constructor/container method (internally calls setRecords) - creates a data copy
- Using the symbol method setRecords - creates a data copy
- Setting the property records directly - does not create a data copy
If the data is in a convenient format, a user may want to pass the records directly within the parameter constructor. This is an optional keyword argument and internally the parameter constructor will simply call the setRecords method. The symbol method setRecords is a convenience method that transforms the given data into an approved Pandas DataFrame format (see GAMS Transfer Standard Data Formats). Many native Python data types can be easily transformed into DataFrames, so the setRecords method for Parameter objects will accept a number of different types for input. The setRecords method is called internally on any data structure that is passed through the records argument. We show a few examples of ways to create differently structured parameters:
- Example #1 - Create a GAMS scalar
- Note
- GAMS Transfer will still convert scalar values to a standard format (i.e., a Pandas DataFrame with a single row and column).
- Example #2 - Create a 1D parameter (defined over *) from a list of tuples
- Example #3 - Create a 1D parameter (defined over a set) from a list of tuples
- Example #4 - Create a 2D parameter (defined over a set) from a DataFrame slice
- Note
- The original indexing is preserved when a user slices rows out of a reference dataframe.
- Example #5 - Create a 2D parameter (defined over a set) from a matrix
- Example #6 - Create a 2D parameter from an array using setRecords
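The note under Example #4 (slicing rows out of a reference DataFrame preserves the original indexing) can be checked with Pandas alone. The reference table is made-up sample data with hypothetical column names.

```python
import pandas as pd

# Hypothetical reference table holding distances for two plants
ref = pd.DataFrame(
    {
        "from": ["seattle"] * 3 + ["san-diego"] * 3,
        "to": ["new-york", "chicago", "topeka"] * 2,
        "thousand_miles": [2.5, 1.7, 1.8, 2.5, 1.8, 1.4],
    }
)

# Boolean slicing keeps the row labels of the reference DataFrame
san_diego = ref[ref["from"] == "san-diego"]
kept_index = list(san_diego.index)

# reset_index(drop=True) renumbers the rows from zero if that is preferred
renumbered_index = list(san_diego.reset_index(drop=True).index)
```

Here `kept_index` is `[3, 4, 5]` while `renumbered_index` is `[0, 1, 2]`.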
Directly Set Records
As with sets, the primary advantage of the setRecords method is that GAMS Transfer will convert many different (and convenient) data types into the standard data format (a Pandas DataFrame). Users that require higher performance will want to directly pass the Container a reference to a valid Pandas DataFrame, thereby skipping some of these computational steps. This places more burden on the user to pass the data in a valid standard form, but it speeds the records setting process and it avoids making a copy of the data in memory. In this section we walk the user through an example of how to set records directly.
- Example #1 - Correctly set records (directly)
In this example we create a large parameter (31,536,000 records and 8880 unique domain elements; we mimic data that is labeled for every second in one year) and assign it to a parameter with a.records. GAMS Transfer requires that all domain columns be a categorical data type; furthermore, this categorical must be ordered. The records setter function does very little work other than checking if the object being set is a DataFrame. This places more responsibility on the user to create a DataFrame that complies with the standard format. In Example #1 we take care to properly reference the categorical data types from the domain sets, and in the end a.isValid() = True.
Users will need to use the .isValid(verbose=True) method to debug any structural issues. As an example we incorrectly generate categorical data types by passing the DataFrame constructor the generic dtype="category" argument. This creates categorical column types but they are not ordered and they do not reference the underlying domain set. These errors result in a being invalid.
- Example #2 - Incorrectly set records (directly)
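The two defects described above (categoricals that are not ordered and that do not reference the domain set) can be reproduced with Pandas alone. This is a small stand-in with three made-up hour labels rather than the full-size parameter; the variable names are hypothetical.

```python
import pandas as pd

# Elements of a hypothetical domain set
hours = ["h0", "h1", "h2"]

# Generic categorical: categories are inferred from the data and unordered
generic = pd.Series(["h2", "h0"], dtype="category")

# Linked categorical: categories come from the domain set and are ordered
linked_dtype = pd.CategoricalDtype(categories=hours, ordered=True)
linked = pd.Series(["h2", "h0"]).astype(linked_dtype)

generic_is_ordered = generic.cat.ordered
generic_categories = list(generic.cat.categories)
linked_is_ordered = linked.cat.ordered
linked_categories = list(linked.cat.categories)
```

The generic column only knows the labels it happened to contain (`['h0', 'h2']`), while the linked column carries the full, ordered domain (`['h0', 'h1', 'h2']`).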
Create a Variable
There are two different ways to create a GAMS variable and add it to a Container.
- Use the Variable constructor
- Use the Container method addVariable (which internally calls the Variable constructor)
Variable Constructor
Argument | Type | Description | Required | Default |
---|---|---|---|---|
container | Container | A reference to the Container object that the symbol is being added to | Yes | - |
name | str | Name of symbol | Yes | - |
type | str | Type of variable being created [binary , integer , positive , negative , free , sos1 , sos2 , semicont , semiint ] | No | free |
domain | list | List of domains given either as string (* for universe set) or as reference to a Set object, an empty domain list will create a scalar variable | No | [] |
records | many | Symbol records | No | None |
domain_forwarding | bool | Flag that forces set elements to be recursively included in all parent sets (i.e., implicit set growth) | No | False |
description | str | Description of symbol | No | "" |
Variable Properties
Property | Description | Type | Special Setter Behavior |
---|---|---|---|
description | description of symbol | str | - |
dimension | dimension of symbol | int | setting is a shorthand notation to create ["*"] * n domains in symbol |
domain_forwarding | flag that forces set elements to be recursively included in all parent sets (i.e., implicit set growth) | bool | no effect after records have been set |
domain_labels | column headings for the records DataFrame | list of str | - |
domain_names | string version of domain names | list of str | - |
domain_type | none , relaxed or regular depending on state of domain links | str | - |
name | name of symbol | str | sets the GAMS name of the symbol |
number_records | number of symbol records (i.e., returns len(self.records) if not None ) | int | - |
records | the main symbol records | pandas.DataFrame | responsive to domain_forwarding state |
ref_container | reference to the Container that the symbol belongs to | Container | - |
shape | a tuple describing the array dimensions if records were converted with .toDense() | tuple | - |
summary | output a dict of only the metadata | dict | - |
type | type of variable | str | - |
Variable Methods
Method | Description | Arguments/Defaults | Returns |
---|---|---|---|
countEps | total number of SpecialValues.EPS across all columns | columns="level" (str ,list ) | int |
countNA | total number of SpecialValues.NA across all columns | columns="level" (str ,list ) | int |
countNegInf | total number of SpecialValues.NEGINF across all columns | columns="level" (str ,list ) | int |
countPosInf | total number of SpecialValues.POSINF across all columns | columns="level" (str ,list ) | int |
countUndef | total number of SpecialValues.UNDEF across all columns | columns="level" (str ,list ) | int |
findDomainViolations | get the index of records that contain any domain violations | - | pandas.Index |
findEps | find index positions of SpecialValues.EPS in column | column="level" (str ) | pandas.Index |
findNA | find index positions of SpecialValues.NA in column | column="level" (str ) | pandas.Index |
findNegInf | find index positions of SpecialValues.NEGINF in column | column="level" (str ) | pandas.Index |
findPosInf | find index positions of SpecialValues.POSINF in column | column="level" (str ) | pandas.Index |
findUndef | find index positions of SpecialValues.UNDEF in column | column="level" (str ) | pandas.Index |
getCardinality | get the full cartesian product of the domain | - | int |
getSparsity | get the sparsity of the symbol w.r.t the cardinality | - | float |
getMaxValue | get the maximum value across all columns | columns="level" (str ,list ) | float |
getMinValue | get the minimum value across all columns | columns="level" (str ,list ) | float |
getMeanValue | get the mean value across all columns | columns="level" (str ,list ) | float |
getMaxAbsValue | get the maximum absolute value across all columns | columns="level" (str ,list ) | float |
isValid | checks if the symbol is in a valid format, throw exceptions if verbose=True , recheck a symbol if force=True | verbose=False force=True | bool |
setRecords | main convenience method to set standard pandas.DataFrame records | records (many types) | None |
toDense | convert column to a dense numpy.array format | column="level" (str ) | numpy.array |
toSparseCoo | convert column to a sparse scipy COOrdinate format | column="level" (str ) | sparse matrix format |
whereMax | find the domain entry of records with a maximum value (return first instance only) | column="level" (str ) | list of str |
whereMaxAbs | find the domain entry of records with a maximum absolute value (return first instance only) | column="level" (str ) | list of str |
whereMin | find the domain entry of records with a minimum value (return first instance only) | column="level" (str ) | list of str |
Adding Variable Records
Three possibilities exist to assign symbol records to a variable (roughly ordered in complexity):
- Setting the argument records in the variable constructor/container method (internally calls setRecords) - creates a data copy
- Using the symbol method setRecords - creates a data copy
- Setting the property records directly - does not create a data copy
If the data is in a convenient format, a user may want to pass the records directly within the variable constructor. This is an optional keyword argument and internally the variable constructor will simply call the setRecords method. In contrast to the setRecords methods in either the Set or Parameter classes, the setRecords method for variables will only accept Pandas DataFrames and specially structured dict objects for creating records from matrices. This restriction is necessary because, to properly set a record for a Variable, the user must pass data for the level, marginal, lower, upper and scale attributes. That said, any missing attributes will be filled in with the GAMS default record values (see: Variable Types); the default scale value is always 1, and the default level and marginal values are 0 for all variable types. We show a few examples of ways to create differently structured variables:
- Example #1 - Create a GAMS scalar variable
- Example #2 - Create a 1D variable (defined over *) from a list of tuples
In this example we only set the marginal values.
- Example #3 - Create a 1D variable (defined over a set) from a list of tuples
- Example #4 - Create a 2D positive variable, specifying no numerical data
- Example #5 - Create a 2D variable (defined over a set) from a matrix
Directly Set Records
As with sets, the primary advantage of the setRecords method is that GAMS Transfer will convert many different (and convenient) data types into the standard data format (a Pandas DataFrame). Users that require higher performance will want to directly pass the Container a reference to a valid Pandas DataFrame, thereby skipping some of these computational steps. This places more burden on the user to pass the data in a valid standard form, but it speeds the records setting process and it avoids making a copy of the data in memory. In this section we walk the user through an example of how to set records directly.
- Example #1 - Correctly set records (directly)
- Attention
- All numeric data in the records will need to be type float in order to maintain a valid symbol.
In this example we create a large variable (31,536,000 records and 8880 unique domain elements – we mimic data that is labeled for every second in one year) and assign it to a variable with a.records. GAMS Transfer requires that all domain columns be of a categorical data type; furthermore, this categorical must be ordered. The records setter function does very little work other than checking if the object being set is a DataFrame. This places more responsibility on the user to create a DataFrame that complies with the standard format. In Example #1 we take care to properly reference the categorical data types from the domain sets – and in the end a.isValid() = True. As with Sets and Parameters, users can use the .isValid(verbose=True) method to debug any structural issues.
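As a rough illustration of the requirements described above (ordered categorical domain columns, float data columns), the following pandas-only sketch builds a small DataFrame of that shape. The domain labels, column names and attribute values are made up for illustration and are not taken from the GAMS Transfer standard format; assigning the result to a symbol's records property is not shown.

```python
import pandas as pd

# Hypothetical domain elements; in practice these would come from the domain sets.
hours = [f"h{i}" for i in range(1, 4)]
minutes = [f"m{i}" for i in range(1, 3)]

# One ordered categorical dtype per domain column.
hour_dtype = pd.CategoricalDtype(categories=hours, ordered=True)
minute_dtype = pd.CategoricalDtype(categories=minutes, ordered=True)

# One row per (hour, minute) pair; column names are illustrative only.
df = pd.DataFrame([(h, m) for h in hours for m in minutes], columns=["h", "m"])
df["h"] = df["h"].astype(hour_dtype)
df["m"] = df["m"].astype(minute_dtype)

# All attribute columns are stored as float (placeholder values).
placeholders = {
    "level": 0.0,
    "marginal": 0.0,
    "lower": float("-inf"),
    "upper": float("inf"),
    "scale": 1.0,
}
for col, value in placeholders.items():
    df[col] = float(value)
```

Building the categorical dtypes once and reusing them for every symbol defined over the same domain keeps the category order consistent across symbols.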
Create an Equation
There are two different ways to create a GAMS equation and add it to a Container.
- Use Equation constructor
- Use the Container method addEquation (which internally calls the Equation constructor)
Equation Constructor
- Equation constructor
Argument | Type | Description | Required | Default |
---|---|---|---|---|
container | Container | A reference to the Container object that the symbol is being added to | Yes | - |
name | str | Name of symbol | Yes | - |
type | str | Type of equation being created [eq (or E /e ), geq (or G /g ), leq (or L /l ), nonbinding (or N /n ), external (or X /x )] | Yes | - |
domain | list | List of domains given either as string (* for universe set) or as reference to a Set/Alias object, an empty domain list will create a scalar equation | No | [] |
records | many | Symbol records | No | None |
domain_forwarding | bool | Flag that forces set elements to be recursively included in all parent sets (i.e., implicit set growth) | No | False |
description | str | Description of symbol | No | "" |
Equation Properties
Property | Description | Type | Special Setter Behavior |
---|---|---|---|
description | description of symbol | str | - |
dimension | dimension of symbol | int | setting is a shorthand notation to create ["*"] * n domains in symbol |
domain_forwarding | flag that forces set elements to be recursively included in all parent sets (i.e., implicit set growth) | bool | no effect after records have been set |
domain_labels | column headings for the records DataFrame | list of str | - |
domain_names | string version of domain names | list of str | - |
domain_type | none , relaxed or regular depending on state of domain links | str | - |
name | name of symbol | str | sets the GAMS name of the symbol |
number_records | number of symbol records (i.e., returns len(self.records) if not None ) | int | - |
records | the main symbol records | pandas.DataFrame | responsive to domain_forwarding state |
ref_container | reference to the Container that the symbol belongs to | Container | - |
shape | a tuple describing the array dimensions if records were converted with .toDense() | tuple | - |
summary | output a dict of only the metadata | dict | - |
type | str type of equation | str | - |
Equation Methods
Method | Description | Arguments/Defaults | Returns |
---|---|---|---|
countEps | total number of SpecialValues.EPS across all columns | columns="level" (str ,list ) | int |
countNA | total number of SpecialValues.NA across all columns | columns="level" (str ,list ) | int |
countNegInf | total number of SpecialValues.NEGINF across all columns | columns="level" (str ,list ) | int |
countPosInf | total number of SpecialValues.POSINF across all columns | columns="level" (str ,list ) | int |
countUndef | total number of SpecialValues.UNDEF across all columns | columns="level" (str ,list ) | int |
findDomainViolations | get the index of records that contain any domain violations | - | pandas.Index |
findEps | find index positions of SpecialValues.EPS in column | column="level" (str ) | pandas.Index |
findNA | find index positions of SpecialValues.NA in column | column="level" (str ) | pandas.Index |
findNegInf | find index positions of SpecialValues.NEGINF in column | column="level" (str ) | pandas.Index |
findPosInf | find index positions of SpecialValues.POSINF in column | column="level" (str ) | pandas.Index |
findUndef | find index positions of SpecialValues.UNDEF in column | column="level" (str ) | pandas.Index |
getCardinality | get the full cartesian product of the domain | - | int |
getSparsity | get the sparsity of the symbol w.r.t the cardinality | - | float |
getMaxValue | get the maximum value across all columns | columns="level" (str ,list ) | float |
getMinValue | get the minimum value across all columns | columns="level" (str ,list ) | float |
getMeanValue | get the mean value across all columns | columns="level" (str ,list ) | float |
getMaxAbsValue | get the maximum absolute value across all columns | columns="level" (str ,list ) | float |
isValid | checks if the symbol is in a valid format, throw exceptions if verbose=True , recheck a symbol if force=True | verbose=False force=True | bool |
setRecords | main convenience method to set standard pandas.DataFrame records | records (many types) | None |
toDense | convert column to a dense numpy.array format | column="level" (str ) | numpy.array |
toSparseCoo | convert column to a sparse scipy COOrdinate format | column="level" (str ) | sparse matrix format |
whereMax | find the domain entry of records with a maximum value (return first instance only) | column="level" (str ) | list of str |
whereMaxAbs | find the domain entry of records with a maximum absolute value (return first instance only) | column="level" (str ) | list of str |
whereMin | find the domain entry of records with a minimum value (return first instance only) | column="level" (str ) | list of str |
Adding Equation Records
Adding equation records mimics that of variables – three possibilities exist to assign symbol records to an equation (roughly ordered in complexity):
- Setting the argument records in the equation constructor/container method (internally calls setRecords) - creates a data copy
- Using the symbol method setRecords - creates a data copy
- Setting the property records directly - does not create a data copy
Setting equation records requires the user to be explicit about the type of equation that is being created, in contrast to setting variable records (where the default variable is considered to be free).
If the data is in a convenient format, a user may want to pass the records directly within the equation constructor. This is an optional keyword argument and internally the equation constructor will simply call the setRecords method. In contrast to the setRecords methods in either the Set or Parameter classes, the setRecords method for equations will only accept Pandas DataFrames and specially structured dict objects for creating records from matrices. This restriction is necessary because, to properly set a record for an Equation, the user must pass data for the level, marginal, lower, upper and scale attributes. That said, any missing attributes will be filled in with the GAMS default record values (level = 0.0, marginal = 0.0, lower = -inf, upper = inf, scale = 1.0). We show a few examples of ways to create differently structured equations:
- Example #1 - Create a GAMS scalar equation
- Example #2 - Create a 1D Equation (defined over *) from a list of tuples
In this example we only set the marginal values.
- Example #3 - Create a 1D Equation (defined over a set) from a list of tuples
- Example #4 - Create a 2D equation, specifying no numerical data
- Example #5 - Create a 2D equation (defined over a set) from a matrix
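The default-filling behavior described above can be pictured with a small pandas-only sketch. The default values are the ones stated in the text; the helper name fill_missing_attributes and the column layout are hypothetical and are not part of GAMS Transfer.

```python
import pandas as pd

# Default equation attribute values as stated in the text above.
EQUATION_DEFAULTS = {
    "level": 0.0,
    "marginal": 0.0,
    "lower": float("-inf"),
    "upper": float("inf"),
    "scale": 1.0,
}


def fill_missing_attributes(records: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a copy with every attribute column present as float."""
    out = records.copy()
    domain_cols = [c for c in out.columns if c not in EQUATION_DEFAULTS]
    for name, default in EQUATION_DEFAULTS.items():
        if name not in out.columns:
            out[name] = default
        out[name] = out[name].astype(float)
    # Domain columns first, then attributes in a fixed order.
    return out[domain_cols + list(EQUATION_DEFAULTS)]


# Only the marginal values are supplied; everything else falls back to the defaults.
partial = pd.DataFrame({"i": ["a", "b"], "marginal": [1, 2]})
full = fill_missing_attributes(partial)
```

The input DataFrame is copied rather than modified, which mirrors the note that setRecords creates a data copy.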
Directly Set Records
As with sets, parameters, and variables, the primary advantage of the setRecords method is that GAMS Transfer will convert many different (and convenient) data types into the standard data format (a Pandas DataFrame). Users that require higher performance will want to directly pass the Container a reference to a valid Pandas DataFrame, thereby skipping some of these computational steps. This places more burden on the user to pass the data in a valid standard form, but it speeds the records setting process and it avoids making a copy of the data in memory. In this section we walk the user through an example of how to set records directly.
- Example #1 - Correctly set records (directly)
- Attention
- All numeric data in the records will need to be type float in order to maintain a valid symbol.
In this example we create a large equation (31,536,000 records and 8880 unique domain elements) and assign it to the equation with a.records. GAMS Transfer requires that all domain columns be of a categorical data type; furthermore, this categorical must be ordered. The records setter function does very little work other than checking if the object being set is a DataFrame. This places more responsibility on the user to create a DataFrame that complies with the standard format. In Example #1 we take care to properly reference the categorical data types from the domain sets – and in the end a.isValid() = True. As with Sets and Parameters, users can use the .isValid(verbose=True) method to debug any structural issues.
Create an Alias
There are two different ways to create a GAMS alias and add it to a Container.
- Use Alias constructor
- Use the Container method addAlias (which internally calls the Alias constructor)
Alias Constructor
- Alias constructor
Argument | Type | Description | Required | Default |
---|---|---|---|---|
container | Container | A reference to the Container object that the symbol is being added to | Yes | - |
name | str | Name of symbol | Yes | - |
alias_with | Set object | set object from which to create an alias | Yes | - |
- Example - Creating an alias from a set
GAMS Transfer only stores the reference to the parent set as part of the alias structure – most properties that are called from an alias object simply point to the properties of the parent set (with the exception of ref_container, name, and alias_with). It is possible to create an alias from another alias object. In this case a recursive search will be performed to find the root parent set – this is the set that will ultimately be stored as the alias_with property. We can see this behavior in the following example:
Alias Properties
Property | Description | Type | Special Setter Behavior |
---|---|---|---|
alias_with | aliased object | Set | - |
description | description of symbol | str | - |
dimension | dimension of symbol | int | setting is a shorthand notation to create ["*"] * n domains in symbol |
domain_forwarding | flag that forces set elements to be recursively included in all parent sets (i.e., implicit set growth) | bool | no effect after records have been set |
domain_labels | column headings for the records DataFrame | list of str | - |
domain_names | string version of domain names | list of str | - |
domain_type | none , relaxed or regular depending on state of domain links | str | - |
is_singleton | if symbol is a singleton set | bool | - |
name | name of symbol | str | sets the GAMS name of the symbol |
number_records | number of symbol records (i.e., returns len(self.records) if not None ) | int | - |
records | the main symbol records | pandas.DataFrame | responsive to domain_forwarding state |
ref_container | reference to the Container that the symbol belongs to | Container | - |
summary | output a dict of only the metadata | dict | - |
Alias Methods
Method | Description | Arguments/Defaults | Returns |
---|---|---|---|
getCardinality | get the full cartesian product of the domain | - | int |
getSparsity | get the sparsity of the symbol w.r.t the cardinality | - | float |
isValid | checks if the symbol is in a valid format, throw exceptions if verbose=True , recheck a symbol if force=True | verbose=False force=True | bool |
setRecords | main convenience method to set standard pandas.DataFrame formatted records | records (many types) | None |
Adding Alias Records
The linked structure of Aliases offers some unique opportunities to access some of the setter functionality of the parent set. Specifically, GAMS Transfer allows the user to change the domain, description, dimension, and records of the underlying parent set as a shorthand notation. We can see this behavior if we look at a modified Example #1 from Adding Set Records.
- Example - Creating set records through an alias link
- Note
- An alias is valid (.isValid()=True) when the underlying parent set is also valid – if the parent set is removed from the Container the alias will no longer be valid.
Validating Data
GAMS Transfer requires that the records for all symbols exist in a standard format (GAMS Transfer Standard Data Formats) in order for them to be understood and written successfully. It is certainly possible that the data could end up in a state that is inconsistent with the standard format (especially if setting symbol attributes directly). GAMS Transfer includes the .isValid() method in order to determine if a symbol is valid and ready for writing; this method returns a bool. For example, we create two valid sets and then check them with .isValid() to be sure.
- Note
- It is possible to run .isValid() on both the Container as well as the symbol object – for the Container, .isValid() also returns a bool that indicates whether there are any invalid symbols in the Container object.
- Example (valid data)
Now we create some data that is invalid due to domain violations in the set j.
- Example (intentionally create domain violations)
In this example, we know that the validity of the data is compromised by the domain violations, but there could be other subtle discrepancies that must be remedied before writing data. The user can get more detailed error reporting if the verbose argument is set to True. For example:
The .isValid() method checks:
- If the symbol belongs to a Container
- If all domain set symbols exist in the Container
- If all domain set symbols objects are valid
- If records are a DataFrame (or None)
- The shape of the records is congruent with the dimensionality of the symbol
- If records column headings are in standard format
- If all domain columns are type category and also ordered
- If all domain categorical dtypes are referenced properly (.records for referenced domain sets cannot be None in order to create categoricals properly)
- If there are any domain violations
- If there are any duplicate domain members
- That all data columns are type float
- To make sure that all domain categories are type str
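A few of the record-level checks listed above can be approximated with plain pandas. The sketch below is only an illustration under stated assumptions: check_records is a hypothetical helper, it assumes the first `dimension` columns are the domain columns, and it does not reproduce the actual .isValid() implementation or its Container-level checks.

```python
import pandas as pd


def check_records(records, dimension):
    """Hypothetical helper: return a list of problems found in `records`."""
    if records is None:
        return []  # None is an allowed state for records
    if not isinstance(records, pd.DataFrame):
        return ["records must be a DataFrame (or None)"]

    problems = []
    domain_cols = list(records.columns[:dimension])
    data_cols = list(records.columns[dimension:])

    for col in domain_cols:
        dtype = records[col].dtype
        if not isinstance(dtype, pd.CategoricalDtype) or not dtype.ordered:
            problems.append(f"domain column {col!r} is not an ordered categorical")
        elif not all(isinstance(c, str) for c in dtype.categories):
            problems.append(f"domain column {col!r} has non-str categories")

    if domain_cols and records.duplicated(subset=domain_cols).any():
        problems.append("duplicate domain members")

    for col in data_cols:
        if not pd.api.types.is_float_dtype(records[col]):
            problems.append(f"data column {col!r} is not float")

    return problems


good = pd.DataFrame({
    "i": pd.Categorical(["a", "b"], categories=["a", "b"], ordered=True),
    "value": [1.0, 2.0],
})
bad = pd.DataFrame({"i": ["a", "a"], "value": [1, 2]})

good_problems = check_records(good, dimension=1)
bad_problems = check_records(bad, dimension=1)
```

Collecting the problems in a list instead of raising on the first one is a design choice made here for readability of the example.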
Domain Forwarding
GAMS includes the ability to define sets directly from data using the implicit set notation (see: Implicit Set Definition (or: Domain Defining Symbol Declarations)). This notation has an analogue in GAMS Transfer called domain_forwarding.
- Note
- It is possible to recursively update a subset tree in GAMS Transfer.
Domain forwarding is available as an argument to all symbol object constructors; the user would simply need to pass domain_forwarding=True.
In this example we have raw data that is in the dist DataFrame and we want to send the domain information into the i and j sets – we take care to pass the set objects as the domain for parameter c.
- Note
- The element order in the sets i and j mirrors that in the raw data.
In this example we show that domain forwarding will also work recursively to update the entire set lineage – the domain forwarding occurs at the creation of every symbol object. The correct order of elements in set i is [z, a, b, c] because the records from j are forwarded first, and then the records from k are propagated through (back to i).
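The element ordering produced by domain forwarding can be pictured as collecting labels in order of first appearance. The sketch below is plain Python with made-up data; it is not how GAMS Transfer implements the feature, only an illustration of why the forwarded sets mirror the order of the raw data.

```python
def unique_in_order(labels):
    """Return the distinct labels, keeping the order of first appearance."""
    return list(dict.fromkeys(labels))


# Hypothetical raw records: (origin, destination, distance)
raw = [
    ("seattle", "new-york", 2.5),
    ("seattle", "chicago", 1.7),
    ("san-diego", "new-york", 2.5),
    ("san-diego", "chicago", 1.8),
]

# Elements that would end up in the two domain sets.
i_elements = unique_in_order(row[0] for row in raw)
j_elements = unique_in_order(row[1] for row in raw)
```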
Describing Data
The methods describeSets, describeParameters, describeVariables, describeEquations, and describeAliases allow the user to get a summary view of key data statistics. The returned DataFrame aggregates the output for a number of other methods (depending on symbol type). A description of each Container method is provided in the following subsections:
describeSets
Argument | Type | Description | Required | Default |
---|---|---|---|---|
symbols | list , str , NoneType | A list of sets in the Container to include in the output. describeSets will include aliases if they are explicitly passed by the user. | No | None (if None specified, will assume all sets – not aliases) |
Returns: pandas.DataFrame
The following table includes a short description of the column headings in the return.
Property / Statistic | Description |
---|---|
name | name of the symbol |
is_singleton | bool if the set/alias is a singleton set (or an alias of a singleton set) |
alias_with | [OPTIONAL if the user passes an alias name as part of symbols ] name of the parent set (for alias only), None otherwise |
domain | domain labels for the symbol |
domain_type | none , relaxed or regular depending on the symbol state |
dim | dimension |
num_recs | number of records in the symbol |
cardinality | cartesian product of the domain information |
sparsity | 1 - num_recs/cardinality |
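The cardinality and sparsity columns follow directly from the formulas in the table. A minimal sketch in plain Python, assuming the sizes of the domain sets are known (the helper names are hypothetical and not part of GAMS Transfer, and returning None for an empty domain is an assumption made here):

```python
import math


def cardinality(domain_sizes):
    """Size of the cartesian product of the domain sets."""
    return math.prod(domain_sizes)


def sparsity(num_recs, domain_sizes):
    """1 - num_recs/cardinality; None when the cardinality is zero (assumption)."""
    card = cardinality(domain_sizes)
    if card == 0:
        return None
    return 1 - num_recs / card


# A 2D symbol over domains of size 2 and 5 that holds 5 records.
card = cardinality([2, 5])
spars = sparsity(5, [2, 5])
```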
- Example #1
- Example #2 – with aliases
describeParameters
Argument | Type | Description | Required | Default |
---|---|---|---|---|
symbols | list , str , NoneType | A list of parameters in the Container to include in the output | No | None (if None specified, will assume all parameters) |
Returns: pandas.DataFrame
The following table includes a short description of the column headings in the return.
Property / Statistic | Description |
---|---|
name | name of the symbol |
is_scalar | bool if the symbol is a scalar (i.e., dimension = 0) |
domain | domain labels for the symbol |
domain_type | none , relaxed or regular depending on the symbol state |
dim | dimension |
num_recs | number of records in the symbol |
min_value | min value in data |
mean_value | mean value in data |
max_value | max value in data |
where_min | domain of min value (if multiple, returns only first occurrence) |
where_max | domain of max value (if multiple, returns only first occurrence) |
count_eps | number of SpecialValues.EPS in data |
count_na | number of SpecialValues.NA in data |
count_undef | number of SpecialValues.UNDEF in data |
cardinality | cartesian product of the domain information |
sparsity | 1 - num_recs/cardinality |
- Example
describeVariables
Argument | Type | Description | Required | Default |
---|---|---|---|---|
symbols | list , str , NoneType | A list of variables in the Container to include in the output | No | None (if None specified, will assume all variables) |
Returns: pandas.DataFrame
The following table includes a short description of the column headings in the return.
Property / Statistic | Description |
---|---|
name | name of the symbol |
type | type of variable (i.e., binary , integer , positive , negative , free , sos1 , sos2 , semicont , semiint ) |
domain | domain labels for the symbol |
domain_type | none , relaxed or regular depending on the symbol state |
dim | dimension |
num_recs | number of records in the symbol |
cardinality | cartesian product of the domain information |
sparsity | 1 - num_recs/cardinality |
min_level | min value in the level |
mean_level | mean value in the level |
max_level | max value in the level |
where_max_abs_level | domain of max(abs(level )) in data |
count_eps_level | number of SpecialValues.EPS in level |
min_marginal | min value in the marginal |
mean_marginal | mean value in the marginal |
max_marginal | max value in the marginal |
where_max_abs_marginal | domain of max(abs(marginal )) in data |
count_eps_marginal | number of SpecialValues.EPS in marginal |
- Example
describeEquations
Argument | Type | Description | Required | Default |
---|---|---|---|---|
symbols | list , str , NoneType | A list of equations in the Container to include in the output | No | None (if None specified, will assume all equations) |
Returns: pandas.DataFrame
The following table includes a short description of the column headings in the return.
Property / Statistic | Description |
---|---|
name | name of the symbol |
type | type of equation (i.e., eq , geq , leq , nonbinding , external ) |
domain | domain labels for the symbol |
domain_type | none , relaxed or regular depending on the symbol state |
dim | dimension |
num_recs | number of records in the symbol |
cardinality | cartesian product of the domain information |
sparsity | 1 - num_recs/cardinality |
min_level | min value in the level |
mean_level | mean value in the level |
max_level | max value in the level |
where_max_abs_level | domain of max(abs(level )) in data |
count_eps_level | number of SpecialValues.EPS in level |
min_marginal | min value in the marginal |
mean_marginal | mean value in the marginal |
max_marginal | max value in the marginal |
where_max_abs_marginal | domain of max(abs(marginal )) in data |
count_eps_marginal | number of SpecialValues.EPS in marginal |
- Example
describeAliases
Argument | Type | Description | Required | Default |
---|---|---|---|---|
symbols | list , str , NoneType | A list of alias (only) symbols in the Container to include in the output | No | None (if None specified, will assume all aliases – not sets) |
Returns: pandas.DataFrame
The following table includes a short description of the column headings in the return. All data is referenced from the parent set that the alias is created from.
Property / Statistic | Description |
---|---|
name | name of the symbol |
is_singleton | bool if the set/alias is a singleton set (or an alias of a singleton set) |
alias_with | name of the parent set (for alias only), None otherwise |
domain | domain labels for the symbol |
domain_type | none , relaxed or regular depending on the symbol state |
dim | dimension |
num_recs | number of records in the symbol |
cardinality | cartesian product of the domain information |
sparsity | 1 - num_recs/cardinality |
- Example
Matrix Generation
GAMS Transfer stores data in a "flat" format, that is, one record entry per DataFrame row. However, it is often necessary to convert this data format into a matrix format – GAMS Transfer enables users to do this with relative ease using the toDense and the toSparseCoo symbol methods. The toDense method will return a dense N-dimensional numpy array with each dimension corresponding to the GAMS symbol dimension; it is possible to output an array up to 20 dimensions (a GAMS limit). The toSparseCoo method will return the data in a sparse scipy COOrdinate format, which can then be efficiently converted into other sparse matrix formats.
- Attention
- Neither the toDense nor the toSparseCoo method transforms the underlying DataFrame in any way; they only return the transformed data.
- Note
- toSparseCoo will only convert 2-dimensional data to the scipy COOrdinate format. A user interested in sparse data for an N-dimensional symbol will need to decide how to reshape the dense array in order to generate the 2D sparse format.
- Attention
- In order to use the toSparseCoo method the user will need to install the scipy package. Scipy is not provided with GMSPython.
Both the toDense and toSparseCoo methods leverage the indexing that comes along with using categorical data types to store domain information. This means that linking symbols together (by passing symbol objects as domain information) impacts the size of the matrix. This is best demonstrated by a few examples.
- Example (1D data w/o domain linking (i.e., a relaxed domain))
Note that the parameter a is not linked to another symbol, so when converting to a matrix, the indexing is referenced to the data structure in a.records. Defining a sparse parameter a over a set i allows us to extract information from the i domain and construct a very different dense matrix, as the following example shows:
- Example (1D data w/ domain linking (i.e., a regular domain))
- Example (2D data w/ domain linking)
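To make the role of the categorical indexing concrete, the following sketch converts flat 2D records into a dense matrix using only pandas and numpy. It is an illustration of the idea, not the toDense implementation; the column names and data are made up. The matrix shape comes from the number of categories in each domain column, which is why a domain linked to a larger set yields a larger matrix.

```python
import numpy as np
import pandas as pd

# Ordered categoricals standing in for the linked domain sets (hypothetical data).
i_dtype = pd.CategoricalDtype(["i1", "i2", "i3"], ordered=True)
j_dtype = pd.CategoricalDtype(["j1", "j2"], ordered=True)

# Flat records: one row per nonzero entry.
records = pd.DataFrame({
    "i": pd.Categorical(["i1", "i3"], dtype=i_dtype),
    "j": pd.Categorical(["j2", "j1"], dtype=j_dtype),
    "value": [1.5, 2.5],
})

# The shape is taken from the categories, not from the records that happen to exist.
shape = tuple(len(records[c].cat.categories) for c in ["i", "j"])

# The category codes act as row/column positions in the dense array.
rows = records["i"].cat.codes.to_numpy()
cols = records["j"].cat.codes.to_numpy()
dense = np.zeros(shape)
dense[rows, cols] = records["value"].to_numpy()
```

The same rows, cols and value arrays could be handed to scipy.sparse.coo_matrix((values, (rows, cols)), shape=shape) to obtain a sparse representation, provided scipy is installed.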
The Universe Set
A Unique Element List (UEL) (aka the "universe" or "universe set") is an (i,s) pair where i is an identification number for a string s. GAMS uses UELs to efficiently store domain entries of a record by storing the UEL ID i of a domain entry instead of the actual string s. This avoids storing the same string multiple times. The concept of UELs also exists in Python/Pandas and is called a "categorical array". GAMS Transfer leverages these types in order to efficiently store strings and enable domain checking within the Python environment.
Each domain column in a DataFrame can be assigned a unique categorical type; the effect is that each symbol maintains its own list of UELs per dimension. It is possible to convert a categorical column to its ID number representation by using the categorical accessor x.records[<domain_column_label>].cat.codes; this type of data manipulation is not necessary within GAMS Transfer, but could be handy when debugging data.
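The relationship between labels and their ID numbers can be inspected with plain pandas. The sketch below uses a standalone Series with made-up labels instead of a symbol's records; the categories are listed in order of first appearance in the data.

```python
import pandas as pd

labels = ["b", "a", "b", "c"]

# Categories in order of first appearance, stored as an ordered categorical.
categories = list(dict.fromkeys(labels))
col = pd.Series(pd.Categorical(labels, categories=categories, ordered=True))

# Integer ID per record and the list that maps IDs back to strings.
codes = col.cat.codes.tolist()
uels = list(col.cat.categories)

# Mapping the codes back through the category list recovers the labels.
recovered = [uels[c] for c in codes]
```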
Pandas offers the possibility to create categorical column types that are ordered or not; GAMS Transfer relies exclusively on ordered categorical data types (in order for a symbol to be valid it must have only ordered categories). By using ordered categories, GAMS Transfer will order the UEL such that elements appear in the order in which they appeared in the data (which is how GAMS defines the UEL). GAMS Transfer allows the user to reorder the UELs with the uel_priority argument in the .write() method.
GAMS Transfer does not actually keep track of the UEL separately from other symbols in the Container; it is created internally by the .write() method and is based on the order in which data is added to the container. The user can access the current state of the UEL with the .getUniverseSet() container method. For example, we set a two dimensional set:
Pandas also includes a number of methods that allow categories to be renamed, appended, etc. These methods may be useful for advanced users, but most users will probably find that modifying the original data structures and resetting the symbol records provides a simpler solution. The design of GAMS Transfer should enable the user to quickly move data back and forth, without worrying about the deeper mechanics of categorical data.
Reordering Symbols
Writing the contents of a Container requires the symbols to be sorted such that, for example, a Set used as the domain of another symbol appears before that symbol. The Container will try to establish a valid ordering when writing the data. An invalid ordering could be encountered if the user is adding and removing many symbols (and perhaps rewriting symbols with the same name) – users should attempt to only add symbols to a Container once, and care must be taken when creating symbol names. The method reorderSymbols attempts to fix symbol ordering problems. The following example shows how this can occur:
- Example Symbol reordering
The symbols are now out of order in .data and must be reordered:
Rename Symbols
It is possible to rename a symbol even after it has been added to a Container. There are two methods that can be used to achieve the desired outcome:
- using the container method renameSymbol
- directly changing the name symbol property
We create a Container with two sets:
- Example #1 - Change the name of a symbol with the container method
- Example #2 - Change the name of a symbol with the .name attribute
- Note
- Note that the renamed symbols maintain the original symbol order; this will prevent unnecessary reordering operations later in the workflow.
Removing Symbols
Removing symbols from a container is easy when using the removeSymbols container method; this method accepts either a str or a list of str.
- Attention
- Once a symbol has been removed, it is possible to have hanging references as domain links in other symbols. The user will need to repair these other symbols with the proper domain links in order to avoid validity errors.
Full Example
It is possible to use everything we now know about GAMS Transfer to recreate the `trnsport.gms` results in GDX form. As part of this example we also introduce the write method (and generate new.gdx). We will discuss it in more detail in the following section: Data Exchange with GDX.
Advanced Examples
With the wide range of I/O tools included in Pandas it is possible to easily draw data down from many kinds of data sources. We provide two examples here that pull data into GAMS Transfer from an HTML source and a POSTGRES SQL source.
- Example #1 - Create symbols from HTML
- Note
- Users can chain Pandas operations together and pass those operations through to the records argument or the setRecords method.
- Example #2 - Create symbols from a POSTGRES SQL Database (sqlalchemy)
GAMS Special Values
The GAMS system contains five special values: UNDEF
(undefined), NA
(not available), EPS
(epsilon), +INF
(positive infinity), -INF
(negative infinity). These special values must be mapped to their Python equivalents. GAMS Transfer uses the following convention to generate the 1:1 mapping:
- +INF is mapped to float("inf")
- -INF is mapped to float("-inf")
- EPS is mapped to -0.0 (mathematically identical to zero)
- NA is mapped to a special NaN
- UNDEF is mapped to float("nan")
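The float-based mapping has a few consequences that are easy to check in plain Python. The sketch below does not use GAMS Transfer at all; it only shows how the underlying Python values behave, and all variable names are illustrative. NA is left out because the text only describes it as a special NaN.

```python
import math

pos_inf = float("inf")    # Python value used for +INF
neg_inf = float("-inf")   # Python value used for -INF
eps = -0.0                # Python value used for EPS
undef = float("nan")      # Python value used for UNDEF

# -0.0 compares equal to 0.0, so only the sign bit separates EPS from a true zero.
eps_equals_zero = (eps == 0.0)
eps_has_sign_bit = math.copysign(1.0, eps) < 0
zero_has_sign_bit = math.copysign(1.0, 0.0) < 0

# NaN never compares equal to anything, itself included, so use math.isnan.
undef_equals_itself = (undef == undef)
undef_is_nan = math.isnan(undef)

print(eps_equals_zero, eps_has_sign_bit, zero_has_sign_bit)
print(undef_equals_itself, undef_is_nan)
```

Because equality comparisons cannot single out EPS or UNDEF, dedicated test methods such as those in the SpecialValues class are the practical way to find these values in a column.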
GAMS Transfer syntax is designed to quickly get data into a form that is usable in further analyses or visualization; this mapping also highlights the preference for data that is of type float
, which offers performance benefits within Pandas/NumPy. The user does not need to remember these constants as they are provided within the class SpecialValues
as SpecialValues.POSINF
, SpecialValues.NEGINF
, SpecialValues.EPS
, SpecialValues.NA
, and SpecialValues.UNDEF
. The SpecialValues
class also contains methods to test for these special values. Some examples are shown below, which also begin to introduce some of the GAMS Transfer syntax.
- Example (special values in a parameter)
The following DataFrame for x
would look like:
The user can now easily test for specific special values in the value
column of the DataFrame (returns a boolean array):
Other data structures can be passed into these methods as long as these structures can be converted into a numpy array with dtype=float
. It follows that:
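As an illustration of the "convertible to a numpy array with dtype=float" requirement, the sketch below converts a few ordinary Python structures and applies a sign-bit test for -0.0. The helper `looks_like_eps` is hypothetical and written only for this example; it is not the SpecialValues implementation.

```python
import numpy as np


def looks_like_eps(values):
    """Hypothetical stand-in: True where a value is negative zero (-0.0)."""
    arr = np.asarray(values, dtype=float)   # fails if the input is not float-convertible
    return (arr == 0.0) & np.signbit(arr)


from_list = looks_like_eps([1.0, -0.0, 0.0, float("inf")])
from_tuple = looks_like_eps((-0.0, 2))
from_scalar = looks_like_eps(-0.0)

print(from_list.tolist(), from_tuple.tolist(), bool(from_scalar))
```

Lists, tuples, and scalars all pass through the same conversion, which is why the result is always a boolean array (or a single boolean for a scalar input).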
Pandas DataFrames allow data columns to exist with mixed type (dtype=object
) – GAMS Transfer leverages this convenience feature to enable users to import string representations of EPS
, NA
, and UNDEF
. GAMS Transfer is tolerant of any mixed-case special value string representation. Python offers additional flexibility when representing negative/positive infinity. Any string x
where float(x) == float("inf")
evaluates to True can be used to represent positive infinity. Similarly, any string x
where float(x) == float("-inf")
evaluates to True can be used to represent negative infinity. Allowed values include inf
, +inf
, INFINITY
, +INFINITY
, -inf
, -INFINITY
and all mixed-case equivalents.
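The rule for infinity strings relies only on the behavior of Python's built-in `float`, so it can be checked without GAMS Transfer. The following sketch uses illustrative names and also shows that strings such as "eps" are not understood by `float` itself, which is why they need the separate handling described above.

```python
positive_candidates = ["inf", "+inf", "INFINITY", "+INFINITY", "Inf", "iNfInItY"]
negative_candidates = ["-inf", "-INFINITY", "-Infinity"]

# Every candidate parses to the corresponding infinity, regardless of case.
all_positive = all(float(s) == float("inf") for s in positive_candidates)
all_negative = all(float(s) == float("-inf") for s in negative_candidates)


def parses_as_float(text):
    """Return True if Python's float() accepts the string."""
    try:
        float(text)
    except ValueError:
        return False
    return True


# float() alone does not know the GAMS names for EPS, NA, or UNDEF.
gams_names_parse = [parses_as_float(s) for s in ["eps", "NA", "undef"]]

print(all_positive, all_negative, gams_names_parse)
```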
- Example (special values defined by strings)
These special strings will be immediately mapped to their float
equivalents from the SpecialValues
class in order to ensure that all data entries are float types.
GAMS Transfer Standard Data Formats
This section is meant to introduce the standard format that GAMS Transfer expects for symbol records. It has already been mentioned that we store data as a Pandas DataFrame, but there is an assumed structure to the column headings and column types that will be important to understand. GAMS Transfer includes convenience functions in order to ease the burden of converting data from a user-centric format to one that is understood by GAMS Transfer. However, advanced users will want to convert their data first and add it directly to the Container to avoid making extra copies of (potentially large) data sets.
- Set Records Standard Format
All set records (including singleton sets) are stored as a Pandas DataFrame with n
number of columns, where n
is the dimensionality of the symbol + 1. The first n-1
columns include the domain elements while the last column includes the set element explanatory text. Records are organized such that there is one record per row.
The names of the domain columns follow a pattern of <set_name>_<index_position>
; a symbol dimension that is referenced to the universe is labeled uni_<index_position>
. The explanatory text column is called element_text
and must take the last position in the DataFrame.
All domain columns must be a categorical data type and the element_text
column must be an object
type. Pandas allows the categories (basically the unique elements of a column) to be various data types as well, however GAMS Transfer requires that all these are type str
. All rows in the element_text
column must be type str
.
Some examples:
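A set records DataFrame in this layout can be assembled with plain Pandas. The sketch below is only an illustration of the column names and dtypes described above for a one-dimensional set defined over the universe; it was not produced by GAMS Transfer, and the element names are borrowed from the transport data used elsewhere in this document.

```python
import pandas as pd

elements = ["seattle", "san-diego"]

set_records = pd.DataFrame(
    {
        # one categorical domain column, labeled uni_0 because the domain is the universe
        "uni_0": pd.Categorical(elements, categories=elements),
        # explanatory text goes last and is stored as Python str in an object column
        "element_text": pd.Series(["", ""], dtype=object),
    }
)

is_categorical = isinstance(set_records["uni_0"].dtype, pd.CategoricalDtype)
categories_are_str = all(isinstance(c, str) for c in set_records["uni_0"].cat.categories)
text_is_object = set_records["element_text"].dtype == object

print(set_records)
print(is_categorical, categories_are_str, text_is_object)
```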
- Parameter Records Standard Format
All parameter records (including scalars) are stored as a Pandas DataFrame with n
number of columns, where n
is the dimensionality of the symbol + 1. The first n-1
columns include the domain elements while the last column includes the numerical value of the records. Records are organized such that there is one record per row. Scalar parameters have zero dimension, therefore they only have one column and one row.
The names of the domain columns follow a pattern of <set_name>_<index_position>
; a symbol dimension that is referenced to the universe is labeled uni_<index_position>
. The value column is called value
and must take the last position in the DataFrame.
All domain columns must be a categorical data type and the value
column must be a float
type. Pandas allows the categories (basically the unique elements of a column) to be various data types as well, however GAMS Transfer requires that all these are type str
.
Some examples:
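The same layout for a two-dimensional parameter can be sketched with plain Pandas. This is an illustration of the described format rather than output from GAMS Transfer; the data are the distances d(i,j) from the transport example used later in this document, and the column names follow the `<set_name>_<index_position>` pattern.

```python
import pandas as pd

plants = ["seattle", "san-diego"]
markets = ["new-york", "chicago", "topeka"]

rows = [
    ("seattle", "new-york", 2.5),
    ("seattle", "chicago", 1.7),
    ("seattle", "topeka", 1.8),
    ("san-diego", "new-york", 2.5),
    ("san-diego", "chicago", 1.8),
    ("san-diego", "topeka", 1.4),
]

d_records = pd.DataFrame(rows, columns=["i_0", "j_1", "value"])
# domain columns must be categorical with str categories
d_records["i_0"] = pd.Categorical(d_records["i_0"], categories=plants)
d_records["j_1"] = pd.Categorical(d_records["j_1"], categories=markets)
# the value column must be float and sit in the last position
d_records["value"] = d_records["value"].astype(float)

# a scalar parameter has no domain columns: one column, one row
scalar_records = pd.DataFrame({"value": [3.14]})

print(d_records.dtypes)
print(scalar_records)
```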
- Variable/Equation Records Standard Format
Variables and equations share the same standard data format. All records (including scalar variables/equations) are stored as a Pandas DataFrame with n
number of columns, where n
is the dimensionality of the symbol + 5. The first n-5
columns include the domain elements while the last five columns include the numerical values for different attributes of the records. Records are organized such that there is one record per row. Scalar variables/equations have zero dimension, therefore they have five columns and one row.
The names of the domain columns follow a pattern of <set_name>_<index_position>
; a symbol dimension that is referenced to the universe is labeled uni_<index_position>
. The attribute columns are called level
, marginal
, lower
, upper
, and scale
. These attribute columns must appear in this order. Attributes that are not supplied by the user will be assigned the default GAMS values for that variable/equation type. It is also possible to pass no attributes at all, in which case GAMS Transfer simply assigns default values to all attributes.
All domain columns must be a categorical data type and all the attribute columns must be a float
type. Pandas allows the categories (basically the unique elements of a column) to be various data types as well, however GAMS Transfer requires that all these are type str
.
Some examples:
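To make the attribute ordering and the filling of missing attributes concrete, here is a small Pandas-only sketch. The helper `complete_attributes` is hypothetical and only mimics the behavior described above; it is not part of GAMS Transfer. The defaults used are those of a free GAMS variable (level 0, marginal 0, lower -inf, upper +inf, scale 1), which is an assumption about the symbol type; other variable and equation types have different defaults.

```python
import pandas as pd

ATTRIBUTES = ["level", "marginal", "lower", "upper", "scale"]

# assumed defaults for a free variable
FREE_DEFAULTS = {
    "level": 0.0,
    "marginal": 0.0,
    "lower": float("-inf"),
    "upper": float("inf"),
    "scale": 1.0,
}


def complete_attributes(records, defaults):
    """Hypothetical helper: domain columns first, then all five attributes in order."""
    domain_cols = [c for c in records.columns if c not in ATTRIBUTES]
    out = records[domain_cols].copy()
    for name in ATTRIBUTES:
        if name in records.columns:
            out[name] = records[name].astype(float)
        else:
            out[name] = defaults[name]
    return out


plants = ["seattle", "san-diego"]
partial = pd.DataFrame(
    {
        "i_0": pd.Categorical(plants, categories=plants),
        "level": [50, 275],  # only the level is supplied
    }
)

full = complete_attributes(partial, FREE_DEFAULTS)
print(full)
```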
Data Exchange with GDX
Up until now, we have been focused on using GAMS Transfer to create symbols in an empty Container
using the symbol constructors (or their corresponding container methods). These tools will enable users to ingest data from many different formats and add them to a Container
– however, it is also possible to read in symbol data directly from GDX files using the read
container method. In the following sections, we will discuss this method in detail as well as the write
method, which allows users to write out to new GDX files.
Reading from GDX
There are two main ways to read in GDX based data.
- Pass the file path directly to the Container constructor (will read all symbols and records)
- Pass the file path directly to the
read
method (default read all symbols, but can read partial files)
The first option here is provided for convenience and will, internally, call the read
method. This method will read in all symbols as well as their records. This is the easiest and fastest way to get data out of a GDX file and into your Python environment. For the following examples we leverage the GDX output generated from the `trnsport.gms` model file.
- Example (reading full data w/ Container constructor)
A user could also read in data with the read
method as shown in the following example.
- Example (reading full data w/ read method)
It is also possible to read in a partial GDX file with the read
method, as shown in the following example:
This syntax assumes that the user will always want to read in both the metadata as well as the actual data records, but it is possible to skip the reading of the records by passing the argument records=False
.
- Attention
- The read method attempts to link the domain objects together (in order to have a "regular" domain_type), but if domain sets are not part of the read operation there is no choice but to default to a "relaxed" domain_type. This can be seen in the last example, where we only read in the variable x and not the domain sets (i and j) that the variable is defined over. All the data will be available to the user, but domain checking is no longer possible. The symbol x will remain with "relaxed" domain type even if the user were to read in sets i and j in a second read call.
Writing to GDX
A user can write data to a GDX file by simply passing a file path (as a string). The write
method will then create the GDX and write all data in the Container
.
- Note
- It is not possible to write the Container when any of its symbols are invalid. If any symbols are invalid, an error will be raised and the user will need to inspect the problematic symbols (perhaps using a combination of the listSymbols(isValid=False) and isValid(verbose=True) methods).
- Example
- Example (write a compressed GDX file)
Advanced users might want to specify an order to their UEL list (i.e., the universe set); recall that the UEL ordering follows that dictated by the data. As a convenience, it is possible to prepend the UEL list with a user specified order using the uel_priority
argument.
- Example (change the order of the UEL)
The original UEL order for this GDX file would have been ["a", "b", "c"]
, but this example reorders the UEL with uel_priority
– the positions of b
and c
have been swapped. This can be verified with the gdxdump
utility (using the uelTable
argument):
gdxdump foo.gdx ueltable=foo
Set foo / 'a' , 'c' , 'b' /;
$onEmpty
Set i(*) / 'a', 'c', 'b' /;
$offEmpty
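The effect of prepending a priority list can be modeled in a few lines of plain Python. The helper `apply_uel_priority` is hypothetical and only mirrors the behavior described above; it is not the GAMS Transfer implementation. The priority list `["a", "c"]` is an assumed input that reproduces the order shown in the gdxdump output.

```python
def apply_uel_priority(data_order, priority):
    """Hypothetical helper: priority labels first, remaining labels in data order."""
    prioritized = set(priority)
    remaining = [label for label in data_order if label not in prioritized]
    return list(priority) + remaining


original_order = ["a", "b", "c"]   # order dictated by the data
new_order = apply_uel_priority(original_order, ["a", "c"])

print(new_order)
```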
Data Exchange with GamsDatabase and GMD Objects (Embedded Python Code)
We have discussed how to create symbols in an empty Container
and we have discussed how to exchange data with GDX files, however it is also possible to read and write data directly in memory by interacting with a GamsDatabase/GMD object – this allows GAMS Transfer to be used to read/write data within an Embedded Python Code environment or in combination with the Python OO API. There are some important differences when compared to data exchange with GDX since we are working with data representations in memory.
Reading from GamsDatabase and GMD Objects
Just as with a GDX, there are two main ways to read in data that is in a GamsDatabase/GMD object.
- Pass the GamsDatabase/GMD object directly to the Container constructor (will read all symbols and records)
- Pass the GamsDatabase/GMD object directly to the
read
method (default read all symbols, but can read partial files)
The first option here is provided for convenience and will, internally, call the read
method. This method will read in all symbols as well as their records. This is the easiest and fastest way to get data out of a GamsDatabase/GMD object and into your Python environment. While it is possible to generate a custom GamsDatabase/GMD object from scratch (using the gmdcc
API), most users will be interacting with a GamsDatabase/GMD object that has already been instantiated internally when they are using Embedded Python Code or the GamsDatabase class in the Python OO API. Our examples will show how to access the GamsDatabase/GMD object – we leverage some of the data from the `trnsport.gms` model file.
- Example (reading full data w/ Container constructor)
- Note
- Embedded Python Code users will want to pass the GamsDatabase object that is part of the GAMS Database object – this will always be referenced as gams.db regardless of the model file.
The following example uses embedded Python code to create a new Container, read in all symbols, and display some summary statistics as part of the gams log output.
Set
i 'canning plants' / seattle, san-diego /
j 'markets' / new-york, chicago, topeka /;
Parameter
a(i) 'capacity of plant i in cases'
/ seattle 350
san-diego 600 /
b(j) 'demand at market j in cases'
/ new-york 325
chicago 300
topeka 275 /;
Table d(i,j) 'distance in thousands of miles'
new-york chicago topeka
seattle 2.5 1.7 1.8
san-diego 2.5 1.8 1.4;
$onembeddedCode Python:
import gamstransfer as gt
m = gt.Container(gams.db)
print(m.describeSets())
print(m.describeParameters())
$offEmbeddedCode
The gams log output will then look as such (the extra print
calls are just providing nice spacing for this example):
GAMS 38.1.0 Copyright (C) 1987-2022 GAMS Development. All rights reserved
--- Starting compilation
--- matrix.gms(29) 3 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
  name  is_singleton domain domain_type  dim  num_recs cardinality sparsity
0    i         False    [*]        none    1         2        None     None
1    j         False    [*]        none    1         3        None     None
  name  is_scalar  domain domain_type  dim  num_recs  min_value  mean_value  max_value            where_min            where_max  count_eps  count_na  count_undef  cardinality  sparsity
0    a      False     [i]     regular    1         2      350.0      475.00      600.0            [seattle]          [san-diego]          0         0            0            2       0.0
1    b      False     [j]     regular    1         3      275.0      300.00      325.0             [topeka]           [new-york]          0         0            0            3       0.0
2    d      False  [i, j]     regular    2         6        1.4        1.95        2.5  [san-diego, topeka]  [seattle, new-york]          0         0            0            6       0.0
--- Starting execution - empty program
*** Status: Normal completion
[3 rows x 16 columns]
--- Starting execution - empty program
*** Status: Normal completion
A user could also read in a subset of the data located in the GamsDatabase object with the read
method as shown in the following example. Here we only read in the sets i
and j
, as a result the .describeParameters()
method will return None
.
- Example (reading subset of full data w/ read method)
Set
i 'canning plants' / seattle, san-diego /
j 'markets' / new-york, chicago, topeka /;
Parameter
a(i) 'capacity of plant i in cases'
/ seattle 350
san-diego 600 /
b(j) 'demand at market j in cases'
/ new-york 325
chicago 300
topeka 275 /;
Table d(i,j) 'distance in thousands of miles'
new-york chicago topeka
seattle 2.5 1.7 1.8
san-diego 2.5 1.8 1.4;
$onembeddedCode Python:
import gamstransfer as gt
m = gt.Container()
m.read(gams.db, symbols=["i","j"])
gams.printLog("")
print(m.describeSets())
print(m.describeParameters())
$offEmbeddedCode
GAMS 38.1.0 Copyright (C) 1987-2022 GAMS Development. All rights reserved
--- Starting compilation
--- matrix.gms(29) 3 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
---
  name  is_singleton domain domain_type  dim  num_recs cardinality sparsity
0    i         False    [*]        none    1         2        None     None
1    j         False    [*]        none    1         3        None     None
None
--- Starting execution - empty program
*** Status: Normal completion
All the typical functionality of the Container exists when working with GamsDatabase/GMD objects. This means that domain linking, matrix conversion, and other more advanced options are available to the user at either compilation time or execution time (depending on the Embedded Code syntax being used, see: Syntax). The next example generates a 1000x1000 matrix and then takes its inverse using the Numpy linalg
package.
- Example (Matrix Generation and Inversion)
set i / i1*i1000 /;
alias(i,j);
parameter a(i,j);
a(i,j) = 1 / (ord(i)+ord(j) - 1);
a(i,i) = 1;
embeddedCode Python:
import gamstransfer as gt
import numpy as np
import time
gams.printLog("")
s = time.time()
m = gt.Container(gams.db)
gams.printLog(f"read data: {round(time.time() - s, 3)} sec")
s = time.time()
A = m.data["a"].toDense()
gams.printLog(f"create matrix A: {round(time.time() - s, 3)} sec")
s = time.time()
invA = np.linalg.inv(A)
gams.printLog(f"generate inv(A): {round(time.time() - s, 3)} sec")
endEmbeddedCode
- Note
- In this example, the assignment of the
a
parameter is done during execution time so we must use the execution time syntax for embedded code in order to get the numerical records properly.
GAMS 38.1.0 Copyright (C) 1987-2022 GAMS Development. All rights reserved
--- Starting compilation
--- test.gms(27) 3 Mb
--- Starting execution: elapsed 0:00:00.003
--- test.gms(9) 36 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
---
--- read data: 1.1 sec
--- create matrix A: 0.02 sec
--- generate inv(A): 0.031 sec
*** Status: Normal completion
We will extend this example in the next section to write the inverse matrix A
back into a GAMS parameter.
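The numerical part of this example can also be tried outside of GAMS. The sketch below rebuilds the same matrix definition directly in Numpy for a much smaller, arbitrarily chosen size (n = 5) and checks the inverse. It deliberately skips GAMS Transfer, so the matrix is created from the formula rather than from `toDense()`.

```python
import numpy as np

n = 5
idx = np.arange(1, n + 1)                      # plays the role of ord(i) and ord(j)

A = 1.0 / (idx[:, None] + idx[None, :] - 1)    # a(i,j) = 1 / (ord(i)+ord(j) - 1)
np.fill_diagonal(A, 1.0)                       # a(i,i) = 1

invA = np.linalg.inv(A)

# largest deviation of A @ invA from the identity matrix
residual = np.abs(A @ invA - np.eye(n)).max()
print(residual)
```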
Writing Data to GamsDatabase and GMD
A user can write to a GamsDatabase/GMD object with the .write()
method just as they would write a GDX file; however, there are some important differences. When a user writes a GDX file, the entire GDX file represents a complete data environment (all domains have been resolved, etc.), so GAMS Transfer does not need to worry about merge/replace operations. When a user writes data to the in-memory data representations of GamsDatabase/GMD, however, it is possible to merge or replace symbol records. We show a few examples to illustrate this behavior.
- Example (Populating a set in GAMS)
* note that we need to declare the set i over "*" in order to provide hints about the symbol dimensionality
set i(*);
$onembeddedCode Python:
import gamstransfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=["i"+str(i) for i in range(10)])
m.write(gams.db)
$offEmbeddedCode i
embeddedCode Python:
import gamstransfer as gt
m = gt.Container(gams.db)
gams.printLog("")
print(m.data["i"].records)
endEmbeddedCode
- Note
- In general, it is possible to use GAMS Transfer to create new symbols in a GamsDatabase and GMD object (and not necessarily merge symbols) but embedded code best practices necessitate the declaration of any GAMS symbols on the GAMS side first, then the records can be filled with GAMS Transfer.
If we break down this example we can see that the set i
is declared within GAMS (with no records) and then the records for i
are set by writing a Container
to the gams.db
GamsDatabase object (we do this at compile time). The second embedded Python code block runs at execution time and is simply there to read all the records on the set i
– printing the sets this way adds the output to the .log
file (we could also use the more common display i;
operation in GAMS to display the set elements in the LST file).
GAMS 38.1.0 Copyright (C) 1987-2022 GAMS Development. All rights reserved
--- Starting compilation
--- test.gms(10) 2 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
--- test.gms(20) 3 Mb
--- Starting execution: elapsed 0:00:01.464
--- test.gms(13) 4 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
---
  uni_0 element_text
0    i0
1    i1
2    i2
3    i3
4    i4
5    i5
6    i6
7    i7
8    i8
9    i9
*** Status: Normal completion
- Example (Merging set records)
set i / i1, i2 /;
$onmulti
$onembeddedCode Python:
import gamstransfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=["i"+str(i) for i in range(10)])
m.write(gams.db, merge_symbols="i")
$offEmbeddedCode i
$offmulti
embeddedCode Python:
import gamstransfer as gt
m = gt.Container(gams.db)
gams.printLog("")
print(m.data["i"].records)
endEmbeddedCode
In this example we need to make use of $onMulti/$offMulti in order to merge new set elements into the set i
(the same would be true if we were merging other symbol types) – any symbol that already has records defined (in GAMS) and is being added to with Python (and GAMS Transfer) must be wrapped with $onMulti/$offMulti. As with the previous example, the second embedded Python code block runs at execution time and is simply there to read all the records on the set i
. Note that the UEL order will be different in this case (i1
and i2
come before i0
).
GAMS 38.1.0 Copyright (C) 1987-2022 GAMS Development. All rights reserved
--- Starting compilation
--- test.gms(11) 3 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
--- test.gms(21) 3 Mb
--- Starting execution: elapsed 0:00:01.535
--- test.gms(14) 4 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
---
  uni_0 element_text
0    i1
1    i2
2    i0
3    i3
4    i4
5    i5
6    i6
7    i7
8    i8
9    i9
*** Status: Normal completion
- Example (Replacing set records)
set i / x1, x2 /;
$onmultiR
$onembeddedCode Python:
import gamstransfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=["i"+str(i) for i in range(10)])
m.write(gams.db)
$offEmbeddedCode i
$offmulti
embeddedCode Python:
import gamstransfer as gt
m = gt.Container(gams.db)
gams.printLog("")
print(m.data["i"].records)
endEmbeddedCode
In this example we want to replace the x1
and x2
set elements and build up a totally new element list with set elements from the Container
. Instead of $onMulti
/$offMulti
we must use $onMultiR
/$offMulti
to ensure that the replacement happens in GAMS; we also need to remove the set i
from the merge_symbols
argument.
- Attention
- If the user seeks to replace all records in a symbol, they must use the $onMultiR syntax. It is not sufficient to simply remove the symbol from the merge_symbols argument in GAMS Transfer. If the user mistakenly uses $onMulti, the symbols will end up merging without total replacement.
GAMS 38.1.0 Copyright (C) 1987-2022 GAMS Development. All rights reserved
--- Starting compilation
--- test.gms(11) 3 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
--- test.gms(21) 3 Mb
--- Starting execution: elapsed 0:00:01.482
--- test.gms(14) 4 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
---
  uni_0 element_text
0    i0
1    i1
2    i2
3    i3
4    i4
5    i5
6    i6
7    i7
8    i8
9    i9
*** Status: Normal completion
- Example (Merging parameter records)
set i;
parameter a(i<) /
i1 1.23
i2 5
/;
$onmulti
$onembeddedCode Python:
import gamstransfer as gt
m = gt.Container()
i = gt.Set(m, "i", records=["i"+str(i) for i in range(10)])
a = gt.Parameter(m, "a", domain=i, records=[("i"+str(i),i) for i in range(10)])
m.write(gams.db, merge_symbols="a")
$offEmbeddedCode i, a
$offmulti
embeddedCode Python:
import gamstransfer as gt
m = gt.Container(gams.db)
gams.printLog("")
print(m.data["a"].records)
endEmbeddedCode
In this example we also need to make use of $onMulti
/$offMulti
in order to merge new set elements into the set i; however, the set i also needs to contain the elements that are defined in the parameter. Here we make use of the < operator, which adds the set elements from a(i) into the set i.
- Note
- It would also be possible to run this example by explicitly defining the
set i /i1, i2/;
before the parameter declaration.
- Attention
- GAMS Transfer will overwrite all duplicate records when merging. The original values of a("i1") and a("i2") have been replaced with their new values when writing the Container in this example (see output below).
GAMS 38.1.0 Copyright (C) 1987-2022 GAMS Development. All rights reserved
--- Starting compilation
--- test.gms(16) 3 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
--- test.gms(25) 3 Mb
--- Starting execution: elapsed 0:00:01.467
--- test.gms(19) 4 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
---
  i_0  value
0  i1    1.0
1  i2    2.0
2  i3    3.0
3  i4    4.0
4  i5    5.0
5  i6    6.0
6  i7    7.0
7  i8    8.0
8  i9    9.0
*** Status: Normal completion
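The overwrite-on-merge behavior can be modeled with ordinary dictionaries. The sketch below is not how GAMS Transfer or GAMS perform the merge internally; it only reproduces the outcome shown in the log. Dropping the zero-valued record for i0 is an assumption made to match that output, consistent with GAMS not storing zero-valued parameter records.

```python
# records declared on the GAMS side
existing = {"i1": 1.23, "i2": 5.0}

# records written from the Container: ("i0", 0), ("i1", 1), ..., ("i9", 9)
incoming = {f"i{k}": float(k) for k in range(10)}

# later entries win, so duplicate keys take the incoming value
merged = {**existing, **incoming}

# assumption: zero-valued records are not stored, so i0 disappears
stored = {label: value for label, value in merged.items() if value != 0.0}

for label, value in stored.items():
    print(label, value)
```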
- Example (Advanced Matrix Generation and Inversion w/ Write Operation)
set i / i1*i1000 /;
alias(i,j);
parameter a(i,j);
a(i,j) = 1 / (ord(i)+ord(j) - 1);
a(i,i) = 1;
parameter inv_a(i,j);
parameter ident(i,j);
embeddedCode Python:
import gamstransfer as gt
import numpy as np
import time
gams.printLog("")
gams.printLog("")
s = time.time()
m = gt.Container(gams.db)
gams.printLog(f"read data: {round(time.time() - s, 3)} sec")
s = time.time()
A = m.data["a"].toDense()
gams.printLog(f"create matrix A: {round(time.time() - s, 3)} sec")
s = time.time()
invA = np.linalg.inv(A)
gams.printLog(f"calculate inv(A): {round(time.time() - s, 3)} sec")
s = time.time()
m.data["inv_a"].setRecords(invA)
gams.printLog(f"convert matrix to records for inv(A): {round(time.time() - s, 3)} sec")
s = time.time()
I = np.dot(A,invA)
tol = 1e-9
I[np.where((I<tol) & (I>-tol))] = 0
gams.printLog(f"calculate A*invA + small number cleanup: {round(time.time() - s, 3)} sec")
s = time.time()
m.data["ident"].setRecords(I)
gams.printLog(f"convert matrix to records for I: {round(time.time() - s, 3)} sec")
s = time.time()
m.write(gams.db, ["inv_a","ident"])
gams.printLog(f"write to GamsDatabase: {round(time.time() - s, 3)} sec")
gams.printLog("")
endEmbeddedCode inv_a, ident
display ident;
In this example we extend the example shown in Reading from GamsDatabase and GMD Objects to read data from GAMS, calculate a matrix inversion, do the matrix multiplication, and then write both the A^-1
and A*A^-1
(i.e., the identity matrix) back to GAMS for display in the LST file. This data round trip highlights the benefits of using a GAMS Transfer Container (and the linked symbol structure) as the mechanism to move data – converting back and forth from a records format to a matrix format can be cumbersome, but here, GAMS Transfer takes care of all the indexing for the user.
The first few lines of GAMS code generate a 1000x1000 A
matrix as a parameter (at execution time), we then define two more parameters that we will fill with results of the embedded Python code – specifically we want to fill a parameter with the matrix A^-1
and we want to verify that another parameter (ident
) contains the identity matrix (i.e., I
). Stepping through the code:
- We start the embedded Python code section (execution time) by importing both GAMS Transfer and Numpy and by reading all the symbols that currently exist in the GamsDatabase. We must read in all this information in order to get the domain set information – GAMS Transfer needs these domain sets in order to generate matrices with the proper size.
- Generate the matrix A by calling .toDense() on the symbol object in the Container.
- Take the inverse of A with np.linalg.inv().
- The Parameter symbol for inv_a already exists in the Container, but it does not have any records (i.e., m.data["inv_a"].records is None will evaluate to True). We use .setRecords() to convert invA back into a records format.
- We continue the computations by performing the matrix multiplication using np.dot() – we must clean up a lot of small numbers in I.
- The Parameter symbol for ident already exists in the Container, but it does not have any records. We use .setRecords() to convert I back into a records format.
- Since we are calculating these parameter values at execution time, it is not possible to modify the domain set information (or even merge/replace it). Therefore we only want to write the parameter values to GAMS. We achieve this by writing a subset of the Container symbols out with the m.write(gams.db, ["inv_a","ident"]) call. This partial write preserves symbol validity in the Container and it does not violate other GAMS requirements.
- Finally, we can verify that the (albeit large) identity matrix exists in the LST file (or in another GDX file).
- Note
- It was not possible to just use np.round because small negative numbers that round to -0.0 will be interpreted by GAMS Transfer as the GAMS EPS special value.
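The difference between the two cleanup strategies is visible in the sign bit of the resulting zeros. The following Numpy-only sketch uses a small made-up array in place of the product A*A^-1; the tolerance matches the one in the example.

```python
import numpy as np

tol = 1e-9
product = np.array([1.0, 3e-12, -3e-12, 0.0])   # made-up stand-in for entries of A*invA

# Rounding keeps the sign of tiny negative numbers, producing -0.0.
rounded = np.round(product, 9)

# Thresholding assigns a plain (positive) zero instead.
cleaned = product.copy()
cleaned[(cleaned < tol) & (cleaned > -tol)] = 0

rounded_sign_bits = np.signbit(rounded).tolist()
cleaned_sign_bits = np.signbit(cleaned).tolist()

print(rounded_sign_bits)
print(cleaned_sign_bits)
```

Both arrays compare equal element by element, so the -0.0 entry is easy to miss unless the sign bit is inspected.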
The output for this example is shown below:
GAMS 38.1.0 Copyright (C) 1987-2022 GAMS Development. All rights reserved
--- Starting compilation
--- matrix.gms(52) 3 Mb
--- Starting execution: elapsed 0:00:00.004
--- matrix.gms(11) 36 Mb
--- Initialize embedded library libembpycclib64.dylib
--- Execute embedded library libembpycclib64.dylib
---
---
--- read data: 1.083 sec
--- create matrix A: 0.016 sec
--- calculate inv(A): 0.032 sec
--- convert matrix to records for inv(A): 0.176 sec
--- calculate A*invA + small number cleanup: 0.027 sec
--- convert matrix to records for I: 0.17 sec
--- write to GamsDatabase: 1.937 sec
---
--- matrix.gms(52) 68 Mb
*** Status: Normal completion
ConstContainer (Rapid Read)
In the Create a Container section we describe how to use the main object class of GAMS Transfer – the Container
. Many users of GAMS Transfer will rely on the Container
for building their data pipeline, however some users will only be interested in post-processing data from a GAMS model run. This one-directional flow of data means that these users do not need some of the advanced Container
features such as domain linking, matrix generation, domain checking, etc. The ConstContainer
(i.e., a Constant Container) object class is a data-focused read-only object that will provide a snapshot of the data target being read – the ConstContainer
can be created by reading a GDX file or a GamsDatabase/GMD object (an in memory representation of data used e.g. in embedded Python code).
Creating a ConstContainer
The ConstContainer
shares many of the same methods and attributes that are in the Container
class, which makes moving between the ConstContainer
and the Container
very simple. There are some important differences though:
- The ConstContainer does not link any symbol data.
- The ConstContainer can only read from one source at a time – every new call of .read() will clear the data dict.
- The ConstContainer constructor will not read in any symbol records – this enables users to browse an unknown data source quickly (similar behavior to gdxdump).
- The ConstContainer does not have a .write() method – a ConstContainer can be passed to the constructor of a Container which will enable data writing (however a copy of the data will be generated).
- The user will never need to instantiate a symbol object and add it to the ConstContainer – the ConstContainer will internally generate its own set of (simplified) symbol classes and hold them in the .data attribute.
All of these differences were inspired by users that want to read the data as fast as possible and probe unknown data files without worrying about memory issues – ConstContainer
provides users with a high level view of the data very quickly.
- ConstContainer constructor
Creating a ConstContainer
is a simple matter of initializing an object. For example:
- Note
- This new ConstContainer object, here called h, will load all the symbol metadata from out.gdx but it will not load any of the records. To load records, users must use the .read() method.
The ConstContainer
constructor arguments are:
Argument | Type | Description | Required | Default |
---|---|---|---|---|
load_from | str or GamsDatabase /GMD Object | Points to the source of the data being read into the ConstContainer | No | None |
system_directory | str | Absolute path to GAMS system_directory | No | Attempts to find the GAMS installation by creating a GamsWorkspace object and loading the system_directory attribute. |
The ConstContainer
contains many of the same methods that are in the Container
class, specifically:
- ConstContainer Methods
Method | Description | Arguments/Defaults | Returns |
---|---|---|---|
describeAliases | create a summary table with descriptive statistics for Aliases | symbols=None (None ,str ,list ) - if None , assumes all aliases | pandas.DataFrame |
describeParameters | create a summary table with descriptive statistics for Parameters | symbols=None (None ,str ,list ) - if None , assumes all parameters | pandas.DataFrame |
describeEquations | create a summary table with descriptive statistics for Equations | symbols=None (None ,str ,list ) - if None , assumes all equations | pandas.DataFrame |
describeSets | create a summary table with descriptive statistics for Sets | symbols=None (None ,str ,list ) - if None , assumes all sets | pandas.DataFrame |
describeVariables | create a summary table with descriptive statistics for Variables | symbols=None (None ,str ,list ) - if None , assumes all variables | pandas.DataFrame |
listAliases | list all aliases | - | list |
listEquations | list all equations | types=None (list of equation types) - if None , assumes all types | list |
listParameters | list all parameters | - | list |
listSets | list all sets | - | list |
listSymbols | list all symbols | - | list |
listVariables | list all variables | types=None (list of variable types) - if None , assumes all types | list |
read | main method to read load_from; can be provided with a list of symbols to read in subsets; records controls whether symbol records are loaded or just metadata | load_from (str, GMD Object Handle, GamsDatabase Object); symbols="all" (str, list); records=True (bool) | None |
The structure of the DataFrames that are returned from the describe* methods mirrors that in the Container; the user should reference Describing Data for detailed descriptions of the columns.
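The list* methods in the table above can be combined into a quick inventory of a file. The following is a sketch only: symbol_inventory is a hypothetical helper, and it assumes that h is an already constructed ConstContainer exposing the methods exactly as listed.

```python
def symbol_inventory(h):
    """Group the symbol names held by a ConstContainer-like object by kind."""
    return {
        "sets": h.listSets(),
        "parameters": h.listParameters(),
        "variables": h.listVariables(),    # types=None, so all variable types
        "equations": h.listEquations(),    # types=None, so all equation types
        "aliases": h.listAliases(),
    }
```

Because the helper only calls the list* methods, it works on metadata alone and does not require the records to be loaded.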
ConstContainer Symbol Objects
The ConstContainer uses a simplified symbol class structure to hold symbol specific information. The user will never need to directly instantiate these symbol classes (called SimpleSet, SimpleParameter, SimpleVariable, SimpleEquation and SimpleAlias); these symbol classes are nested under the ConstContainer class. This class structure is used to provide the feel of a read-only object.
While users do not need to instantiate any of the Simple* symbol objects directly, they are available for users to probe. Many of the same Container symbol methods that generate summary statistics exist for the ConstContainer symbols. Specifically:
- SimpleSet Properties
Property | Description | Type | Special Setter Behavior |
---|---|---|---|
description | description of symbol | str | - |
dimension | dimension of symbol | int | setting is a shorthand notation to create ["*"] * n domains in symbol |
domain_forwarding | flag that forces set elements to be recursively included in all parent sets (i.e., implicit set growth) | bool | no effect after records have been set |
domain_labels | column headings for the records DataFrame | list of str | - |
domain_names | string version of domain names | list of str | - |
domain_type | none , relaxed or regular depending on state of domain links | str | - |
is_singleton | True if symbol is a singleton set | bool | - |
name | name of symbol | str | sets the GAMS name of the symbol |
number_records | number of symbol records (i.e., returns len(self.records) if not None ) | int | - |
records | the main symbol records | pandas.DataFrame | responsive to domain_forwarding state |
summary | output a dict of only the metadata | dict | - |
- SimpleSet Methods
None
- SimpleParameter Properties
Property | Description | Type | Special Setter Behavior |
---|---|---|---|
description | description of symbol | str | - |
dimension | dimension of symbol | int | setting is a shorthand notation to create ["*"] * n domains in symbol |
domain_forwarding | flag that forces set elements to be recursively included in all parent sets (i.e., implicit set growth) | bool | no effect after records have been set |
domain_labels | column headings for the records DataFrame | list of str | - |
domain_names | string version of domain names | list of str | - |
domain_type | none , relaxed or regular depending on state of domain links | str | - |
is_scalar | True if len(self.domain) == 0 | bool | - |
name | name of symbol | str | sets the GAMS name of the symbol |
number_records | number of symbol records (i.e., returns len(self.records) if not None ) | int | - |
records | the main symbol records | pandas.DataFrame | responsive to domain_forwarding state |
summary | output a dict of only the metadata | dict | - |
- SimpleParameter Methods
Method | Description | Arguments/Defaults | Returns |
---|---|---|---|
countEps | total number of SpecialValues.EPS across all columns | - | int |
countNA | total number of SpecialValues.NA across all columns | - | int |
countNegInf | total number of SpecialValues.NEGINF across all columns | - | int |
countPosInf | total number of SpecialValues.POSINF across all columns | - | int |
countUndef | total number of SpecialValues.UNDEF across all columns | - | int |
findEps | find index positions of SpecialValues.EPS in value column | - | pandas.Index |
findNA | find index positions of SpecialValues.NA in value column | - | pandas.Index |
findNegInf | find index positions of SpecialValues.NEGINF in value column | - | pandas.Index |
findPosInf | find index positions of SpecialValues.POSINF in value column | - | pandas.Index |
findUndef | find index positions of SpecialValues.UNDEF in value column | - | pandas.Index |
getMaxValue | get the maximum value across all columns | - | float |
getMinValue | get the minimum value across all columns | - | float |
getMeanValue | get the mean value across all columns | - | float |
getMaxAbsValue | get the maximum absolute value across all columns | - | float |
whereMax | find the domain entry of records with a maximum value (return first instance only) | - | list of str |
whereMaxAbs | find the domain entry of records with a maximum absolute value (return first instance only) | - | list of str |
whereMin | find the domain entry of records with a minimum value (return first instance only) | - | list of str |
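As an illustration of the count* methods listed above, the sketch below collects the special-value counts of one parameter into a dictionary. The helper name special_value_report is hypothetical, and param is assumed to be a SimpleParameter (or any object exposing the same methods) whose records have already been loaded.

```python
def special_value_report(param):
    """Return the number of each GAMS special value found in a parameter."""
    return {
        "EPS": param.countEps(),
        "NA": param.countNA(),
        "NEGINF": param.countNegInf(),
        "POSINF": param.countPosInf(),
        "UNDEF": param.countUndef(),
    }
```

A report like this is a cheap sanity check before the values are used in further calculations.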
- SimpleVariable Properties
Property | Description | Type | Special Setter Behavior |
---|---|---|---|
description | description of symbol | str | - |
dimension | dimension of symbol | int | setting is a shorthand notation to create ["*"] * n domains in symbol |
domain_forwarding | flag that forces set elements to be recursively included in all parent sets (i.e., implicit set growth) | bool | no effect after records have been set |
domain_labels | column headings for the records DataFrame | list of str | - |
domain_names | string version of domain names | list of str | - |
domain_type | none , relaxed or regular depending on state of domain links | str | - |
name | name of symbol | str | sets the GAMS name of the symbol |
number_records | number of symbol records (i.e., returns len(self.records) if not None ) | int | - |
records | the main symbol records | pandas.DataFrame | responsive to domain_forwarding state |
summary | output a dict of only the metadata | dict | - |
type | str type of variable | str | - |
- SimpleVariable Methods
Method | Description | Arguments/Defaults | Returns |
---|---|---|---|
countEps | total number of SpecialValues.EPS across all columns | columns="level" (str ,list ) | int |
countNA | total number of SpecialValues.NA across all columns | columns="level" (str ,list ) | int |
countNegInf | total number of SpecialValues.NEGINF across all columns | columns="level" (str ,list ) | int |
countPosInf | total number of SpecialValues.POSINF across all columns | columns="level" (str ,list ) | int |
countUndef | total number of SpecialValues.UNDEF across all columns | columns="level" (str ,list ) | int |
findEps | find index positions of SpecialValues.EPS in column | column="level" (str ) | pandas.Index |
findNA | find index positions of SpecialValues.NA in column | column="level" (str ) | pandas.Index |
findNegInf | find index positions of SpecialValues.NEGINF in column | column="level" (str ) | pandas.Index |
findPosInf | find index positions of SpecialValues.POSINF in column | column="level" (str ) | pandas.Index |
findUndef | find index positions of SpecialValues.UNDEF in column | column="level" (str ) | pandas.Index |
getMaxValue | get the maximum value across all columns | columns="level" (str ,list ) | float |
getMinValue | get the minimum value across all columns | columns="level" (str ,list ) | float |
getMeanValue | get the mean value across all columns | columns="level" (str ,list ) | float |
getMaxAbsValue | get the maximum absolute value across all columns | columns="level" (str ,list ) | float |
whereMax | find the domain entry of records with a maximum value (return first instance only) | column="level" (str ) | list of str |
whereMaxAbs | find the domain entry of records with a maximum absolute value (return first instance only) | column="level" (str ) | list of str |
whereMin | find the domain entry of records with a minimum value (return first instance only) | column="level" (str ) | list of str |
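Unlike the parameter methods, the variable methods accept a column (or columns) argument that defaults to "level". The sketch below shows how that argument might be used; level_vs_marginal is a hypothetical helper, var is assumed to be a SimpleVariable with loaded records, and the column name "marginal" is an assumption based on the standard GAMS variable attributes rather than on the tables above.

```python
def level_vs_marginal(var):
    """Query a variable's statistics on explicitly chosen columns."""
    return {
        # Plural "columns" accepts a str or a list of column names.
        "max_value": var.getMaxValue(columns=["level", "marginal"]),
        "eps_in_marginal": var.countEps(columns="marginal"),
        # Singular "column" accepts exactly one column name.
        "where_max_level": var.whereMax(column="level"),
    }
```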
- SimpleEquation Properties
Property | Description | Type | Special Setter Behavior |
---|---|---|---|
description | description of symbol | str | - |
dimension | dimension of symbol | int | setting is a shorthand notation to create ["*"] * n domains in symbol |
domain_forwarding | flag that forces set elements to be recursively included in all parent sets (i.e., implicit set growth) | bool | no effect after records have been set |
domain_labels | column headings for the records DataFrame | list of str | - |
domain_names | string version of domain names | list of str | - |
domain_type | none , relaxed or regular depending on state of domain links | str | - |
name | name of symbol | str | sets the GAMS name of the symbol |
number_records | number of symbol records (i.e., returns len(self.records) if not None ) | int | - |
records | the main symbol records | pandas.DataFrame | responsive to domain_forwarding state |
summary | output a dict of only the metadata | dict | - |
type | str type of equation | str | - |
- SimpleEquation Methods
Method | Description | Arguments/Defaults | Returns |
---|---|---|---|
countEps | total number of SpecialValues.EPS across all columns | columns="level" (str ,list ) | int |
countNA | total number of SpecialValues.NA across all columns | columns="level" (str ,list ) | int |
countNegInf | total number of SpecialValues.NEGINF across all columns | columns="level" (str ,list ) | int |
countPosInf | total number of SpecialValues.POSINF across all columns | columns="level" (str ,list ) | int |
countUndef | total number of SpecialValues.UNDEF across all columns | columns="level" (str ,list ) | int |
findEps | find index positions of SpecialValues.EPS in column | column="level" (str ) | pandas.Index |
findNA | find index positions of SpecialValues.NA in column | column="level" (str ) | pandas.Index |
findNegInf | find index positions of SpecialValues.NEGINF in column | column="level" (str ) | pandas.Index |
findPosInf | find index positions of SpecialValues.POSINF in column | column="level" (str ) | pandas.Index |
findUndef | find index positions of SpecialValues.UNDEF in column | column="level" (str ) | pandas.Index |
getMaxValue | get the maximum value across all columns | columns="level" (str ,list ) | float |
getMinValue | get the minimum value across all columns | columns="level" (str ,list ) | float |
getMeanValue | get the mean value across all columns | columns="level" (str ,list ) | float |
getMaxAbsValue | get the maximum absolute value across all columns | columns="level" (str ,list ) | float |
whereMax | find the domain entry of records with a maximum value (return first instance only) | column="level" (str ) | list of str |
whereMaxAbs | find the domain entry of records with a maximum absolute value (return first instance only) | column="level" (str ) | list of str |
whereMin | find the domain entry of records with a minimum value (return first instance only) | column="level" (str ) | list of str |
- Example (reading only metadata w/ ConstContainer constructor)
Note that in this example we make use of the convenience notation contained in the constructor to read in only the metadata of the trnsport.gdx file. This allows users to quickly explore the symbols contained in a file (or in-memory object) and it also explains why there are many None values in the columns of the DataFrame returned by the .describeParameters() method.
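A sketch of what such a metadata-only read could look like, assuming the package is importable as gams.transfer and that trnsport.gdx is available locally. The function name describe_parameters_metadata_only is hypothetical, and the import is deferred so the snippet loads without GAMS installed.

```python
def describe_parameters_metadata_only(gdx_path="trnsport.gdx"):
    """Summarize parameters after loading metadata only (no records)."""
    # Assumption: GAMS Transfer is importable under this name.
    import gams.transfer as gt

    # The constructor shortcut reads symbol metadata and skips the records,
    # so statistics that depend on records are expected to come back as None.
    h = gt.ConstContainer(load_from=gdx_path)
    return h.describeParameters()
```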
- Example (reading all data w/ ConstContainer.read() method)
In this example we make use of the .read() method to retrieve both the metadata and the numerical records for all symbols in the GDX file; the .describeParameters() method will now populate the DataFrame with additional summary statistics.