5 Loading data
5.1 Overview
The ems_data() function loads and prepares GTAP database files for use in CGE model runs. It handles the three core GTAP data files (dat, par, and set) and can optionally convert between GTAP v6.2 and v7.0 formats. This function is also used to input any time steps (for temporally dynamic models), load set mappings for sets that are read into the model (i.e., not constructed from set operations), and incorporate auxiliary data via ems_aux().
ems_data(dat_input, par_input, set_input, time_steps = NULL,
         aux_input = NULL, target_format = NULL, ...)

| Argument | Type | Description |
|---|---|---|
| dat_input | Character | Path to a HAR file containing basedata coefficient data |
| par_input | Character | Path to a HAR file containing parameter coefficient data |
| set_input | Character | Path to a HAR file containing set elements and attributes |
| time_steps | Integer vector or NULL | Time steps for intertemporal models (see Time steps) |
| aux_input | List or NULL | Auxiliary data created with ems_aux() (see Auxiliary inputs) |
| target_format | Character or NULL | Target data format conversion: "GTAPv6" or "GTAPv7" |
| ... | Named arguments | Set mappings for read-in sets (see Loading and specifying sets) |

All three input files (dat_input, par_input, set_input) and model-specific set mappings are required. time_steps is required for intertemporal models.
5.2 Input files
Compatible input files are currently limited to those produced by the Global Trade Analysis Project (GTAP). Although the most recent database releases are proprietary, GTAP has consistently made databases freely available once they are two versions behind the current release (the GTAP 10 database, as of the current GTAP 12 release).
To access the freely available database, users need to register with GTAP and download the “FlexAgg” format. For GTAP 10 this is available under the “GDyn Data Base” entry in the GTAP 10 “Satellite Data and Utilities” section of the GTAP website. For paying GTAP members, the current teems version can handle the “FlexAgg” format for GTAP Databases 10 and 11.
The following command loads data consistent with the standard GTAP model, using broad aggregations for regions, commodities, and endowments. The paths shown (e.g., "path/to/flexAgg11c17/gsdfdat.har") are placeholders; users need to supply the actual location of their data.
v7_data <- ems_data(dat_input = "path/to/flexAgg11c17/gsdfdat.har", # basedata coefficients
                    par_input = "path/to/flexAgg11c17/gsdfpar.har", # parameter coefficients
                    set_input = "path/to/flexAgg11c17/gsdfset.har", # set elements
                    REG = "big3",          # region mapping
                    COMM = "macro_sector", # commodity mapping
                    ACTS = "macro_sector", # activity mapping
                    ENDW = "labor_diff"    # endowment mapping
                    )

Note that the actual names of the HAR data files vary according to the GTAP release.
| Input Type | GTAP v9 | GTAP v10 | GTAP v11 |
|---|---|---|---|
| dat_input | gddat.har | gsddat.har | gsdfdat.har |
| par_input | gdpar.har | gsdpar.har | gsdfpar.har |
| set_input | gdset.har | gsdset.har | gsdfset.har |
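Because the file names differ by release, it can be convenient to build the three paths from a single directory and confirm they exist before calling ems_data(). The following is a minimal sketch only: the helper gtap_har_paths() and its arguments are hypothetical and not part of teems, and the file name prefixes are simply those listed in the table above.

```r
# Hypothetical helper (not part of teems): build HAR paths for a GTAP release.
# File name prefixes follow the table above.
gtap_har_paths <- function(dir, release = c("v9", "v10", "v11")) {
  release <- match.arg(release)
  prefix <- c(v9 = "gd", v10 = "gsd", v11 = "gsdf")[[release]]
  paths <- file.path(dir, paste0(prefix, c("dat", "par", "set"), ".har"))
  names(paths) <- c("dat_input", "par_input", "set_input")
  paths
}

# Placeholder directory; replace with the actual location of the data
paths <- gtap_har_paths("path/to/flexAgg11c17", release = "v11")

# Report any missing files before passing the paths to ems_data()
missing_files <- paths[!file.exists(paths)]
if (length(missing_files) > 0) {
  message("Missing input files: ", paste(missing_files, collapse = ", "))
}
```

The named elements of paths can then be supplied as the dat_input, par_input, and set_input arguments.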
5.3 Set mappings
Set mappings aggregate the fine-grained regions, sectors, and endowments in the GTAP database down to the resolution required for a given model run. The set names used in ... depend on the data format — see the Loading and specifying sets chapter for the full list of internal mappings and instructions for supplying custom CSV mappings.
5.4 Auxiliary inputs
With more complex models it is often necessary to load auxiliary data or directly modify existing database headers. The ems_aux() function prepares auxiliary inputs for injection via the aux_input argument. See the dedicated Auxiliary data chapter for full documentation, including input formats, type = "set" for new set creation, replacement vs. novel header behaviour, and error/warning reference.
5.5 Time steps
For temporally dynamic models (e.g., GTAP-INT, GTAP-RE), time steps must be provided. They consist of the initial period t0 followed by each subsequent step. Time steps can be supplied either as year increments from t0 (with t0 given as 0) or as chronological years. If chronological years are used, t0 must correspond to the reference year of the database being used.
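The relationship between the two representations can be sketched in a few lines of base R. This is an illustration only; the reference year of 2014 is an assumed value chosen for the example, and the variable names are hypothetical.

```r
# Reference year of the database (assumed to be 2014 for this illustration)
t0 <- 2014

# Time steps expressed as year increments from t0
increments <- c(0, 1, 2, 3, 4, 6, 8, 10, 12, 14, 16)

# Equivalent chronological years
years <- t0 + increments

# Converting back: subtract the first chronological year
recovered <- years - years[1]

# When chronological years are supplied, the first one must equal
# the reference year of the database
t0_matches <- years[1] == t0
```

Either increments or years describes the same sequence of periods, provided the first chronological year equals the database reference year.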
Explicit time steps, given as year increments from t0. Because the reference year of the loaded data inputs is 2014, these are equivalent to c(2014, 2015, 2016, 2017, 2018, 2020, 2022, 2024, 2026, 2028, 2030):

data <- ems_data(dat_input = "path/to/flexagg10AY14/gsddat.har",
                 par_input = "path/to/flexagg10AY14/gsdpar.har",
                 set_input = "path/to/flexagg10AY14/gsdset.har",
                 REG = "WB23",
                 TRAD_COMM = "services",
                 ENDW_COMM = "labor_agg",
                 time_steps = c(0, 1, 2, 3, 4, 6, 8, 10, 12, 14, 16)
                 )

Chronological time steps (note the reference year of the input data):
data <- ems_data(dat_input = "path/to/flexAgg11c17/gsdfdat.har",
                 par_input = "path/to/flexAgg11c17/gsdfpar.har",
                 set_input = "path/to/flexAgg11c17/gsdfset.har",
                 REG = "R32",
                 COMM = "medium",
                 ACTS = "medium",
                 ENDW = "labor_diff",
                 time_steps = c(2017, 2018, 2020, 2022, 2024, 2026, 2028, 2030)
                 )

5.6 Converting formats
GTAP databases are available in the classic v6.2 and standard v7.0 formats, corresponding to the classic and standard GTAP models. To use a classic-based model with newer GTAP databases, or vice versa, set target_format to convert the underlying database. Set mappings must be specified according to the model to be used, not the original format of the input data.
GTAP 11 database converted to v6.2 data format:
v6_data <- ems_data(dat_input = "path/to/flexAgg11c17/gsdfdat.har",
                    par_input = "path/to/flexAgg11c17/gsdfpar.har",
                    set_input = "path/to/flexAgg11c17/gsdfset.har",
                    REG = "big3",
                    TRAD_COMM = "macro_sector",
                    ENDW_COMM = "labor_agg",
                    target_format = "GTAPv6"
                    )

GTAP 10 v6.2 database converted to v7.0 data format:
v7_data <- ems_data(dat_input = "path/to/flexagg10AY14/gsddat.har",
                    par_input = "path/to/flexagg10AY14/gsdpar.har",
                    set_input = "path/to/flexagg10AY14/gsdset.har",
                    REG = "AR5",
                    COMM = "macro_sector",
                    ACTS = "macro_sector",
                    ENDW = "labor_agg",
                    target_format = "GTAPv7"
                    )

5.7 Data aggregation
Data is aggregated according to type: non-parameter coefficients are summed according to the target set mappings, and weighted averages are calculated for the parameters listed below. A simple mean is applied to parameters not listed. If custom parameter values are desired, they can be loaded in their final format into the ems_model() function, in which case no aggregation or modification takes place. In the notation below, a hat (ˆ) indicates the new parameter value and an asterisk indicates the destination set mapping.
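The two unweighted rules described above (summation for non-parameter coefficients, simple mean for parameters without a listed weight) can be illustrated with base R. This is a sketch under stated assumptions, not the internal teems implementation: the values, region names, and mapping are invented for illustration.

```r
# Toy data (illustrative values only): a coefficient indexed by region
value <- c(usa = 10, can = 4, fra = 6, deu = 8)

# Hypothetical mapping from original regions to destination regions
mapping <- c(usa = "namerica", can = "namerica", fra = "europe", deu = "europe")

# Non-parameter coefficients: summed within each destination element
agg_sum <- tapply(value, mapping[names(value)], sum)

# Parameters without a listed weight: simple mean within each destination element
param <- c(usa = 2.0, can = 3.0, fra = 1.0, deu = 5.0)
agg_mean <- tapply(param, mapping[names(param)], mean)
```

Here agg_sum and agg_mean each contain one entry per destination region.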
5.7.1 GTAP v6.2 format parameter weight methodology
Headers, descriptions, and index ranges for parameters and associated data within the GTAP v6.2 model. GTAP-INT weights are identical to GTAP v6.2 with the addition of an invariant time set.
| Symbol | Header | Description | Set Index |
|---|---|---|---|
| Param. | | | |
| \(\sigma_\text{d}\) | ESBD | Armington CES for dom./imp. allocation | \(i \in \text{TRAD\_COMM}\) |
| \(\sigma_\text{m}\) | ESBM | Armington CES for regional allocation of imports | \(i \in \text{TRAD\_COMM}\) |
| \(\sigma_\text{va}\) | ESBV | CES between primary factors in production | \(j \in \text{PROD\_COMM}\) |
| \(\gamma\) | INCP | CDE expansion parameter | \(i \in \text{TRAD\_COMM},\ r \in \text{REG}\) |
| \(\beta\) | SUBP | CDE substitution parameter | \(i \in \text{TRAD\_COMM},\ r \in \text{REG}\) |
| Data | | | |
| | EVFA | Endowments – Firms’ purchases at agents’ prices | \(i \in \text{ENDW\_COMM},\ j \in \text{PROD\_COMM},\ r \in \text{REG}\) |
| | VDFA | Inter. – Firms’ dom. purchases at agents’ prices | \(i \in \text{TRAD\_COMM},\ j \in \text{PROD\_COMM},\ r \in \text{REG}\) |
| | VDGA | Inter. – Government dom. purchases at agents’ prices | \(i \in \text{TRAD\_COMM},\ r \in \text{REG}\) |
| | VDPA | Inter. – Household dom. purchases at agents’ prices | \(i \in \text{TRAD\_COMM},\ r \in \text{REG}\) |
| | VIFA | Inter. – Firms’ imp. at agents’ prices | \(i \in \text{TRAD\_COMM},\ j \in \text{PROD\_COMM},\ r \in \text{REG}\) |
| | VIGA | Inter. – Government imp. at agents’ prices | \(i \in \text{TRAD\_COMM},\ r \in \text{REG}\) |
| | VIPA | Inter. – Household imp. at agents’ prices | \(i \in \text{TRAD\_COMM},\ r \in \text{REG}\) |
5.7.1.1 Aggregation methods
\[ \hat{\sigma}_{\text{d}_i} = \frac{ \sum_{i}^{i^*} \left( \sigma_{\text{d}_i} \sum_r (vdpa_{i,r} + vdga_{i,r} + vdfa_{i,r} + vipa_{i,r} + viga_{i,r} + vifa_{i,r}) \right) } { \sum_{i,r}^{i^*} (vdpa_{i,r} + vdga_{i,r} + vdfa_{i,r} + vipa_{i,r} + viga_{i,r} + vifa_{i,r}) } \tag{5.1}\]
\[ \hat{\sigma}_{\text{m}_i} = \frac{ \sum_{i}^{i^*} \left( \sigma_{\text{m}_i} \sum_r (vipa_{i,r} + viga_{i,r} + vifa_{i,r}) \right) } { \sum_{i,r}^{i^*} (vipa_{i,r} + viga_{i,r} + vifa_{i,r}) } \tag{5.2}\]
\[ \hat{\sigma}_{\text{va}_j} = \begin{cases} \displaystyle \frac{ \sum_{j}^{j^*} \left( \sigma_{\text{va}_j} \sum_r evfa_{j,r} \right) }{ \sum_{j,r}^{j^*} evfa_{j,r} } & \text{if } j \neq cgds \\ \sigma_{\text{va}_j} & \text{if } j = cgds \end{cases} \tag{5.3}\]
\[ \hat{\gamma}_{i,r} = \frac{ \sum_{i,r}^{i^* r^*} \left( \gamma_{i,r} \sum_{i,r} (vdpa_{i,r} + vipa_{i,r}) \right) }{ \sum_{i,r}^{i^* r^*} (vdpa_{i,r} + vipa_{i,r}) } \tag{5.4}\]
\[ \hat{\beta}_{i,r} = \frac{ \sum_{i,r}^{i^* r^*} \left( \beta_{i,r} \sum_{i,r} (vdpa_{i,r} + vipa_{i,r}) \right) }{ \sum_{i,r}^{i^* r^*} (vdpa_{i,r} + vipa_{i,r}) } \tag{5.5}\]
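As a numerical illustration of the weighted averages above, the following base R sketch applies the pattern of Equation 5.1 to invented data. It is not the teems implementation: the commodity names, values, and mapping are hypothetical, and the weight vector stands in for the purchases already summed over regions and agents.

```r
# Toy inputs (illustrative values only): sigma_d by original commodity
sigma_d <- c(wheat = 2.0, rice = 4.0, steel = 3.0)

# Weight by commodity: purchases summed over regions and agents, i.e. the
# sum over r of (vdpa + vdga + vdfa + vipa + viga + vifa)
weight <- c(wheat = 30, rice = 10, steel = 50)

# Hypothetical mapping of original commodities to destination commodities
mapping <- c(wheat = "food", rice = "food", steel = "mnfc")

# Numerator: weighted parameter values summed within each destination commodity
num <- tapply(sigma_d * weight, mapping[names(sigma_d)], sum)

# Denominator: weights summed within each destination commodity
den <- tapply(weight, mapping[names(weight)], sum)

# Weighted average per destination commodity
sigma_d_hat <- num / den
```

For the "food" aggregate this gives (2.0 × 30 + 4.0 × 10) / (30 + 10) = 2.5, which sits closer to the value for wheat because wheat carries the larger weight. A destination commodity with a single member, such as "mnfc" here, keeps its original value.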
5.7.2 GTAP v7.0 format parameter weight methodology
Headers, descriptions, and index ranges for parameters and associated data within the GTAP v7 model. GTAP-RE weights are identical to GTAP v7.0 with the addition of an invariant time set.
| Symbol | Header | Description | Set Index |
|---|---|---|---|
| Param. | | | |
| \(\sigma_\text{d}\) | ESBD | Armington CES for domestic/imported allocation | \(c \in \text{COMM}\) |
| \(\sigma_\text{m}\) | ESBM | Armington CES for regional allocation of imports | \(c \in \text{COMM}\) |
| \(\sigma_\text{va}\) | ESBV | CES between primary factors in production | \(a \in \text{ACTS},\ r \in \text{REG}\) |
| \(\gamma\) | INCP | CDE expansion parameter | \(c \in \text{COMM},\ r \in \text{REG}\) |
| \(\beta\) | SUBP | CDE substitution parameter | \(c \in \text{COMM},\ r \in \text{REG}\) |
| Data | | | |
| | EVFP | Primary factor purchases at purchasers’ prices | \(e \in \text{ENDW},\ a \in \text{ACTS},\ r \in \text{REG}\) |
| | VDFP | Domestic purchases by firms at purchasers’ prices | \(c \in \text{COMM},\ a \in \text{ACTS},\ r \in \text{REG}\) |
| | VDGP | Domestic purchases by government at purchasers’ prices | \(c \in \text{COMM},\ r \in \text{REG}\) |
| | VDPP | Domestic purchases by households at purchasers’ prices | \(c \in \text{COMM},\ r \in \text{REG}\) |
| | VMFP | Import purchases by firms at purchasers’ prices | \(c \in \text{COMM},\ a \in \text{ACTS},\ r \in \text{REG}\) |
| | VMGP | Import purchases by government at purchasers’ prices | \(c \in \text{COMM},\ r \in \text{REG}\) |
| | VMPP | Import purchases by households at purchasers’ prices | \(c \in \text{COMM},\ r \in \text{REG}\) |
5.7.2.1 Aggregation methods
\[ \hat{\sigma}_{\text{d}_{c}} = \frac{\sum_{c}^{c^{*}}\left(\sigma_{\text{d}_{c}}\sum_{r} (vdpp_{c,r} + vmpp_{c,r} + vdgp_{c,r} + vmgp_{c,r} + vdfp_{c,r} + vmfp_{c,r})\right)} {\sum_{c,r}^{c^{*}} (vdpp_{c,r} + vmpp_{c,r} + vdgp_{c,r} + vmgp_{c,r} + vdfp_{c,r} + vmfp_{c,r})} \tag{5.6}\]
\[ \hat{\sigma}_{\text{m}_{c}} = \frac{\sum_{c}^{c^{*}}\left(\sigma_{\text{m}_{c}}\sum_{r} (vmpp_{c,r} + vmgp_{c,r} + vmfp_{c,r})\right)} {\sum_{c,r}^{c^{*}} (vmpp_{c,r} + vmgp_{c,r} + vmfp_{c,r})} \tag{5.7}\]
\[ \hat{\sigma}_{\text{va}_{a}} = \frac{\sum_{a}^{a^{*}}\left(\sigma_{\text{va}_{a}}\sum_{r} evfp_{a,r}\right)} {\sum_{a,r}^{a^{*}} evfp_{a,r}} \tag{5.8}\]
\[ \hat{\gamma}_{c,r} = \frac{\sum_{c,r}^{c^{*}r^{*}}\left(\gamma_{c,r}\sum_{c,r} (vdpp_{c,r} + vmpp_{c,r})\right)} {\sum_{c,r}^{c^{*}r^{*}} (vdpp_{c,r} + vmpp_{c,r})} \tag{5.9}\]
\[ \hat{\beta}_{c,r} = \frac{\sum_{c,r}^{c^{*}r^{*}}\left(\beta_{c,r}\sum_{c,r} (vdpp_{c,r} + vmpp_{c,r})\right)} {\sum_{c,r}^{c^{*}r^{*}} (vdpp_{c,r} + vmpp_{c,r})} \tag{5.10}\]