5  Loading data

5.1 Overview

The ems_data() function loads and prepares GTAP database files for use in CGE model runs. It handles the three core GTAP data files (dat, par, and set) and can optionally convert between GTAP v6.2 and v7.0 formats. This function is also used to input any time steps (for temporally dynamic models), load set mappings for sets that are read into the model (i.e., not constructed from set operations), and incorporate auxiliary data via ems_aux().

ems_data(dat_input, par_input, set_input, time_steps = NULL,
         aux_input = NULL, target_format = NULL, ...)
| Argument | Type | Description |
|---|---|---|
| dat_input | Character | Path to a HAR file containing basedata coefficient data |
| par_input | Character | Path to a HAR file containing parameter coefficient data |
| set_input | Character | Path to a HAR file containing set elements and attributes |
| time_steps | Integer vector or NULL | Time steps for intertemporal models (see Time steps) |
| aux_input | List or NULL | Auxiliary data created with ems_aux() (see Auxiliary inputs) |
| target_format | Character or NULL | Target data format conversion: "GTAPv6" or "GTAPv7" |
| ... | Named arguments | Set mappings for read-in sets (see Loading and specifying sets) |

All three input files (dat_input, par_input, set_input) and model-specific set mappings are required. time_steps is required for intertemporal models.
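
Because all three files are required, it can help to confirm that they exist before starting a load. The following is a minimal sketch in base R; `check_inputs()` is a hypothetical helper for illustration and is not part of teems.

```r
# Hypothetical pre-flight check (not part of teems): confirm that the three
# required HAR files exist before calling ems_data().
check_inputs <- function(dat_input, par_input, set_input) {
  paths <- c(dat_input = dat_input, par_input = par_input, set_input = set_input)
  found <- file.exists(paths)
  if (!all(found)) {
    stop("Missing input file(s): ",
         paste(names(paths)[!found], collapse = ", "))
  }
  # Return the named paths invisibly so they can be reused by the caller.
  invisible(paths)
}
```

Calling `check_inputs()` with the same three paths that will be passed to ems_data() stops early with the name of any argument whose file cannot be found.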

5.2 Input files

Compatible input files are currently limited to those produced by the Global Trade Analysis Project (GTAP). Although the most recent database releases are proprietary, GTAP has consistently made the database two versions behind the current release freely available (the GTAP 10 database as of the current GTAP 12 release).

To access the freely available database, users need to register with GTAP and download the “FlexAgg” format. For GTAP 10, this is available under the “GDyn Data Base” entry in the GTAP 10 “Satellite Data and Utilities” section. For paying GTAP members, the current teems version can handle the “FlexAgg” format for GTAP Databases 10 and 11.

The following command loads data consistent with the standard GTAP model, using broad aggregations for regions, commodities, and endowments. The file paths are placeholders: replace "path/to/flexAgg11c17/" with the location of the data on your system.

v7_data <- ems_data(dat_input = "path/to/flexAgg11c17/gsdfdat.har", # basedata coefficients
                    par_input = "path/to/flexAgg11c17/gsdfpar.har", # parameter coefficients
                    set_input = "path/to/flexAgg11c17/gsdfset.har", # set elements
                    REG = "big3",
                    COMM = "macro_sector",
                    ACTS = "macro_sector",
                    ENDW = "labor_diff"
)

Note that the actual names of the HAR data files vary according to the GTAP release.

| Input Type | GTAP v9 | GTAP v10 | GTAP v11 |
|---|---|---|---|
| dat_input | gddat.har | gsddat.har | gsdfdat.har |
| par_input | gdpar.har | gsdpar.har | gsdfpar.har |
| set_input | gdset.har | gsdset.har | gsdfset.har |
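
The file names in the table above follow a regular pattern, so the three paths can be built from a release label and a directory. The sketch below is illustrative only: `gtap_har_paths()` is a hypothetical helper (not a teems function), and the directory is a placeholder.

```r
# Hypothetical helper (not part of teems): build the three HAR paths for a
# GTAP release from the file names listed in the table above.
gtap_har_paths <- function(dir, release = c("v11", "v10", "v9")) {
  release <- match.arg(release)
  # File-name prefix used by each release.
  prefix <- switch(release, v9 = "gd", v10 = "gsd", v11 = "gsdf")
  paths <- file.path(dir, paste0(prefix, c("dat", "par", "set"), ".har"))
  names(paths) <- c("dat_input", "par_input", "set_input")
  paths
}

# Placeholder directory; replace with the location of the data.
paths <- gtap_har_paths("path/to/flexAgg11c17", release = "v11")
```

The elements of the named vector can then be supplied individually, for example `dat_input = paths[["dat_input"]]`.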

5.3 Set mappings

Set mappings aggregate the fine-grained regions, sectors, and endowments in the GTAP database down to the resolution required for a given model run. The set names used in ... depend on the data format — see the Loading and specifying sets chapter for the full list of internal mappings and instructions for supplying custom CSV mappings.

5.4 Auxiliary inputs

With more complex models it is often necessary to load auxiliary data or directly modify existing database headers. The ems_aux() function prepares auxiliary inputs for injection via the aux_input argument. See the dedicated Auxiliary data chapter for full documentation, including input formats, type = "set" for new set creation, replacement vs. novel header behaviour, and error/warning reference.

5.5 Time steps

For temporally dynamic models (e.g., GTAP-INT, GTAP-RE), time steps must be provided, consisting of t0 followed by the year steps from t0. Time steps can be supplied either as year increments relative to t0 or as chronological years. If chronological years are used, t0 must correspond to the reference year of the database being used.

Time steps given as year increments from t0. With the 2014 reference year of the loaded data inputs, these are equivalent to c(2014, 2015, 2016, 2017, 2018, 2020, 2022, 2024, 2026, 2028, 2030):

data <- ems_data(dat_input = "path/to/flexagg10AY14/gsddat.har",
                 par_input = "path/to/flexagg10AY14/gsdpar.har",
                 set_input = "path/to/flexagg10AY14/gsdset.har",
                 REG = "WB23",
                 TRAD_COMM = "services",
                 ENDW_COMM = "labor_agg",
                 time_steps = c(0, 1, 2, 3, 4, 6, 8, 10, 12, 14, 16)
)

Chronological time steps (note that the first step, 2017, matches the reference year of the input data):

data <- ems_data(dat_input = "path/to/flexAgg11c17/gsdfdat.har",
                 par_input = "path/to/flexAgg11c17/gsdfpar.har",
                 set_input = "path/to/flexAgg11c17/gsdfset.har",
                 REG = "R32",
                 COMM = "medium",
                 ACTS = "medium",
                 ENDW = "labor_diff",
                 time_steps = c(2017, 2018, 2020, 2022, 2024, 2026, 2028, 2030)
)
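
The relationship between the two representations is plain arithmetic on the reference year. The sketch below uses the values from the first example above; `to_offsets()` is a hypothetical helper and does not describe how teems processes time_steps internally.

```r
# Reference year of the loaded database (2014 in the first example above).
reference_year <- 2014

# Year increments from t0, as passed to time_steps in the first example.
offsets <- c(0, 1, 2, 3, 4, 6, 8, 10, 12, 14, 16)

# Increments -> chronological years.
years <- reference_year + offsets

# Chronological years -> increments. t0 must equal the reference year.
to_offsets <- function(years, reference_year) {
  stopifnot(years[1] == reference_year)
  years - reference_year
}

to_offsets(years, reference_year)
```

A chronological vector whose first element differs from the reference year fails the check, mirroring the requirement stated above.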

5.6 Converting formats

GTAP databases are available in the classic v6.2 and standard v7.0 formats, corresponding to the classic and standard GTAP models. To use a classic-based model with newer GTAP databases, or vice versa, set target_format to convert the underlying database. Note that set mappings must be specified according to the model to be used, not the original format of the input data.

GTAP 11 database converted to v6.2 data format:

v6_data <- ems_data(dat_input = "path/to/flexAgg11c17/gsdfdat.har",
                    par_input = "path/to/flexAgg11c17/gsdfpar.har",
                    set_input = "path/to/flexAgg11c17/gsdfset.har",
                    REG = "big3",
                    TRAD_COMM = "macro_sector",
                    ENDW_COMM = "labor_agg",
                    target_format = "GTAPv6"
)

GTAP 10 v6.2 database converted to v7.0 data format:

v7_data <- ems_data(dat_input = "path/to/flexagg10AY14/gsddat.har",
                    par_input = "path/to/flexagg10AY14/gsdpar.har",
                    set_input = "path/to/flexagg10AY14/gsdset.har",
                    REG = "AR5",
                    COMM = "macro_sector",
                    ACTS = "macro_sector",
                    ENDW = "labor_agg",
                    target_format = "GTAPv7"
)
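
Since set mappings follow the target model rather than the input files, a quick check of the mapping names against the intended format can catch mismatches. The sketch below is an assumption-laden illustration: the set names are only those that appear in this chapter's examples (see the Loading and specifying sets chapter for the full list), and `check_set_names()` is a hypothetical helper, not a teems function.

```r
# Set-mapping argument names as they appear in this chapter's examples.
# Assumption: this is not necessarily the complete list for either format.
set_names_by_format <- list(
  GTAPv6 = c("REG", "TRAD_COMM", "ENDW_COMM"),
  GTAPv7 = c("REG", "COMM", "ACTS", "ENDW")
)

# Hypothetical check: do the supplied mapping names match the model format?
check_set_names <- function(mappings, format) {
  expected <- set_names_by_format[[format]]
  absent <- setdiff(expected, names(mappings))
  unexpected <- setdiff(names(mappings), expected)
  list(ok = length(absent) == 0 && length(unexpected) == 0,
       absent = absent, unexpected = unexpected)
}

# GTAP 11 inputs converted for a classic (v6.2) model use v6.2 set names.
check_set_names(list(REG = "big3", TRAD_COMM = "macro_sector",
                     ENDW_COMM = "labor_agg"), format = "GTAPv6")
```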

5.7 Data aggregation

Data are aggregated according to type: non-parameter coefficients are summed according to the target set mappings, and weighted averages are calculated for the parameters listed below. A simple mean is applied to parameters not listed below. If custom parameter values are desired, these can be loaded in their final format into the ems_model() function, and no aggregation or modification will take place. In the notation below, a hat (ˆ) indicates the new parameter value and an asterisk (*) indicates the destination set mapping.
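
The two simplest rules, summation for non-parameter coefficients and a simple mean for unlisted parameters, can be illustrated with base R. This is a sketch on made-up numbers with a hypothetical commodity mapping (`comm_map`). It is not the teems implementation.

```r
# Toy basedata header indexed by commodity and region (values are made up).
dat <- data.frame(
  COMM  = c("rice", "wheat", "steel", "rice", "wheat", "steel"),
  REG   = c("usa", "usa", "usa", "chn", "chn", "chn"),
  value = c(10, 30, 60, 20, 20, 40)
)

# Hypothetical mapping from original commodities to destination elements.
comm_map <- c(rice = "food", wheat = "food", steel = "mnfc")

# Non-parameter coefficients: sum the origin elements within each target.
dat$COMM_agg <- unname(comm_map[as.character(dat$COMM)])
agg <- aggregate(value ~ COMM_agg + REG, data = dat, FUN = sum)

# Unlisted parameters: simple (unweighted) mean within each target element.
par <- c(rice = 2, wheat = 4, steel = 3)
par_mean <- tapply(par, comm_map[names(par)], mean)
```

Summation preserves the total value of the header, whereas the simple mean ignores the relative size of the elements being merged, which is why the listed parameters use weighted averages instead.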

5.7.1 GTAP v6.2 format parameter weight methodology

Headers, descriptions, and index ranges for parameters and associated data within the GTAP v6.2 model. GTAP-INT weights are identical to GTAP v6.2 with the addition of an invariant time set.

| | Header | Description | Set Index |
|---|---|---|---|
| Param. | | | |
| \(\sigma_\text{d}\) | ESBD | Armington CES for dom./imp. allocation | \(i \in \text{TRAD\_COMM}\) |
| \(\sigma_\text{m}\) | ESBM | Armington CES for regional allocation of imports | \(i \in \text{TRAD\_COMM}\) |
| \(\sigma_\text{va}\) | ESBV | CES between primary factors in production | \(j \in \text{PROD\_COMM}\) |
| \(\gamma\) | INCP | CDE expansion parameter | \(i \in \text{TRAD\_COMM},\ r \in \text{REG}\) |
| \(\beta\) | SUBP | CDE substitution parameter | \(i \in \text{TRAD\_COMM},\ r \in \text{REG}\) |
| Data | | | |
| | EVFA | Endowments – Firms’ purchases at agents’ prices | \(i \in \text{ENDW\_COMM},\ j \in \text{PROD\_COMM},\ r \in \text{REG}\) |
| | VDFA | Inter. – Firms’ dom. purchases at agents’ prices | \(i \in \text{TRAD\_COMM},\ j \in \text{PROD\_COMM},\ r \in \text{REG}\) |
| | VDGA | Inter. – Government dom. purchases at agents’ prices | \(i \in \text{TRAD\_COMM},\ r \in \text{REG}\) |
| | VDPA | Inter. – Household dom. purchases at agents’ prices | \(i \in \text{TRAD\_COMM},\ r \in \text{REG}\) |
| | VIFA | Inter. – Firms’ imp. at agents’ prices | \(i \in \text{TRAD\_COMM},\ j \in \text{PROD\_COMM},\ r \in \text{REG}\) |
| | VIGA | Inter. – Government imp. at agents’ prices | \(i \in \text{TRAD\_COMM},\ r \in \text{REG}\) |
| | VIPA | Inter. – Household imp. at agents’ prices | \(i \in \text{TRAD\_COMM},\ r \in \text{REG}\) |

5.7.1.1 Aggregation methods

\[ \hat{\sigma}_{\text{d}_i} = \frac{ \sum_{i}^{i^*} \left( \sigma_{\text{d}_i} \sum_r (vdpa_{i,r} + vdga_{i,r} + vdfa_{i,r} + vipa_{i,r} + viga_{i,r} + vifa_{i,r}) \right) } { \sum_{i,r}^{i^*} (vdpa_{i,r} + vdga_{i,r} + vdfa_{i,r} + vipa_{i,r} + viga_{i,r} + vifa_{i,r}) } \tag{5.1}\]

\[ \hat{\sigma}_{\text{m}_i} = \frac{ \sum_{i}^{i^*} \left( \sigma_{\text{m}_i} \sum_r (vipa_{i,r} + viga_{i,r} + vifa_{i,r}) \right) } { \sum_{i,r}^{i^*} (vipa_{i,r} + viga_{i,r} + vifa_{i,r}) } \tag{5.2}\]
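
As a numerical illustration of Equation 5.2, the sketch below computes an import-weighted average of \(\sigma_\text{m}\) for a hypothetical two-element destination set. All numbers and the mapping `comm_map` are invented for illustration, the import matrix stands in for the sum of VIPA, VIGA, and VIFA, and the code is not the teems implementation.

```r
# Toy import values at agents' prices by commodity (rows) and region (columns).
# Each entry stands for vipa + viga + vifa; numbers are made up.
imports <- matrix(c(10, 30,    # rice
                    20, 40,    # wheat
                    50, 50),   # steel
                  nrow = 3, byrow = TRUE,
                  dimnames = list(c("rice", "wheat", "steel"),
                                  c("usa", "chn")))

esbm <- c(rice = 5, wheat = 10, steel = 4)   # original sigma_m values
comm_map <- c(rice = "food", wheat = "food", steel = "mnfc")  # hypothetical

w <- rowSums(imports)                # inner sum over regions r
target <- comm_map[names(esbm)]      # destination element i* for each i

# Weighted average within each destination element (Equation 5.2).
esbm_hat <- tapply(esbm * w, target, sum) / tapply(w, target, sum)
```

Here wheat carries more import value than rice (60 versus 40), so the aggregated "food" elasticity is pulled toward the wheat value, giving 8 rather than the simple mean of 7.5.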

\[ \hat{\sigma}_{\text{va}_j} = \begin{cases} \displaystyle \frac{ \sum_{j}^{j^*} \left( \sigma_{\text{va}_j} \sum_r evfa_{j,r} \right) }{ \sum_{j,r}^{j^*} evfa_{j,r} } & \text{if } j \neq cgds \\ \sigma_{\text{va}_j} & \text{if } j = cgds \end{cases} \tag{5.3}\]

\[ \hat{\gamma}_{i,r} = \frac{ \sum_{i,r}^{i^* r^*} \left( \gamma_{i,r} \sum_{i,r} (vdpa_{i,r} + vipa_{i,r}) \right) }{ \sum_{i,r}^{i^* r^*} (vdpa_{i,r} + vipa_{i,r}) } \tag{5.4}\]

\[ \hat{\beta}_{i,r} = \frac{ \sum_{i,r}^{i^* r^*} \left( \beta_{i,r} \sum_{i,r} (vdpa_{i,r} + vipa_{i,r}) \right) }{ \sum_{i,r}^{i^* r^*} (vdpa_{i,r} + vipa_{i,r}) } \tag{5.5}\]

5.7.2 GTAP v7.0 format parameter weight methodology

Headers, descriptions, and index ranges for parameters and associated data within the GTAP v7.0 model. GTAP-RE weights are identical to GTAP v7.0 with the addition of an invariant time set.

| | Header | Description | Set Index |
|---|---|---|---|
| Param. | | | |
| \(\sigma_\text{d}\) | ESBD | Armington CES for domestic/imported allocation | \(c \in \text{COMM}\) |
| \(\sigma_\text{m}\) | ESBM | Armington CES for regional allocation of imports | \(c \in \text{COMM}\) |
| \(\sigma_\text{va}\) | ESBV | CES between primary factors in production | \(a \in \text{ACTS},\ r \in \text{REG}\) |
| \(\gamma\) | INCP | CDE expansion parameter | \(c \in \text{COMM},\ r \in \text{REG}\) |
| \(\beta\) | SUBP | CDE substitution parameter | \(c \in \text{COMM},\ r \in \text{REG}\) |
| Data | | | |
| | EVFP | Primary factor purchases at purchasers’ prices | \(e \in \text{ENDW},\ a \in \text{ACTS},\ r \in \text{REG}\) |
| | VDFP | Domestic purchases by firms at purchasers’ prices | \(c \in \text{COMM},\ a \in \text{ACTS},\ r \in \text{REG}\) |
| | VDGP | Domestic purchases by government at purchasers’ prices | \(c \in \text{COMM},\ r \in \text{REG}\) |
| | VDPP | Domestic purchases by households at purchasers’ prices | \(c \in \text{COMM},\ r \in \text{REG}\) |
| | VMFP | Import purchases by firms at purchasers’ prices | \(c \in \text{COMM},\ a \in \text{ACTS},\ r \in \text{REG}\) |
| | VMGP | Import purchases by government at purchasers’ prices | \(c \in \text{COMM},\ r \in \text{REG}\) |
| | VMPP | Import purchases by households at purchasers’ prices | \(c \in \text{COMM},\ r \in \text{REG}\) |

5.7.2.1 Aggregation methods

\[ \hat{\sigma}_{\text{d}_{c}} = \frac{\sum_{c}^{c^{*}}\left(\sigma_{\text{d}_{c}}\sum_{r} (vdpp_{c,r} + vmpp_{c,r} + vdgp_{c,r} + vmgp_{c,r} + vdfp_{c,r} + vmfp_{c,r})\right)} {\sum_{c,r}^{c^{*}} (vdpp_{c,r} + vmpp_{c,r} + vdgp_{c,r} + vmgp_{c,r} + vdfp_{c,r} + vmfp_{c,r})} \tag{5.6}\]

\[ \hat{\sigma}_{\text{m}_{c}} = \frac{\sum_{c}^{c^{*}}\left(\sigma_{\text{m}_{c}}\sum_{r} (vmpp_{c,r} + vmgp_{c,r} + vmfp_{c,r})\right)} {\sum_{c,r}^{c^{*}} (vmpp_{c,r} + vmgp_{c,r} + vmfp_{c,r})} \tag{5.7}\]

\[ \hat{\sigma}_{\text{va}_{a}} = \frac{\sum_{a}^{a^{*}}\left(\sigma_{\text{va}_{a}}\sum_{r} evfp_{a,r}\right)} {\sum_{a,r}^{a^{*}} evfp_{a,r}} \tag{5.8}\]

\[ \hat{\gamma}_{c,r} = \frac{\sum_{c,r}^{c^{*}r^{*}}\left(\gamma_{c,r}\sum_{c,r} (vdpp_{c,r} + vmpp_{c,r})\right)} {\sum_{c,r}^{c^{*}r^{*}} (vdpp_{c,r} + vmpp_{c,r})} \tag{5.9}\]

\[ \hat{\beta}_{c,r} = \frac{\sum_{c,r}^{c^{*}r^{*}}\left(\beta_{c,r}\sum_{c,r} (vdpp_{c,r} + vmpp_{c,r})\right)} {\sum_{c,r}^{c^{*}r^{*}} (vdpp_{c,r} + vmpp_{c,r})} \tag{5.10}\]