With this package you can fit a curve to your data. It is a wrapper
around the optim()
function from R’s stats package and
offers an easy-to-use interface to data in magclass format. Use the main
function emulator()
of this package to fit a curve of the
form you can provide. In this example we use
y = a + b * x ^c
, where x
and y
are the data you provide and a
,b
, and
c
are the coefficients the emulator()
function
will calculate using the optim()
function from R’s stats
package. The `emulator()´ function generates a separate fit for each
region and each year. The resulting curves together with the raw data
will be plotted and saved to figures and compiled in a pdf. See at the
bottom how to define your own functional form.
The raw input data for which the `emulator()´ function will calculate the fits needs to be provided in the magclass format. It needs to have at least three variables:
x
and y
in the
fitThe emulator()
function requires the data to be of a
particular format. There are two alternative formats allowed. The
following Examples section explains the differences and gives examples
how to apply the emulator function.
In some cases MAgPIE gets infeasible in certain years. Since the data
from this year and the years after is not meaningful the
emulator()
function offers the option to ignore these data.
To do so it needs information about which years are infeasible. Our
example data already contains this information for the region
GLO
since MAgPIE does solve all regions at once and not
individually. However, the emulator()
function performs the
fits for each year and each region individually. Therefore, we need to
transfer the global modelstatus from GLO
to all
regions.
library(remulator)
library(magclass)
emudata[,,"Modelstatus (-)"] <- emudata["GLO",,"Modelstatus (-)"]
This requires the data to have the structure as displayed below: region, year, scenario, model, variable.
str(emudata)
# Formal class 'magpie' [package "magclass"] with 1 slot
# ..@ .Data: num [1:11, 1:11, 1:438] 0.228 NA NA NA 6.439 ...
# .. ..- attr(*, "dimnames")=List of 3
# .. .. ..$ region : chr [1:11] "AFR" "CPA" "EUR" "FSU" ...
# .. .. ..$ year : chr [1:11] "y2005" "y2015" "y2025" "y2035" ...
# .. .. ..$ scenario.model.variable: chr [1:438] "CDL_base-base-1.MAgPIE.Primary Energy|Biomass|Energy Crops (EJ/yr)" "CDL_base-base-10.MAgPIE.Primary Energy|Biomass|Energy Crops (EJ/yr)" "CDL_base-base-11.MAgPIE.Primary Energy|Biomass|Energy Crops (EJ/yr)" "CDL_base-base-12.MAgPIE.Primary Energy|Biomass|Energy Crops (EJ/yr)" ...
Note: the data has no “sample” dimension (see alternative B), just scenario, model, and variable. All scenarios in your data will be lumped together and the fit will be performed on all data points from all scenarios.
When calling the emulator()
funtion you need to
provide
x
and y
in the fit, andThere are more arguments to the emulator()
function, all
of which have defaults (please see ?emulator
). All results
will be saved to output_path
(which is “emulator” by
default). Optionally you can pass further arguments to the
optim()
function, such as lower or upper bounds on the fit
coefficients. The emulator()
function performs the
following steps:
y = a + b * x ^c
, where x
and y
are the data you provide and a
,b
, and
c
are the coefficients the emulator()
function
will calculate using the optim()
function from R’s stats
packagecalc_pdf=TRUE
)If your data contains the additional dimension named “sample”, the
emulator()
function will distinguish the data by the
scenario names and perform the fit based on the data points of all
samples per scenario. Thus, there will be fits for each scenario. This
requires the data to have the following structure including a “sample”
dimension
# Formal class 'magpie' [package "magclass"] with 1 slot
# ..@ .Data: num [1:11, 1:11, 1:21462] 1.013 0.119 0.0419 0.1572 0.3047 ...
# .. ..- attr(*, "dimnames")=List of 3
# .. .. ..$ region : chr [1:11] "AFR" "CPA" "EUR" "FSU" ...
# .. .. ..$ year : chr [1:11] "y2005" "y2015" "y2025" "y2035" ...
# .. .. ..$ scenario.sample.model.variable: chr [1:21462] "CDL_base-base.1.MAgPIE.Emissions|BC|Land Use (Mt BC/yr)"
We use the data from above that has no native sample dimension and add the missing sample dimension by separating the numbers from the scenario names and putting them into the new “sample” dimension. This results in two scenarios with 73 samples each.
Magclass objects distinguish dimensions by dots in the variable names. Thus, we add a new dimension to magclass object by inserting a dot. In this case we do this by replacing the dash in the scenario name with a dot, e.g. from “-63” to “.63” (see the gsub command below).
library(magclass)
# Add dimension by replacing dashes with dots
getNames(emudata,dim="scenario") <-
gsub("-([0-9]{1,2}$)",".\\1",getNames(emudata,dim="scenario"))
# Naming the dimensions
getSets(emudata) <- c("region","year","scenario","sample","model","variable")
We call the emulator()
function the same way (except
that for our example we choose a different output directory). The
function will automatically detect the additional sample dimension and
perform all procedures for each scenario:
emulator(data=emudata,
name_x="Primary Energy|Biomass|Energy Crops (EJ/yr)",
name_y="Price|Primary Energy|Biomass (US$2005/GJ)",
name_modelstat="Modelstatus (-)",
treat_as_feasible = c(2,7),
output_path = "multi_case",
lower=c(0,0,1))
The outputs from alternative A and B are different: the output of A (single case) consists of only one scenario named “default”. Since in B the fits have been performed separately for the two scenarios the output folder contains two sub folders, one for each scenario.
By default the function that is fitted to the data points is
y = a + b * x ^c
. You can define your own function and pass
it to the emulator()
function. Please use the following
syntax to define your function: each fit coefficient is given by
param[[i]]
with i
being the ordinal number of
the coefficient (starting with 1). The definition of the default
function y = a + b * x ^c
looks as follows:
A linear fit would look like
Then pass your function to the emulator()
function.
NOTE: If you provide your own function you need to provide an initial
value for each of the fit coefficients using the parameter
initial_values
, in this case with two parameters
initial_values=c(0,1)
!
In some cases the fitting algorithm fails and generates flat fits. To
replace these flat fits with fits from other years use the
replace_flat_fits()
function. The example below shows how
to apply the function:
setwd("~/Documents/0_GIT/magpie4-model/")
library(magclass)
library(remulator)
replace_flat_fits("output/emulator/SSP2-BASE-Base/linear/data_postfit_SSP2-BASE-Base.Rdata")
replace_flat_fits("output/emulator/SSP2-26/linear/data_postfit_SSP2-26.Rdata")
replace_flat_fits("output/emulator/SSP2-Ref/linear/data_postfit_SSP2-Ref.Rdata")