How to produce bioenergy supply price curves for REMIND 3.0 using MAgPIE 5.0 (MAgPIE Emulator)

Introduction

In many REMIND mitigation scenarios bioenergy plays a major role in decarbonizing the energy system (e.g. Klein et al. 2013). Thus, the biomass potential and its costs are crucial factors that affect overall mitigation costs (Klein et al. 2013, Rose et al. 2013). In REMIND there are two types of biomass, residues and purpose-grown biomass. Residues are available at a constant price of 1 $/GJ and have a (linearly increasing) maximal global potential of less than 55 EJ/yr. The resource of the more relevant purpose-grown biomass has no explicit upper limit. Its potential and supply price are represented by regional biomass supply curves that provide the bioenergy price as a function of bioenergy demand (Klein et al. 2014). REMIND calculates the costs by integrating the price function. This tutorial describes how to derive these bioenergy supply curves from the global landuse model MAgPIE.

Since the supply curves reflect MAgPIE’s price response to the demand for bioenergy they are called the MAgPIE emulator. Actually, the MAgPIE emulator comprises further MAgPIE data in REMIND representing the landuse system, especially landuse emissions from agriculture and bioenergy production, and agricultural production costs. In contrast to the bioenergy supply curves, which are part of the optimization in REMIND, these further MAgPIE data are given by fixed exogenous trajectories available for a couple of SSP-RCP combinations. A description can be found here. This tutorial focuses on the biomass supply curves only.

By default REMIND uses the aforementioned data (bioenergy price and emissions) that was originally derived form MAgPIE. In most cases and if not explicitly requested, REMIND runs uncoupled to MAgPIE (standalone mode). Therefore there is no feedback from the MAgPIE model to the bioenergy demand and carbon price from REMIND. However, bioenergy prices in MAgPIE are sensitive to the carbon price (Klein et al. 2014). Also, SSP assumptions affect bioenergy prices (e.g. a SSP1 scenario tends to show lower bioenergy prices than a SSP5 scenario). Thus, it would not be appropriate to use a single supply curve in REMIND for different RCPs and SSPs. For this reason, there are different supply curves implemented in REMIND (derived from MAgPIE applying different carbon prices and SSPs), one of which can be chosen according to the the actual SSP-RCP scenario.

To reach a higher consistency between REMIND and MAgPIE regarding bioenergy prices (and landuse emissions), REMIND can also be run in coupled mode iteratively with MAgPIE. In this case MAgPIE runs subsequently to REMIND using the bioenergy demand (and the carbon price) to calculate actual bioenergy prices (and landuse emissions). In the next iteration REMIND uses this price response to update its supply curves by scaling them upwards or downwards (multiplicative adjustment). The iteration is repeated until the bioenergy market is in equilibrium, i.e. until demand and price do not change across iterations. Note that in both cases (stand-alone and coupled) REMIND uses the bioenergy supply curves, which are updated by MAgPIE only in the latter case.

If developments in the MAgPIE model affect bioenergy prices (e.g. by integrating afforestation or by modifying crucial input data such as food demand) it is necessary to derive new bioenergy supply curves (as described below) in order to (i) reflect these changes in REMIND standalone runs or (ii) to provide a better starting point for the convergence in coupled runs.

The method

The bioenergy supply price curves are derived by measuring the price response of the MAgPIE model to 73 different global bioenergy demand scenarios. Each global bioenergy demand scenario yields a time path of bioenergy production and bioenergy prices for all MAgPIE regions. For each region and time step the supply curve is fitted to the resulting 73 combinations of bioenergy production and bioenergy prices.

Methods: Constructing the supply curves
Methods: Constructing the supply curves

Main steps

  • Optional: Prepare input data for MAgPIE (MATLAB on your computer): 73 bioenergy demand scenarios, GHG price scenarios (e.g. No tax, tax 30, …). This step can be skipped for bioenergy demand if you want to use the standard set of input data that is automatically downloaded by the emulator script mentioned below.
  • Start MAgPIE runs on the cluster using the emulator start script that is shipped with MAgPIE. For each GHG price scenario 73 MAgPIE runs with different bioenergy demands are started.
  • Calculate the supply curves using the emulator function of this R package.
  • Feed the resulting files into mrremind to make it part of the REMIND intput data.

These steps are explained in detail below.

1. Perform MAgPIE runs

1.1 Start emulator runs

  • Clone MAgPIE on the cluster
    • git clone https://github.com/magpiemodel/magpie
    • Checkout the appropriate branch (most likely the develop branch, sometimes the rc (=release candidate))
  • In config/scenario_config_emulator.csv characterize the scenarios and set start to 1 for those you want to get supplycurves for (e.g. SSP2). Each line in config/scenario_config_emulator.csv defines one scenario. For each scenario the start script (see below) will launch 73 MAgPIE runs applying 73 different bioenergy demand trajectories. The characteristics of a scenario are defined in the columns of config/scenario_config_emulator.csv, most importantly the MAgPIE basic configuration (mag_scen) and the RCP represented by a carbon price trajectory (co2tax_2025). In detail:
    • title: can be neglected, the name of the output folders will be generated automatically based on the scenario settings
    • start: defines if this scenario will be run (1) or not (0)
    • mag_scen: a combination of built-in pre-defined scenario settings available in MAgPIE, such as SSP, NDC, …
    • co2tax_name: the name that will be used to name the output
    • co2tax_2025: the name of a REMIND mif file containing the GHG prices that will be applied for this scenario. IMPORTANT: You need to copy the mif files to the MAgPIE main folder before starting the MAgPIE runs!. E.g. a file like REMIND_generic_C_SSP5-NDC-rem-5.mif.
  • Start emulator runs using Rscript start.R -> 9: extra -> 2: emulator in the MAgPIE’s mainfolder

1.2 Calculate bioenergy supply curves

The supply curves are generated automatically by one of the output scripts available in MAgPIE. It is located here scripts/output/extra/emulator.R and uses the emulator function of this package. For more details please see the vigentte("remulator") and ?emulator.

  • After all 73 MAgPIE runs belonging to a scenario are fished the emulator should already be available for this scenario, since the last of the 73 runs triggers the emulator script for the respective scenario.
  • The emulator script creates a new folder output/emulator, containing individual subfolders for each scenario from the scenario_config file.
  • The supply curves are linear fits through the 73 points in the biomass price / quantity sphere for each time step and each region. The fits are represented by coefficients a (y-intercept) and b (slope).
  • Within the scenario folders the fit coefficients are stored in a file (that will go into REMIND) like f30_bioen_price_SSP2-26_690d3718e151be1b450b394c1064b1c5.cs4r and visualized like in scatter-fit-allyears-fixed-SSP2-NDC-PkBudg900.png:
Picture: supply curves
Picture: supply curves
  • If no supply curves are available, you can start the emulator script manually using
    • Rscript output.R.
    • -> select the number of only ONE of any of the runs that belong to the scenario you want to produce the emulators for. The other runs belonging to this scenario will be identified automatically.
    • -> 6: extra -> 4: emulator
  • In case you need to debug find the underlying script here scripts/output/extra/emulator.R

1.3 Correct emulators if necessary (replace flat fits)

You will probably notice that for some regions (usually regions with low biomass potential, often CAZ or MEA) the algorithm was not able to find a good linear fit in every time step. This often happens when the slope of the fit would need to be infinite (or at least very high), i.e. when there is no biomass potential, irrespective of the price that is being paid. In this case, the fit is (often) a flat line with slope zero, or close to zero (e.g. scatter-fit-SDP-NDC-NDC-CAZ.png):

Picture: flat fits
Picture: flat fits

For each scenario these flat fits need to be adjusted manually, such that they are either overwritten by values from subsequent years (for example using 2055 values also for 2050), or simply removed (i.e. that there is no bioenergy potential at all in the respective year for the respective region). This is done with the replace_flat_fits.R script which is part of this package. Here is what you do:

  • Use the code example from below to replace the flat fits. Place the code into a script in the output/emulator folder, adapt it to your needs, and run it via Rscript.
library(remulator)

# Provide regions and years of fit coefficients you want to be replaced.
# replace_flat_fits then automatically copies coefficients from subsequent years
# or, if there is no reasonable fit for any year, set all values to NA.
# Then mrremind will interpret this region to be not compatible at any time.

flat <- c("CAZ:2070",
          paste0("CAZ:", paste0(seq(2000, 2060, 5), collapse = ",")),
          paste0("IND:", paste0(seq(2000, 2040, 5), collapse = ",")),
          "IND:2060",
          "IND:2070",
          "IND:2080",
          paste0("MEA:", paste0(seq(2000, 2035, 5), collapse = ",")),
          paste0("OAS:", paste0(seq(2000, 2030, 5), collapse = ",")),
          paste0("SSA:", paste0(seq(2000, 2025, 5), collapse = ","))
)

# Run the function and let it plot the supply curves 
# into the subfolder "replaced-flat" of the respective fit

replace_flat_fits("SSP2-NDC-NDC/linear/data_postfit_SSP2-NDC-NDC.Rdata",
                  flat = flat,
                  plot = TRUE)
  • Check for each region, in which time steps the slope of the fit functions is too small by comparing it to the fits of other regions and years. Attention: in the regional .png files (e.g. scatter-fit-… -CAZ.png) it is sometimes difficult to see if slopes are really too small, since the scaling of the x- and y-axes are not fixed, so better double check with the plot for the respective year (e.g. scatter-fit-…-y2045.png), here the scaling is fixed and the slope can be compared with other regions.
  • Collect all the regions and years for which you want to replace the fits in a vector of strings (named flat in the code example above) of the form "LAM:2020,2050" and pass it to the replace_flat_fits function. For all of the listed regions and time steps the function copies fits from subsequent years. It creates a subfolder “replaced_flat” within the existing fit folder (usually “linear”), where now all fits should be corrected. You can check the new scatter-fit-allyears-fixed-….png if all flat fits were replaced.
  • If there are no “good years” at all (happens quite often for MEA for example), list all years of this region. The replace_flat_fits function will set all fit coefficients to NA then and the calcBiomassPrice function of mrremind (see next section) will interpret this region to be not compatible at any time. This will result in emulators that don’t allow for any bioenergy production in the respective region.
  • The replace_flat_fits function also tries to extract the regionscode from the data (the regionscode is saved as an attribute of the data object in the data_postfit_....Rdata file). If no regionscode is available please add it to the name of the new file (in the replaced_flat folder) by copying the region code (e.g. _690d3718e151be1b450b394c1064b1c5 or the new shorter one) from the filename of the original fits to the end of the filename. E.g.: f30_bioen_price_SDP-NDC-PkBudg900_replaced_flat.cs4r -> f30_bioen_price_SDP-NDC-PkBudg900_replaced_flat_690d3718e151be1b450b394c1064b1c5.cs4r
  • This needs to be checked for every scenario, where flat fits were replaced

2. Add to REMIND input data

2.1 Via mrremind

This is what should be done in a production setting, when you want to publish the emulators you generated.

  • copy the files produced by the emulator script (see 1.2) or by the replace-script (see 1.3) into “sources/MAgPIE” of your input data preparation folder
  • add these files to list of files in readMAgPIE.R and if necessary add the new scenario names to the renaming list in calcBiomassPrice.R
  • run calcOutput("BiomassPrices", round=6, file="f30_bioen_price.cs4r", realization="magpie_40")
  • this produces the file output/default/f30_bioen_price.cs4r

2.2 Adding manually

If you just want to test the generated emulators in REMIND, it might be easier to manually change the relevant files. The emulator coefficients are in the file f30_bioen_price.cs4r, but to do this manually you must reproduce some of the postprocessing mrremind would do.

If you replaced flat fits, your emulator will be in (for example in a SSP2-NPI-Base scenario) SSP2-NPI-Base/replaced-flat/f30_bioen_price_SSP2-NPI-Base_replaced_flat.cs4r. The first data lines will look something like this. There’s no variable name header, but they represent year, region, scenario, coefficient, value. There are two coefficients, a and b, representing the linear fits.

2000,CAZ,SSP2-NPI-Base,a,1
2000,CHA,SSP2-NPI-Base,a,0.172909809987918
2000,EUR,SSP2-NPI-Base,a,0.141727536350427
2000,IND,SSP2-NPI-Base,a,0.166571929509765
2000,JPN,SSP2-NPI-Base,a,0.0824748054814729

Replace NAs with artificial fits

The first thing you must do is fix the NAs, replacing them with an artificial fit that prohibits biomass production for that region.

# Manually correct NAs in the cs4r since we're not going through mrremind at this time
newfilepath <- "SSP2-NPI-Base/replaced-flat/f30_bioen_price_SSP2-NPI-Base_replaced_flat.cs4r"
new <- read.magpie(newfilepath)

new[,,"a"][is.na(new[,,"a"])] <- 1
new[,,"b"][is.na(new[,,"b"])] <- 0.1

write.magpie(new, newfilepath)

Use the file in REMIND as base

The linear coefficients file in REMIND is in the path/to/remind/modules/30_biomass/magpie_40/input/f30_bioen_price.cs4r in the REMIND folder. It has an additional rcp dimension, and contains data for several other scenarios, e.g.:

* description: coefficients for the bioenergy supplycurve
* unit: none
* comment: Data aggregated (toolAggregate): Thu Dec  2 19:46:36 2021
* origin: calcOutput(type = "BiomassPrices", file = "f30_bioen_price.cs4r", round = 6) (madrat 2.7.1 | mrremind 0.92.2)
* creation date: Mon Dec  6 09:33:08 2021
2000,LAM,SDP,rcp45,a,0.124899
2000,OAS,SDP,rcp45,a,0.175994
2000,SSA,SDP,rcp45,a,0.207542
2000,EUR,SDP,rcp45,a,0.074097
2000,NEU,SDP,rcp45,a,0.125951
2000,MEA,SDP,rcp45,a,0.329863
2000,REF,SDP,rcp45,a,0.100725

The header doesn’t make a difference in this case. If you want to force REMIND to use your new emulators, all you have to do is replace the scenario and rcp you will be using in this file with fits from your new emulator. First make copy of the original to be sure.

cp path/to/remind/modules/30_biomass/magpie_40/input/f30_bioen_price.cs4r path/to/remind/modules/30_biomass/magpie_40/input/f30_bioen_price_old.cs4r

Then in the same R script before do:

# Now we add this data to an old file with the correct structure 
oldfilepath <- "/path/to/remind/modules/30_biomass/magpie_40/input/f30_bioen_price_old.cs4r"
old <- read.magpie(oldfilepath)

# Create a copy magclass object
out <- old

# Replaces in the original repeating data on all unspecified dimensions
# You have to specify one element in each dimension for the expansion to work properly
# Here we replace all `SSP2` and `SSP1` scenarios in all RCPs for illustrative purposes. 
out[,,"SSP2"] <- new
out[,,"SSP1"] <- new
# If you want to differentiate between RCPs, use magclass notation. But always one element at a time.
# out[,,"SSP2.rcp45"] <- new

# Add metadata on modification date and path, purely optional
out <- setComment(out,c(getComment(out),paste0(" modification date: ", format(Sys.time(), "%a %b %d %X %Y %Z")),paste0(" modified at: ", getwd())))

# File that REMIND will actually use
outfilepath <- "/path/to/remind/modules/30_biomass/magpie_40/input/f30_bioen_price.cs4r"
write.magpie(out, outfilepath)

If you check /path/to/remind/modules/30_biomass/magpie_40/input/f30_bioen_price.cs4r, the values for all scenarios and rcps that you replaced should be the same (different between regions, years and coefficients). REMIND will decide which SSP to pick in this file from cm_LU_emi_scen and the RCP from cm_rcp_scen, so be sure that these switches match the SSPs and RCPs you replaced.