--- title: "Calculate and plot boundaries" author: "Johanna Braun" date: "`r Sys.Date()`" output: html_document: default # pdf_document: default md_document: variant: gfm knit: (function(inputFile, encoding){ rmarkdown::render(inputFile, encoding = encoding, output_format = "all") }) vignette: > %\VignetteIndexEntry{Calculate and plot boundaries} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` **Calculate and plot boundaries** `r if (!knitr::is_latex_output()) {"✎📊"}` describes the calculation and plotting of planetary boundary statuses at different spatial scales based on LPJmL. It aims to simplify the post-processing of LPJmL output data and to make the calculations and visualizations transparent and flexible. ## Overview The calculations rely on the output of LPJmL simulations. To configure and submit LPJmL simulations, the LPJmL Runner from the lpjmlkit package can be used (https://github.com/PIK-LPJmL/lpjmlkit). In doing so, it is important to ensure that all necessary output variables are written out (at the needed spatial resolution). To do so, `boundaries` provides a function to list all needed LPJmL input and output variables (see `?list_outputs`). These have to be included in the LPJmL configuration file (`lpjml_config.cjson`). Example output files can be downloaded from https://doi.org/10.5281/zenodo.12171096. For the post-processing, two working steps are required: Calculate the boundaries (1) and plot them (2). In a seperate step, a global validation table can be created to compare the simulated key variables with global literature estimates (3). #### **1. `r if (!knitr::is_latex_output()) {"✏"}` Calculate boundaries**\ \ Use `calc_status()` to calculate the boundary status. Define which boundaries should be calculated, at which `spatial_scale` and for which `time_span`, as well as additional arguments such as the calculation `approach` to use (see `?calc_status` *for more information*). Each boundary calculation `boundary = c("bluewater", "greenwater", "lsc", "nitrogen", "biosphere")` has its own underlying calculation function, for which additional arguments can be passed to `calc_status()`, e.g. the minimum tree cover `tree_cover_thresholds` to define forest biomes in `lsc_status()`. See e.g. `?lsc_status` (boundary + "_status") for more information. The function returns a list with objects of class control_variable, referring to the status of the contol variable(s) with attributes, such as the boundary and high risk values, the unit, and the spatial resolution. ```{r, echo = TRUE, eval = FALSE, highlight = TRUE} global_status <- calc_status( boundary = c("bluewater", "greenwater", "lsc", "nitrogen", "biosphere"), config_scenario = config_path_scenario, config_reference = config_path_reference, spatial_scale = "global", path_baseline = output_path_baseline, time_span_scenario = as.character(1901:2017), time_span_reference = as.character(1500:1699), time_span_baseline = as.character(1901:2017), time_series_avg = 1, # computation of timeseries, no moving average approach = list("bluewater" = "porkka2024", "greenwater" = "porkka2024", "nitrogen" = "schulte_uebbing2022") ) ``` \newpage #### **2. `r if (!knitr::is_latex_output()) {"📊"}` Plot boundary status(es)**\ \ The status of the control variable(s) can be visualized using the `plot_status()` function, by transferring the output of the `calc_status()` function. Depending on the spatial scale, the function applies different plotting functions (maps for `grid` and `regional` scale; timeseries plots for `global` scale). See `?plot_status()` *for more information*, as well as the individual plotting functions called by `plot_status()` for the different spatial scales (`status_map()` for `grid` and `regional` scale; `status_stylized()` (radial plot) or `status_global` (Cartesian coordinate system) for `global` scale). For some plots, the legend is not automatically plotted, but can be plotted separately using the `status_legend()` function. ```{r, echo = TRUE, eval = FALSE, highlight = TRUE} plot_status(global_status, filename = "./global_status.png") status_legend(filename = "./legend.png") ``` #### **3. `r if (!knitr::is_latex_output()) {"📋"}` Create global validation table**\ \ The function `validate_simulation()` compares relevant key variables that impact the control variable status, such as (i) inputs to the LPJmL model (e.g. nitrogen fertilization) and (ii) processed LPJmL outputs (e.g. irrigation water consumption), as well as (iii) the control variable status values themselves with global literature estimates. The parameter settings should match those chosen for application of the `calc_status()` function at the global level, except for `time_span_scenario`, which should match the timespans/years of the literature estimates (see `./extdata/global_validation_data.csv` for an overview of included literature data). The computed results will be averaged over this timespan. See `?validate_simulation` for more information. ```{r, echo = TRUE, eval = FALSE, highlight = TRUE} validation_table <- validate_simulation( config_scenario = config_path_scenario, config_reference = config_path_reference, path_baseline = output_path_baseline, time_span_scenario = as.character(2005:2014), time_span_reference = as.character(1500:1699), filename = "./validation_table.csv", approach = list("bluewater" = "porkka2024", "greenwater" = "porkka2024", "nitrogen" = "schulte_uebbing2022") ) ``` #### **Miscellaneous** \ \ More helpful functions that come with the package are: * `classify_biomes()` to determine spatially-explicit biome extents based on the foliar projected cover of different plant functional types simulated in LPJmL, the annual mean temperature and elevation (inputs to LPJmL). This function is used, amongst others, to calculate the land system change boundary status but can also be used independently. * `plot_biomes()` to visualize the spatially-explicit biome extents calculated with `classify_biomes`. * `calc_efrs()` to calculate environmental flow requirements based on different defined approaches based on LPJmL discharge output data. This is used for the blue water boundary status at the gridded level but can be relevant also for other applications. * `as_risk_level()` to convert the status of a control variable to a risk level based on the boundary and high risk values. If the risk level is not to be plotted but needed for further analysis, this function can be directly applied to the output of `calc_status`. For an overview on all implemented approaches, the respective control variables, units, the default boundary & high risk values, as well as all needed LPJmL outputs, see `./extdata/metric_files.yml`.     \newpage ## Usage ```{r setup, echo = TRUE, eval = FALSE, highlight = TRUE} library(boundaries) # nolint:undesirable_function_linter # set paths to lpjml configuration files and outputs path_configurations <- "./lpjml/configurations/" config_path_scenario <- paste0(path_configurations, "config_scenario.json") config_path_reference <- paste0(path_configurations, "config_reference.json") path_outputs <- "./lpjml/output/" path_outputs_pnv <- paste0(path_outputs, "pnv/") plotpath <- paste0("./R/plots/") ```     ### Global status timeseries\ #### **Example** *Global status calculation with moving average*   ```{r, echo = TRUE, eval = FALSE, highlight = TRUE} ### calc status at the global level # define analysis and reference period time_span_scenario <- as.character(1901:2017) time_span_reference <- as.character(1500:1699) # define the number of years to average over, for a moving average; if set to # 1, no moving average is calculated nyear_window <- 5 global_status <- calc_status( # define boundaries to calculate - it can also be only one boundary boundary = c("bluewater", "greenwater", "lsc", "nitrogen", "biosphere"), config_scenario = config_path_scenario, config_reference = config_path_reference, spatial_scale = "global", time_span_scenario = time_span_scenario, time_span_reference = time_span_reference, # moving average over nyear_window years time_series_avg = nyear_window, approach = list("bluewater" = "porkka2024", "greenwater" = "porkka2024", "nitrogen" = "schulte_uebbing2022"), # boundary specific parameters, see individual boundary functions for details time_span_baseline = time_span_scenario, # for biosphere integrity path_baseline = paste0(path_outputs, "/pnv/"), # for biosphere integrity savanna_proxy = list(vegc = 7500) # for forest biome definition in the land system change boundary ) ### plot timeseries # There are three options for plotting the global status timeseries: # a) one timeseries panel for each boundary control variable, in a Cartesian # coordinate system # b) all timeseries plotted in one panel, in a Cartesian coordinate system # c) one timeseries panel for each boundary control variable, in a polar # coordinate system (called "stylized") # a) one timeseries panel for each boundary control variable plot_status( x = global_status, stylized = FALSE, filename = "./global_timeseries_panels.png" ) # b) all timeseries plotted in one panel plot_status( x = global_status, stylized = FALSE, all_in_one = TRUE, filename = "./global_timeseries_all_in_one.png" ) # c) one timeseries panel for each boundary control variable, in a polar # coordinate system plot_status( x = global_status, stylized = TRUE, filename = "./global_timeseries_stylized.png" ) ``` #### **Example** *Compare two calculation approaches for the same boundary*   ```{r, echo = TRUE, eval = FALSE, highlight = TRUE} # In this example, only the bluewater boundary is calculated, for two different # approaches. The results are then jointly plotted in a Cartesian coordinate # system. # define analysis and reference period time_span_scenario <- as.character(1901:2017) time_span_reference <- as.character(1500:1699) # Aprroach A: Following Porkka et al. (2024) (https://doi.org/10.1038/s44221-024-00208-7) # Referring to global area with discharge deviations outside the pre-industrial # range bluewater_status_porkka <- calc_status( # define boundaries to calculate - it can also be only one boundary boundary = c("bluewater"), config_scenario = config_path_scenario, config_reference = config_path_reference, spatial_scale = "global", time_span_scenario = time_span_scenario, time_span_reference = time_span_reference, # no moving average, by setting the number of years to average over to 1: time_series_avg = 1, approach = list("bluewater" = "porkka2024") ) # Aprroach B: Following Rockström et al. (2009) (https://doi.org/10.1038/461472a) # Referring to global bluewater consumption, but adapting the defaullt boundary # and high risk values bluewater_status_rockstroem <- calc_status( # define boundaries to calculate - it can also be only one boundary boundary = c("bluewater"), config_scenario = config_path_scenario, config_reference = config_path_reference, spatial_scale = "global", time_span_scenario = time_span_scenario, time_span_reference = time_span_reference, # no moving average, by setting the number of years to average over to 1: time_series_avg = 1, approach = list("bluewater" = "rockstroem2009"), # change default planetary boundary and high risk values, following Gerten # et al. 2013 (https://doi.org/10.1016/j.cosust.2013.11.001), # all referring to km3/yr as defined in `metric_files.yml` thresholds = list("bluewater" = list(holocene = 0, pb = 2800, high_risk = 4000)) ) # plot both results in one timeseries panel # merge both results into one list # TODO test if this is working! bluewater_status <- list("bluewater" = bluewater_status_porkka$bluewater, "bluewater" = bluewater_status_rockstroem$bluewater) plot_status( x = bluewater_status, stylized = FALSE, all_in_one = TRUE, filename = "./global_bluewater_status_porkka_vs_rockstroem.png" ) ``` ### Gridded status\ #### **Example** *Gridded status calculation and plotting*   ```{r, echo = TRUE, eval = FALSE, highlight = TRUE} # define analysis and reference period time_span_scenario <- as.character(2008:2017) time_span_reference <- as.character(1500:1699) # if the land use effect is to be isolated, the time span for the reference # period can be set to the same as the scenario period: # time_span_reference <- as.character(2008:2017) #nolint # This way, climate change induced changes are excluded. # calc status at the gridded level gridded_status <- calc_status( boundary = c("lsc", "nitrogen", "bluewater", "biosphere"), config_scenario = config_path_scenario, config_reference = config_path_reference, time_span_scenario = time_span_scenario, time_span_reference = time_span_reference, spatial_scale = "grid", # set time_series_avg to NULL, to not calculate a timeseries, but to average # over the entire scenario time span time_series_avg = NULL, # boundary specific parameters, see individual boundary functions for details time_span_baseline = time_span_scenario, # for biosphere integrity path_baseline = paste0(path_outputs, "/pnv/"), # for biosphere integrity savanna_proxy = list(vegc = 7500) # for forest biome definition in the land system change boundary ) # plot status at the gridded level, for details see `?status_maps` # There are two options for plotting a map with the gridded status(es): # a) plot the control variable status of each boundary (e.g. deforestation share # for the land system change boundary) # b) plot the risk level of each boundary, based on a normalized color scale # and the boundary and high risk values # a) control variable status plot_status( x = gridded_status, filename = "./gridded_status_control_variable.png", grid_path = paste0(path_outputs_pnv, "grid.bin.json"), risk_level = FALSE ) # b) risk level plot_status( x = gridded_status, filename = "./gridded_status_risk_level.png", grid_path = paste0(path_outputs_pnv, "grid.bin.json"), risk_level = TRUE ) ```       \newpage ### Regional status\ For the regional status, the calculation is performed on a aggregated regional level that makes sense for the respective boundary. For example, for the land system change boundary, the regional status is calculated at the level of forest biomes, for biosphere integrity at the level of biomes, and for bluewater at the level of river basins. #### **Example** *Regional status calculation and plotting*   ```{r, echo = TRUE, eval = FALSE, highlight = TRUE} # define analysis and reference period time_span_scenario <- as.character(2008:2017) time_span_reference <- as.character(1500:1699) # calc status at the regional level regional_status <- calc_status( boundary = c("lsc", "bluewater", "greenwater", "biosphere"), config_scenario = config_path_scenario, config_reference = config_path_scenario, time_span_scenario = time_span_scenario, time_span_reference = time_span_reference, spatial_scale = "regional", approach = list("bluewater" = "porkka2024", "greenwater" = "porkka2024"), path_baseline = paste0(path_outputs, "/pnv/"), # for biosphere integrity time_span_baseline = time_span_scenario, # for biosphere integrity savanna_proxy = list(vegc = 7500), # for forest biome definition in the land system change boundary ) # plot status at the regional level # As for the gridded status, there are two options for plotting a map with the # regional status(es): directly the control variable status or the risk level. # risk level status plot_status( x = regional_status, filename = "./regional_status_risk_level.png", grid_path = paste0(path_outputs_pnv, "grid.bin.json"), risk_level = TRUE ) ```     \newpage ### Notes & tips - Some boundary status calculations may take long (particularly for green and bluewater based on the approach following Porkka et al. 2024). To speed up the calculation, the `in_parallel` argument can be set to `TRUE` in the `calc_status` function to parallelize some of the calculations. - It is advisable to submit the calculations as a job to slurm