benchmark-change-settings

This vignette explains how to incorporate new outputs into the benchmarking by modifying its settings.

library(lpjmlstats) # nolint

It is a common theme in model development that new specific model outputs, not included in standard benchmarking, need to be evaluated. These new model outputs often also require new types of evaluations, that is new metrics.

For example, say we are adjusting the soil temperature routine of lpjml and want to check the effects that our modification has on the temperature of the top soil layer. Unfortunatly soil temperature is not part of the standard benchmarking. It is also not adequate to use a normal global sum to aggregate this output. Instead, a global temperature average is needed.

To meet these requirements, a new benchmarking settings list is needed. The structure of the lpjml settings list is as follows:

settings <- list(
  output_var1  = c(metric1, metric2, metric3),
  output_var2  = c(metric2),
  output_var3  = c(metric1, metric4),
  ...
)

Each line corresponds to an lpjml output variable. The left hand side defines which output is processed, while the right hand side defines how this output is processed, that is which metrics are used. More specifically, the left hand side needs to be the exact filename as given in the output directories, but without the file extension. The file extension is set to .bin.json by default and can be changed using set_lpjmlstats_settings().

All the available metrics are defined in the R/metric_subclasses.R file. We can see that the GlobAvgTimeAvgTable and GlobAvgAnnAvgTimeseries metric provide what we need, as they perform global weighted averages. Additionally, the GlobAvgTimeseries may be useful for examining the temperature amplitude within each year, while the TimeAvgMap can aid in understanding the spatial distribution.

Therefore, we define the following settings:

soil_temp_settings <- list(
  # first top layer is evaluated by all mentioned metrics
  msoiltemp1  = c(
    GlobAvgTimeAvgTable,
    GlobAvgAnnAvgTimeseries,
    GlobAvgTimeseries,
    TimeAvgMap
  ),
  # third layer is also evaulated but only by an annual time series
  msoiltemp3  = c(GlobAvgAnnAvgTimeseries)
)

These are already valid settings that could be used in the benchmark function, to have a benchmark that only evaluates soil temperature. However, we are also interested in how our adjustment effects other relevant lpjml variables, and thus we extend the default settings with this addition.

extended_default <- c(default_settings, soil_temp_settings)

Since the report focuses on soil temperature, we would like to feature a larger map showing the spatial distribution for this variable. We check our options via ?TimeAvgMap and see that this can be achieved with the highlight metric option:

metric_opt <- list(TimeAvgMap = list(highlight = c("soiltemp1")))

Now we are ready to start the benchmarking, with the new extended settings as an additional argument:

benchmark_result <-
  benchmark(
    "/p/projects/lpjml/benchmark_run_outputs/5.7.1/output", # nolint
    "/p/projects/lpjml/benchmark_run_outputs/5.8.6/output", # nolint
    settings = extended_default,
    metric_options = metric_opt,
    author = "David",
    description = "Evaluate changes to soil temperature",
    output_file = "benchmark_soiltemp.pdf"
  )

In the benchmark report you will find new sections that display the results for the new metrics. Also note the large map highlighting the time averaged soil temperature distribution at the beginning of the “Time Average” section.

The following is a more complex example where I modified the outputs, metrics and time period for a run with POEM LPJml:

library(lpjmlstats)

# --- adapt the default settings to the POEM outputs
settings <- lpjmlstats::default_settings # get the default settings list
names(settings) # check which outputs are analyzed
settings <- settings[-c(4:16,22,28)] # remove some outputs that are not available in POEM
settings <- lapply(settings, function(x) c(x[-3], TimeAvgMapWithAbs)) # remove the TimeAvgMap and add TimeAvgMapWithAbs

# --- adjust the time period
metric_names <- unique(unname(unlist(
                    lapply(settings, function(x) lapply(x, function(x) x$classname))
                ))) # get all the metric names that where used
mopt <- sapply(metric_names, 
               FUN = function(x) list(year_subset = as.character(1961:1975)), 
               simplify = FALSE) # set the year subset for all metrics

# ---- add subheadings to the maps
mopt[["TimeAvgMapWithAbs"]] <- c(mopt[["TimeAvgMapWithAbs"]], list(var_subheading = TRUE)) 

# --- run the benchmark
benchmark(list("1.1" = "/p/projects/path/to/baseline"),
          list("1.2" = "/p/projects/path/to/under_test"),
          metric_options = mopt,
          settings = settings
          )

This concludes the vignette.