Title: | Toolkit for Basic LPJmL Handling |
---|---|
Description: | A collection of basic functions to facilitate the work with the Dynamic Global Vegetation Model (DGVM) Lund-Potsdam-Jena managed Land (LPJmL) hosted at the Potsdam Institute for Climate Impact Research (PIK). It provides functions for performing LPJmL simulations, as well as reading, processing and writing model-related data such as inputs and outputs or configuration files. |
Authors: | Jannes Breier [aut, cre], Sebastian Ostberg [aut], Stephen Björn Wirth [aut], Sara Minoli [aut], Fabian Stenzel [aut], David Hötten [aut], Christoph Müller [aut] |
Maintainer: | Jannes Breier |
License: | AGPL-3 |
Version: | 1.7.3 |
Built: | 2024-10-24 06:00:31 UTC |
Source: | https://github.com/PIK-LPJmL/lpjmlkit |
Maintainer: Jannes Breier
Authors:
Sebastian Ostberg
Stephen Björn Wirth
Sara Minoli
Fabian Stenzel
David Hötten
Christoph Müller
Useful links:
Report bugs at https://github.com/PIK-LPJmL/lpjmlkit/issues
Function to add a grid to an LPJmLData object. The function acts as a read_io() wrapper for the grid file and adds it as an LPJmLData object itself to the $grid attribute of the main object.
add_grid(x, ...)
x | LPJmLData object. |
... | Arguments passed to |
Important:
If "file_type" == "raw", prescribe variable = "grid" to ensure data are recognized as a grid.
Do not use the read_io() argument subset here. add_grid will use the subset of the parent LPJmLData object x.
A copy of x (LPJmLData object) with added $grid attribute.
## Not run:
# Read in vegetation carbon data with meta file
vegc <- read_io(filename = "./vegc.bin.json")

# Add grid as attribute (via grid file in output directory)
vegc_with_grid <- add_grid(vegc)
## End(Not run)
Function to coerce (convert) an LPJmLData object into a pure array. "Pure" because LPJmLData already stores the data as an array, which can be accessed via $data. as_array provides additional functionality to subset or aggregate the array.
as_array(x, subset = NULL, aggregate = NULL, ...)
x | LPJmLData object. |
subset | List of array dimension(s) as name/key and corresponding subset vector as value, e.g. |
aggregate | List of array dimension(s) as name/key and corresponding aggregation function as value, e.g. |
... | Arguments passed to the aggregate function(s), e.g. |
An array with dimensions of object $data with applied subset and aggregate functionality as well as dim and dimnames from the LPJmLData object.
## Not run:
vegc <- read_io(filename = "./vegc.bin.json")

# Returns array attribute of LPJmLData object directly
vegc$data
#     time
# cell   1901-12-31   1902-12-31   1903-12-31   1904-12-31   1905-12-31
#    0 1.362730e+04 1.363163e+04 1.364153e+04 1.365467e+04 1.366689e+04
#    1 1.201350e+02 1.158988e+02 1.101675e+02 1.214204e+02 1.062658e+02
#    2 1.334261e+02 1.210387e+02 1.218128e+02 1.183210e+02 1.159934e+02
#    3 9.744530e+01 9.586801e+01 8.365642e+01 8.193731e+01 7.757602e+01
#    4 7.592700e+01 7.821202e+01 6.798551e+01 6.632317e+01 5.691082e+01
#    5 1.106748e+01 1.137272e+01 1.196524e+01 1.131316e+01 9.924266e+0

# Returns two-dimensional array with timeseries for the mean across cells
# 27410:27415
as_array(vegc,
         subset = list(cell = 27410:27415),
         aggregate = list(cell = mean))
#             band
# time                1
#   1901-12-31 1995.959
#   1902-12-31 1979.585
#   1903-12-31 1978.054
#   1904-12-31 1935.623
#   1905-12-31 1938.805
## End(Not run)
Function to coerce (convert) an LPJmLMetaData object into an LPJmL header object. More information at create_header().
as_header(x, silent = FALSE)
x | An LPJmLMetaData object |
silent | Logical. Whether to suppress notifications from header conversion/initialization. |
An LPJmL header object. More information at create_header().
## Not run:
vegc_meta <- read_meta(filename = "./vegc.bin.json")

# Returns a list object with the structure of an LPJmL header
as_header(vegc_meta)
# $name
# [1] "LPJDUMMY"
#
# $header
#      version        order    firstyear        nyear    firstcell
#          4.0          4.0       1901.0        200.0          0.0
#        ncell       nbands cellsize_lon       scalar cellsize_lat
#      67420.0          1.0          0.5          1.0          0.5
#     datatype        nstep     timestep
#          3.0          1.0          1.0
#
# $endian
# [1] "little"
## End(Not run)
Function to coerce (convert) an LPJmLMetaData object into a list.
as_list(x)
x | An LPJmLMetaData object |
A list
## Not run:
vegc_meta <- read_meta(filename = "./vegc.bin.json")

# Returns a list with the meta data entries
as_list(vegc_meta)
# $sim_name
# [1] "lu_cf"
#
# $source
# [1] "LPJmL C Version 5.3.001"
#
# $variable
# [1] "vegc"
#
# $descr
# [1] "vegetation carbon"
#
# $unit
# [1] "gC/m2"
#
# $nbands
# [1] 1
#
# ...
## End(Not run)
Function to coerce (convert) an LPJmLData object into a raster or brick object that allows for any GIS-based raster operations. Read more about the raster package at https://rspatial.github.io/raster/reference/raster-package.html. The successor package of raster is called terra: https://rspatial.org/.
as_raster(x, subset = NULL, aggregate = NULL, ...)
x | LPJmLData object |
subset | List of array dimension(s) as name/key and corresponding subset vector as value, e.g. |
aggregate | List of array dimension(s) as name/key and corresponding aggregation function as value, e.g. |
... | Arguments passed to the aggregate function(s), e.g. |
The $grid attribute is required for spatial transformation. When using file_type = "meta", grid data are usually read automatically via add_grid() if the grid file is present in the same directory. Otherwise, add_grid() has to be called explicitly with the path to a matching grid file. Supports either multiple bands or multiple time steps. Use subset or aggregate to reduce data with multiple bands and time steps.
A raster or brick object with spatial extent and coordinates based on internal $grid attribute and containing a lon/lat representation of x$data. If multiple bands or time steps exist, a brick is created. Further meta information such as the lon/lat resolution is extracted from $meta.
## Not run:
vegc <- read_io(filename = "./vegc.bin.json")

# Returns a RasterBrick for all data
as_raster(vegc)
# class      : RasterBrick
# dimensions : 280, 720, 201600, 200  (nrow, ncol, ncell, nlayers)
# resolution : 0.5, 0.5  (x, y)
# extent     : -180, 180, -56, 84  (xmin, xmax, ymin, ymax)
# crs        : +proj=longlat +datum=WGS84 +no_defs
# source     : memory
# names      : X1901.12.31, X1902.12.31, X1903.12.31, X1904.12.31, ...
# min values :           0,           0,           0,           0, ...
# max values :    28680.72,    28662.49,    28640.29,    28634.03, ...
## End(Not run)
Function to coerce (convert) an LPJmLData object into a rast object that allows GIS-based raster operations. Read more about the terra package at https://rspatial.org/.
as_terra(x, subset = NULL, aggregate = NULL, ...)
x | LPJmLData object. |
subset | List of array dimension(s) as name/key and corresponding subset vector as value, e.g. |
aggregate | List of array dimension(s) as name/key and corresponding aggregation function as value, e.g. |
... | Arguments passed to the aggregate function(s), e.g. |
The $grid attribute is required for spatial transformation. When using file_type = "meta", grid data are usually read automatically via add_grid() if the grid file is present in the same directory. Otherwise, add_grid() has to be called explicitly with the path to a matching grid file. Supports either multiple bands or multiple time steps. Use subset or aggregate to reduce data with multiple bands and time steps.
A rast object with spatial extent and coordinates based on internal $grid attribute and containing a lon/lat representation of x$data. Further meta information such as the lon/lat resolution is extracted from $meta.
## Not run:
vegc <- read_io(filename = "./vegc.bin.json")

# Returns a SpatRaster for all data
as_terra(vegc)
# ...
## End(Not run)
Function to coerce (convert) an LPJmLData object into a tibble (modern data.frame). Read more about tibbles at https://r4ds.had.co.nz/tibbles.html.
Please make sure to call lpjmlkit::as_tibble() explicitly when also using the tidyverse packages tibble or dplyr.
## S3 method for class 'LPJmLData'
as_tibble(x, subset = NULL, aggregate = NULL, value_name = "value", ...)
x | LPJmLData object |
subset | List of array dimension(s) as name/key and corresponding subset vector as value, e.g. |
aggregate | List of array dimension(s) as name/key and corresponding aggregation function as value, e.g. |
value_name | Name of value column in returned |
... | Arguments passed to the aggregate function(s), e.g. |
A tibble with columns corresponding to dimension naming of the LPJmLData$data array and values in one value column.
## Not run:
vegc <- read_io(filename = "./vegc.bin.json")

# Returns two-dimensional tibble representation of vegc$data.
as_tibble(vegc)
#   cell  time       band   value
#   <fct> <fct>      <fct>  <dbl>
# 1 0     1901-12-31 1     13627.
# 2 1     1901-12-31 1       120.
# 3 2     1901-12-31 1       133.
# 4 3     1901-12-31 1        97.4
# 5 4     1901-12-31 1        75.9
# 6 5     1901-12-31 1        11.1
## End(Not run)
Subset an array with the supplied dimnames and, if defined, replace values.
asub(x, ..., drop = TRUE)
x | An array with named dimensions. |
... | One or several vectors of indices or character strings to be used to subset |
drop | Logical. If |
array (or vector if drop = TRUE and only one dimension is left) of the selected subset of x.
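To make the idea of subsetting by dimension name concrete, here is a minimal base R sketch. The helper `subset_by_name()` is hypothetical and is not the implementation of asub(); it only handles named dimensions and does not cover value replacement.

```r
# Hypothetical helper (not part of lpjmlkit): subset an array by naming the
# dimension(s) to restrict; all other dimensions are kept in full.
subset_by_name <- function(x, ..., drop = TRUE) {
  picks <- list(...)
  # Start with "select everything" for each dimension
  index <- rep(list(TRUE), length(dim(x)))
  names(index) <- names(dimnames(x))
  stopifnot(all(names(picks) %in% names(index)))
  # Overwrite the selection for the dimensions named in the call
  index[names(picks)] <- picks
  do.call(`[`, c(list(x), unname(index), list(drop = drop)))
}

my_array <- array(
  1,
  dim = c(67, 12, 3),
  dimnames = list(cell = 0:66, month = 1:12,
                  band = c("band1", "band2", "band3"))
)
picked <- subset_by_name(my_array, band = c("band1", "band3"))
dim(picked)
```

Character vectors are matched against the dimnames of the respective dimension, which is why the bands can be selected by name.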
my_array <- array(1,
                  dim = c(cell = 67, month = 12, band = 3),
                  dimnames = list(cell = 0:66,
                                  month = 1:12,
                                  band = c("band1", "band2", "band3")))
my_subset <- asub(my_array, band = c("band1", "band3"))
dimnames(my_subset)[3]
# $ band
# [1] "band1"
# [2] "band3"
Calculate the cell area of LPJmL cells based on an LPJmLData object or latitude coordinates and grid resolution. Uses a spherical representation of the Earth.
calc_cellarea(
  x,
  cellsize_lon = 0.5,
  cellsize_lat = cellsize_lon,
  earth_radius = 6371000.785,
  return_unit = "m2"
)
x | |
cellsize_lon | Grid resolution in longitude direction in degrees (default: |
cellsize_lat | Grid resolution in latitude direction in degrees (default: same as |
earth_radius | Radius of the sphere (in |
return_unit | Character string describing the area unit of the returned cell areas. Defaults to |
A vector or array matching the space dimension(s) of x if x is an LPJmLData object. A vector of the same length as x if x is a vector of latitude coordinates. Cell areas are returned in the unit return_unit.
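As an illustration of what a cell area on a sphere looks like, the following base R sketch integrates exactly over the latitude band of each cell. The helper `cell_area_sphere()` is hypothetical; whether calc_cellarea() uses this exact formula or an approximation (for example one based on the cosine of the cell-center latitude) is not stated here, so results may differ slightly.

```r
# Hypothetical helper (not part of lpjmlkit): area in m2 of lon/lat cells on a
# sphere, given the latitude of the cell centers in degrees.
cell_area_sphere <- function(lat,
                             cellsize_lon = 0.5,
                             cellsize_lat = cellsize_lon,
                             earth_radius = 6371000.785) {
  deg2rad <- pi / 180
  lat_north <- (lat + cellsize_lat / 2) * deg2rad
  lat_south <- (lat - cellsize_lat / 2) * deg2rad
  # R^2 * (width in radians) * (difference of sines of the band edges)
  earth_radius^2 * (cellsize_lon * deg2rad) * (sin(lat_north) - sin(lat_south))
}

# Cells near the poles are much smaller than cells near the equator
area <- cell_area_sphere(c(89.75, 0.25, -0.25, -89.75))
```

A quick plausibility check is that the areas of all cells of a global 0.5 degree grid add up to the surface of the sphere, 4 * pi * earth_radius^2.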
grid <- matrix(
  data = c(-179.75, 89.75, -0.25, 0.25, 0.25, -0.25, 179.75, -89.75),
  ncol = 2,
  byrow = TRUE,
  dimnames = list(NULL, c("lon", "lat"))
)
gridarea <- calc_cellarea(grid[, "lat"])
Check whether LPJmL config JSON files created with write_config() are valid and ready to be used for simulations. Uses lpjcheck and supports multiple files.
check_config(
  x,
  model_path = ".",
  sim_path = NULL,
  return_output = FALSE,
  raise_error = FALSE,
  output_path = NULL
)
x | |
model_path | Character string providing the path to LPJmL (equal to |
sim_path | Character string defining path where all simulation data are written, including output, restart and configuration files. If |
return_output | Parameter affecting the output. If |
raise_error | Logical. Whether to raise an error if sub-process has non-zero exit status. Defaults to |
output_path | Argument is deprecated as of version 1.0; use sim_path instead. |
NULL.
## Not run:
library(tibble)
library(lpjmlkit)

model_path <- "./LPJmL_internal"
sim_path <- "./my_runs"

# Basic usage
my_params <- tibble(
  sim_name = c("scen1", "scen2"),
  random_seed = c(12, 404),
  `pftpar[[1]]$name` = c("first_tree", NA),
  `param$k_temp` = c(NA, 0.03),
  new_phenology = c(TRUE, FALSE)
)

config_details <- write_config(
  x = my_params,
  model_path = model_path,
  sim_path = sim_path
)

check_config(
  x = config_details,
  model_path = model_path,
  sim_path = sim_path,
  return_output = FALSE
)
## End(Not run)
Create a header from scratch in the format required by write_header().
create_header(
  name = "LPJGRID",
  version = 3,
  order = 1,
  firstyear = 1901,
  nyear = 1,
  firstcell = 0,
  ncell,
  nbands = 2,
  cellsize_lon = 0.5,
  scalar = 1,
  cellsize_lat = cellsize_lon,
  datatype = 3,
  nstep = 1,
  timestep = 1,
  endian = .Platform$endian,
  verbose = TRUE
)
name | Header name attribute (default: "LPJGRID"). |
version | CLM version to use (default: |
order | Order of data items. See details below or LPJmL code for supported values. The order may be provided either as an integer value or as a character string (default: |
firstyear | Start year of data in file (default: |
nyear | Number of years of data included in file (default: |
firstcell | Index of first data item (default: |
ncell | Number of data items per band. |
nbands | Number of bands per year of data (default: |
cellsize_lon | Longitude cellsize in degrees (default: |
scalar | Conversion factor applied to data when it is read by LPJmL or by |
cellsize_lat | Latitude cellsize in degrees (default: same as |
datatype | LPJmL data type in file. See details below or LPJmL code for valid data type codes (default: |
nstep | Number of time steps per year. Added in header version 4 to separate time bands from content bands (default: |
timestep | If larger than 1, outputs are averaged over |
endian | Endianness to use for file (either |
verbose | If |
File headers in input files are used by LPJmL to determine the structure of the file and how to read it. They can also be used to describe the structure of output files.
Header names usually start with "LPJ" followed by a word or abbreviation describing the type of input/output data. See LPJmL code for valid header names.
The version number determines the amount of header information included in the file. All versions save the header name and header attributes 'version', 'order', 'firstyear', 'nyear', 'firstcell', 'ncell', and 'nbands'. Header versions 2, 3 and 4 add header attributes 'cellsize_lon' and 'scalar'. Header versions 3 and 4 add header attributes 'cellsize_lat' and 'datatype'. Header version 4 adds attributes 'nstep' and 'timestep'.
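The relationship between header version and stored attributes can be written down as a small lookup. The following base R sketch is purely illustrative: `header_items_by_version()` is a hypothetical helper, and it assumes that the valid header versions are 1 to 4.

```r
# Hypothetical helper (not part of lpjmlkit): header attributes stored for a
# given header version, following the description above.
header_items_by_version <- function(version) {
  stopifnot(version %in% 1:4)  # assumption: versions 1 to 4 exist
  items <- c("version", "order", "firstyear", "nyear",
             "firstcell", "ncell", "nbands")
  if (version >= 2) items <- c(items, "cellsize_lon", "scalar")
  if (version >= 3) items <- c(items, "cellsize_lat", "datatype")
  if (version >= 4) items <- c(items, "nstep", "timestep")
  items
}

header_items_by_version(3)
```

The header name is stored in addition to these attributes in all versions.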
Valid values for order are 1 / "cellyear", 2 / "yearcell", 3 / "cellindex", and 4 / "cellseq". The default for LPJmL input files is 1. The default for LPJmL output files is 4, except for grid output files which also use 1.
By default, input files contain data for all cells, indicated by setting the firstcell index to 0. If firstcell > 0, LPJmL assumes the first firstcell cells to be missing in the data.
Valid codes for the datatype attribute and the corresponding LPJmL data types are: 0 / "byte" (LPJ_BYTE), 1 / "short" (LPJ_SHORT), 2 / "int" (LPJ_INT), 3 / "float" (LPJ_FLOAT), 4 / "double" (LPJ_DOUBLE).
The default parameters of the function are valid for grid input files using LPJ_FLOAT data type.
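For quick reference, the order and datatype codes listed above can be kept in named vectors. This is only a convenience sketch; the vectors and the helper `code_to_label()` are hypothetical and not objects provided by lpjmlkit.

```r
# Codes as listed above; the object names are made up for this illustration.
order_codes <- c(cellyear = 1, yearcell = 2, cellindex = 3, cellseq = 4)
datatype_codes <- c(byte = 0, short = 1, int = 2, float = 3, double = 4)

# Translate a numeric code into its character label
code_to_label <- function(code, codes) {
  label <- names(codes)[match(code, codes)]
  if (is.na(label)) stop("Unknown code: ", code)
  label
}

code_to_label(3, datatype_codes)  # "float"
code_to_label(4, order_codes)     # "cellseq"
```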
The function returns a list with 3 components:
name: The header name, e.g. "LPJGRID".
header: Vector of header values ('version', 'order', 'firstyear', 'nyear', 'firstcell', 'ncell', 'nbands', 'cellsize_lon', 'scalar', 'cellsize_lat', 'datatype', 'nstep', 'timestep').
endian: Endian used to write binary data, either "little" or "big".
read_header() for reading headers from LPJmL input/output files.
write_header() for writing headers to files.
header <- create_header(
  name = "LPJGRID",
  version = 3,
  order = 1,
  firstyear = 1901,
  nyear = 1,
  firstcell = 0,
  ncell = 67420,
  nbands = 2,
  cellsize_lon = 0.5,
  scalar = 1.0,
  cellsize_lat = 0.5,
  datatype = 3,
  nstep = 1,
  timestep = 1,
  endian = .Platform$endian,
  verbose = TRUE
)
This utility function tries to detect automatically if a provided file is of "clm", "meta", or "raw" file type. NetCDFs and simple text formats such as ".txt" or ".csv" are also detected.
detect_io_type(filename)
filename | Character string naming the file to check. |
Character vector of length 1 giving the file type:
"cdf" for a NetCDF file (classic or NetCDF4/HDF5 format).
"clm" for a binary LPJmL input/output file with header.
"meta" for a JSON meta file describing a binary LPJmL input/output file.
"raw" for a binary LPJmL input/output file without header. This is also the default if no other file type can be recognized.
"text" for any type of text-only file, e.g. ".txt" or ".csv"
## Not run:
detect_io_type(filename = "filename.clm")
# [1] "clm"
## End(Not run)
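How such a detection could work in principle is sketched below in base R. This is not the implementation of detect_io_type(): `guess_io_type()` is a hypothetical helper, and the rules it applies are assumptions (NetCDF classic files start with "CDF", HDF5-based NetCDF4 files with the bytes 0x89 "HDF", header names start with "LPJ" as described for create_header(), and a JSON meta file starts with "{").

```r
# Hypothetical helper (not part of lpjmlkit): guess the file type from the
# first bytes of a file.
guess_io_type <- function(filename) {
  con <- file(filename, "rb")
  on.exit(close(con))
  bytes <- readBin(con, what = "raw", n = 64)
  starts_with <- function(sig) {
    length(bytes) >= length(sig) && identical(bytes[seq_along(sig)], sig)
  }
  if (starts_with(charToRaw("CDF")) ||
        starts_with(as.raw(c(0x89, 0x48, 0x44, 0x46)))) {
    return("cdf")
  }
  if (starts_with(charToRaw("LPJ"))) return("clm")
  # Only tab, line breaks and printable ASCII: treat as text or JSON
  codes <- as.integer(bytes)
  if (length(codes) > 0 && all(codes %in% c(9, 10, 13, 32:126))) {
    text <- trimws(rawToChar(bytes))
    return(if (startsWith(text, "{")) "meta" else "text")
  }
  "raw"  # fallback if nothing else matches
}

# Small demonstration with temporary files
meta_file <- tempfile(fileext = ".json")
writeLines('{"variable": "vegc"}', meta_file)
clm_file <- tempfile(fileext = ".clm")
writeBin(c(charToRaw("LPJGRID"), as.raw(c(3, 0, 0, 0))), clm_file)
guess_io_type(meta_file)
guess_io_type(clm_file)
```

As with the file types listed above, "raw" serves as the fallback when no other type is recognized.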
Function to get the dimensions of the data array of an LPJmLData object.
## S3 method for class 'LPJmLData'
dim(x)
x | LPJmLData object |
For the default method, either NULL or a numeric vector, which is coerced to integer (by truncation).
Function to get the dimnames (list) of the data array of an LPJmLData object.
## S3 method for class 'LPJmLData'
dimnames(x)
x | LPJmLData object |
A list of the same length as dim(x). Components are character vectors with positive length of the respective dimension of x.
Function to search for a file containing a specific variable in a specific directory.
find_varfile(searchdir, variable = "grid", strict = FALSE)
searchdir | Directory where to look for the variable file. |
variable | Single character string containing the variable to search for |
strict | Boolean. If set to |
This function looks for file names in searchdir that match the pattern parameter in its list.files() call. Files of type "meta" are preferred. Files of type "clm" are also accepted. The function returns an error if no suitable file or multiple files are found.
Character string with the file name of a matched file, including the full path.
This function returns the cell index from a grid file based on the provided extent or coordinates. If neither extent nor coordinates are provided, the full grid will be returned. If both extent and coordinates are provided, the function will stop and ask for only one of them. The extent should be a vector of length 4 in the form c(lonmin, lonmax, latmin, latmax). If the extent is not in the correct form, the function will swap the values to correct it.
get_cellindex(
  grid_filename,
  extent = NULL,
  coordinates = NULL,
  shape = NULL,
  simplify = TRUE
)
grid_filename | A string representing the grid file name. |
extent | A numeric vector (lonmin, lonmax, latmin, latmax) containing the longitude and latitude boundaries between which values are included in the subset. |
coordinates | A list of two named (lon, lat) numeric vectors representing the coordinates. |
shape | A terra SpatVector object in the WGS 84 coordinate reference system representing the shape to subset the grid. |
simplify | A logical indicating whether to simplify the output to a vector. |
The function reads a grid file specified by grid_filename and creates a data frame with columns for longitude, latitude, and cell number. The cell number is a sequence from 1 to the number of rows in the data frame.
If an extent is provided, the function filters the cells to include only those within the specified longitude and latitude range. The extent should be a numeric vector of length 4 in the form c(lonmin, lonmax, latmin, latmax).
If a list of coordinates is provided, the function filters the cells to include only those that match the specified coordinates. The coordinates should be a list of two character vectors representing the longitude and latitude values as for subset().
If a shape is provided as a SpatVector object, the function will return the cell index for the cells that intersect with the shape.
If more than one of extent, coordinates or shape is provided, the function will stop and ask for only one of them. If neither extent nor coordinates nor shape is provided, the function will return the cell numbers for all cells in the grid.
The function also includes checks for input types and values, and gives specific error messages for different error conditions. For example, it checks if the grid_filename exists, if the extent vector has the correct length, and if the coordinates list contains two vectors of equal length.
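The extent filter described above can be pictured with a few lines of base R. This is a sketch under assumptions, not the code of get_cellindex(): `filter_by_extent()` and the small `cells` data frame are made up for illustration, and treating the extent boundaries as inclusive is an assumption.

```r
# Hypothetical helper (not part of lpjmlkit): return the cell numbers of all
# cells whose coordinates fall inside c(lonmin, lonmax, latmin, latmax).
filter_by_extent <- function(cells, extent) {
  stopifnot(is.numeric(extent), length(extent) == 4)
  # Swap values if min and max were supplied in the wrong order
  lon_range <- sort(extent[1:2])
  lat_range <- sort(extent[3:4])
  inside <- cells$lon >= lon_range[1] & cells$lon <= lon_range[2] &
    cells$lat >= lat_range[1] & cells$lat <= lat_range[2]
  cells$cellindex[inside]
}

# Made-up grid with four cells; cell numbers run from 1 to the number of rows
cells <- data.frame(
  lon = c(-123.25, -122.75, -122.25, 10.25),
  lat = c(49.25, 49.75, 49.25, 50.25)
)
cells$cellindex <- seq_len(nrow(cells))

filter_by_extent(cells, extent = c(-123.25, -122.75, 49.25, 49.75))
```

With the made-up grid, only the first two cells lie inside the extent, and swapping the minimum and maximum values gives the same result.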
Either an LPJmLData object containing the grid or a vector subsetted to the provided extent, coordinates or shape.
## Not run:
get_cellindex(
  grid_filename = "my_grid.bin.json",
  extent = c(-123.25, -122.75, 49.25, 49.75) # (lonmin, lonmax, latmin, latmax)
)
get_cellindex(
  grid_filename = "my_grid.bin.json",
  coordinates = list(lon = c(-123.25, -122.75), lat = c(49.25, 49.75))
)
## End(Not run)
Provides information on the data type used in an LPJmL input/output file based on the 'datatype' attribute included in the file header.
get_datatype(header, fail = TRUE)
header | Header list object as returned by |
fail | Determines whether the function should fail if the datatype is invalid (default: |
On success, the function returns a list object with three components:
type: R data type; can be used with what parameter of readBin().
size: size of data type; can be used with size parameter of readBin().
signed: whether or not the data type is signed; can be used with signed parameter of readBin().
If fail = FALSE, the function returns NULL if an invalid datatype is provided.
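How the three components fit the arguments of readBin() can be tried out without an LPJmL file. In the following base R sketch the list `dt` is written by hand and only mimics the structure described above; its values (a signed 2-byte integer, as one would expect for LPJ_SHORT) are an assumption and not actual output of get_datatype().

```r
# Hand-written stand-in for a datatype description (assumed values)
dt <- list(type = integer(), size = 2L, signed = TRUE)

# Write three values as 2-byte integers to a temporary file
tmp <- tempfile(fileext = ".bin")
writeBin(c(-5L, 120L, 32000L), tmp, size = dt$size, endian = "little")

# Read them back using the components as readBin() arguments
con <- file(tmp, "rb")
values <- readBin(
  con,
  what = dt$type,
  size = dt$size,
  signed = dt$signed,
  n = 3,
  endian = "little"
)
close(con)
unlink(tmp)
values
```

The same endianness has to be used for writing and reading, which is why file headers carry an endian component.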
read_header() for reading headers from LPJmL input/output files.
create_header() for creating headers from scratch.
get_headersize() for determining the size of file headers.
## Not run:
# Read file header
header <- read_header("filename.clm")

# Open file for reading
fp <- file("filename.clm", "rb")

# Skip over file header
seek(fp, get_headersize(header))

# Read in file data
file_data <- readBin(
  fp,
  what = get_datatype(header)$type,
  size = get_datatype(header)$size,
  signed = get_datatype(header)$signed,
  n = header$header["ncell"] * header$header["nbands"] *
    header$header["nyear"] * header$header["nstep"],
  endian = header[["endian"]]
)

# Close file
close(fp)
## End(Not run)
Convenience function to extract information from a header object as returned by read_header() or create_header(). Returns one item per call.
get_header_item(header, item)
header | LPJmL file header as returned by |
item | Header information item to retrieve. One of |
Requested header item. Character string in case of "name" and "endian", otherwise numeric value.
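Given the three-component header structure described for create_header() (name, header, endian), such an extraction could look roughly as follows. The header list is written by hand and `pick_header_item()` is a hypothetical stand-in, not the code of get_header_item(); in particular, dropping the element name from numeric items is an assumption.

```r
# Hand-written header list following the structure described for
# create_header()
header <- list(
  name = "LPJGRID",
  header = c(version = 3, order = 1, firstyear = 1901, nyear = 1,
             firstcell = 0, ncell = 67420, nbands = 2, cellsize_lon = 0.5,
             scalar = 1, cellsize_lat = 0.5, datatype = 3, nstep = 1,
             timestep = 1),
  endian = "little"
)

# Hypothetical helper (not part of lpjmlkit)
pick_header_item <- function(header, item) {
  stopifnot(item %in% c("name", "endian", names(header$header)))
  if (item %in% c("name", "endian")) {
    header[[item]]                 # character string
  } else {
    unname(header$header[item])    # numeric value
  }
}

pick_header_item(header, "ncell")
pick_header_item(header, "endian")
```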
create_header() for creating headers from scratch and for a more detailed description of the LPJmL header format.
read_header() for reading headers from LPJmL input/output files.
## Not run:
# Read file header
header <- read_header("filename.clm")
nyear <- get_header_item(header = header, item = "nyear")
## End(Not run)
Returns the size in bytes of an LPJmL file header based on a header list object read by read_header() or generated by create_header().
get_headersize(header)
header | Header list object as returned by |
Integer value giving the size of the header in bytes. This can be used when seeking in the file or to calculate the expected total file size in combination with the number of included data values and the data type.
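The file size calculation mentioned above is simple arithmetic: header size plus number of values times bytes per value. The following base R sketch shows it with illustrative numbers. `expected_file_size()` is a hypothetical helper, and the header size of 51 bytes is an assumed example value; in practice it would come from get_headersize(header).

```r
# Hypothetical helper (not part of lpjmlkit): expected total file size in bytes
expected_file_size <- function(header_bytes, ncell, nbands, nyear, nstep,
                               bytes_per_value) {
  n_values <- ncell * nbands * nyear * nstep
  header_bytes + n_values * bytes_per_value
}

# Illustrative numbers: 67420 cells, 2 bands, 1 year, 1 time step per year,
# 4 bytes per value, and an assumed header size of 51 bytes
expected <- expected_file_size(
  header_bytes = 51,
  ncell = 67420,
  nbands = 2,
  nyear = 1,
  nstep = 1,
  bytes_per_value = 4
)
expected
```

Comparing this number with file.size() of the actual file is a cheap way to spot truncated files or a header that does not match the data.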
read_header() for reading a header from an LPJmL input/output file.
create_header() for creating a header from scratch.
## Not run:
header <- read_header("filename.clm")
size <- get_headersize(header)

# Open file for reading
fp <- file("filename.clm", "rb")

# Skip over file header
seek(fp, size)

# Add code to read data from file
## End(Not run)
Function to get the length of the data array of an LPJmLData object.
## S3 method for class 'LPJmLData'
length(x)
x | LPJmLData object |
A non-negative integer or numeric (which will be rounded down).
A data container for LPJmL input and output. "Container" because an LPJmLData object is an environment in which the data array as well as the meta data are stored after read_io().
The data array can be accessed via $data, the meta data via $meta.
The enclosing environment is locked and can only be altered through the available modify methods, which ensures its integrity and validity.
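What a locked environment means in practice can be demonstrated with base R alone. The sketch below is only an analogue for illustration; how LPJmLData implements the locking internally is not described here, and the object `container` is made up.

```r
# A plain environment standing in for a data container
container <- new.env()
container$data <- array(1:6, dim = c(cell = 3, time = 2))

# Lock the environment including its existing bindings
lockEnvironment(container, bindings = TRUE)

# Direct assignment now fails instead of silently changing the data
result <- tryCatch(
  {
    container$data <- NULL
    "modified"
  },
  error = function(e) "locked"
)
result
```

Reading the data is still possible after locking; only assignments are rejected.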
Use base stats methods like print(), summary.LPJmLData() or plot.LPJmLData() to get insights and export methods like as_tibble() or as_raster() to export it into common working formats.
meta
LPJmLMetaData object to store corresponding meta data.
data
array containing the underlying data.
grid
Optional LPJmLData object containing the underlying grid.
add_grid()
Method to add a grid to an LPJmLData object. See also add_grid.
LPJmLData$add_grid(...)
...
See add_grid().
subset()
Method to use dimension names of LPJmLData$data array directly to subset each dimension to match the supplied vectors.
LPJmLData$subset(...)
...
transform()
Method to transform inner LPJmLData$data array into another space or time format.
LPJmLData$transform(...)
...
See transform().
as_array()
Method to coerce (convert) an LPJmLData object into an array.
LPJmLData$as_array(...)
...
See as_array().
as_tibble()
Method to coerce (convert) an LPJmLData object into a tibble (modern data.frame).
LPJmLData$as_tibble(...)
...
See as_tibble().
as_raster()
Method to coerce (convert) an LPJmLData object into a raster or brick object that can be used for any GIS-based raster operations.
LPJmLData$as_raster(...)
...
See as_raster().
as_terra()
Method to coerce (convert) an LPJmLData object into a rast object that can be used for any GIS-based raster operations.
LPJmLData$as_terra(...)
...
See as_terra().
plot()
Method to plot a time-series or raster map of an LPJmLData object.
LPJmLData$plot(...)
...
See plot.LPJmLData().
length()
Method to get the length of the data array of an LPJmLData object. See also length.
LPJmLData$length()
dim()
Method to get the dimensions of the data array of an LPJmLData object. See also dim.
LPJmLData$dim()
dimnames()
Method to get the dimnames (list) of the data array of an LPJmLData object.
LPJmLData$dimnames(...)
...
See dimnames.LPJmLData().
summary()
Method to get the summary of the data array of an LPJmLData object.
LPJmLData$summary(...)
...
See summary.LPJmLData().
print()
Method to print the LPJmLData object. See also print.
LPJmLData$print()
.__set_data__()
!Internal method only to be used for package development!
LPJmLData$.__set_data__(data)
data
Data array.
.__set_grid__()
!Internal method only to be used for package development!
LPJmLData$.__set_grid__(grid)
grid
An LPJmLData object holding grid coordinates.
new()
!Internal method only to be used for package development!
LPJmLData$new(data, meta_data = NULL)
data
array with LPJmL data.
meta_data
An LPJmLMetaData object.
clone()
The objects of this class are cloneable with this method.
LPJmLData$clone(deep = FALSE)
deep
Whether to make a deep clone.
A dedicated data class for an LPJmL input or output grid.
LPJmLGridData serves as the spatial reference for any LPJmLData object and matches its spatial dimensions ("cell" or "lon", "lat") when attached to it as a grid attribute.
LPJmLGridData holds the information which longitude and latitude correspond to each cell center, assuming WGS84 as the coordinate reference system, or the corresponding cell index when the data come with longitude and latitude dimensions.
As in LPJmLData, the data array can be accessed via $data, the meta data via $meta.
lpjmlkit::LPJmLData -> LPJmLGridData
lpjmlkit::LPJmLData$.__set_data__()
lpjmlkit::LPJmLData$.__set_grid__()
lpjmlkit::LPJmLData$as_array()
lpjmlkit::LPJmLData$as_raster()
lpjmlkit::LPJmLData$as_terra()
lpjmlkit::LPJmLData$as_tibble()
lpjmlkit::LPJmLData$dim()
lpjmlkit::LPJmLData$dimnames()
lpjmlkit::LPJmLData$length()
lpjmlkit::LPJmLData$subset()
lpjmlkit::LPJmLData$summary()
lpjmlkit::LPJmLData$transform()
add_grid()
! Not allowed to add a grid to an LPJmLGridData object.
LPJmLGridData$add_grid(...)
...
See add_grid().
plot()
! No plot function available for LPJmLGridData object. Use as_raster() or as_terra() (and plot()) to visualize the grid.
LPJmLGridData$plot(...)
...
See plot().
new()
!Internal method only to be used for package development!
LPJmLGridData$new(lpjml_data)
lpjml_data
LPJmLData object with variable "grid", "cellid" or "LPJGRID".
print()
Method to print the LPJmLGridData. See also print.
LPJmLGridData$print()
clone()
The objects of this class are cloneable with this method.
LPJmLGridData$clone(deep = FALSE)
deep
Whether to make a deep clone.
A meta data container for LPJmL input and output meta data.
It is called a container because an LPJmLMetaData object is an environment in which the meta data are stored after read_meta() (or read_io()).
Each attribute can be accessed via $<attribute>. To get an overview of the available attributes, print the object or export it as a list with as_list().
The enclosing environment is locked and cannot be altered.
sim_name
Simulation name (works as identifier in LPJmL Runner).
source
LPJmL version (character string).
history
Character string of the call used to run LPJmL. This normally includes the path to the LPJmL executable and the path to the configuration file for the simulation.
variable
Name of the input/output variable, e.g. "npp" or "runoff".
descr
Description of the input/output variable.
unit
Unit of the input/output variable.
nbands
Number (numeric) of bands (categoric dimension). Please note that nbands follows the convention in LPJmL, which uses the plural form for bands as opposed to nyear or ncell.
band_names
Name of the bands (categoric dimension). Not included if nbands = 1.
nyear
Number (numeric) of data years in the parent LPJmLData object.
firstyear
First calendar year (numeric) in the parent LPJmLData object.
lastyear
Last calendar year (numeric) in the parent LPJmLData object.
nstep
Number (numeric) of intra-annual time steps. 1 for annual, 12 for monthly, and 365 for daily data.
timestep
Number (numeric) of years between time steps. timestep = 5 means that output is written every 5 years.
ncell
Number (numeric) of cells in the parent LPJmLData object.
firstcell
First cell (numeric) in the parent LPJmLData object.
cellsize_lon
Longitude cellsize in degrees (numeric).
cellsize_lat
Latitude cellsize in degrees (numeric).
datatype
File data type (character string), e.g. "float". Note that data are converted into R-internal data type by read_io().
scalar
Conversion factor (numeric) applied when reading raw data from file. The parent LPJmLData object contains the values after the application of the conversion factor.
order
Order of the data items in the file, either "cellyear", "yearcell", "cellindex", or "cellseq". The structure of the data array in the parent LPJmLData object may differ from the original order in the file depending on the dim_order parameter used in read_io().
offset
Offset (numeric) at the start of the binary file before the actual data start.
bigendian
(Logical) Endianness refers to the order in which bytes are stored in a multi-byte value, with big-endian storing the most significant byte at the lowest address and little-endian storing the least significant byte at the lowest address.
format
Binary format (character string) of the file containing the actual data. Either "raw", "clm" (raw with header), or "cdf" for NetCDF format.
filename
Name of the file containing the actual data.
subset
Logical. Whether parent LPJmLData object is subsetted.
map
Character vector describing how to map the bands in an input file to the bands used inside LPJmL. May be used by read_io() to construct a band_names attribute.
version
Version of data file.
._data_dir_
Internal character string containing the directory from which the file was loaded.
._subset_space_
Internal logical. Whether space dimensions are subsetted in the parent LPJmLData object.
._fields_set_
Internal character vector of names of attributes set by the meta file.
._time_format_
Internal character string describing the time dimension format, either "time" or "year_month_day".
._space_format_
Internal character string describing the space dimension format, either "cell" or "lon_lat".
._dimension_map_
Internal dictionary/list of space and time dimension formats with categories and namings.
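How the bigendian, datatype and scalar fields interact when binary data are read can be sketched in base R. This is an illustration under stated assumptions, not the package's reading routine: the values, the 2-byte integer storage and the conversion factor of 0.01 are all hypothetical.

```r
# Hypothetical raw values, stored in a file as 2-byte integers.
raw_values <- c(125L, 250L, 1000L)
scalar <- 0.01  # assumed conversion factor

tmp <- tempfile(fileext = ".bin")
# Write the values as big-endian 2-byte integers.
writeBin(raw_values, tmp, size = 2, endian = "big")

# Reading with the matching endianness recovers the stored integers.
ok <- readBin(tmp, what = "integer", n = 3, size = 2, endian = "big")
# Reading with the wrong endianness yields different numbers.
swapped <- readBin(tmp, what = "integer", n = 3, size = 2, endian = "little")

# Apply the conversion factor to obtain the final values.
values <- ok * scalar
unlink(tmp)
```

Reading with the wrong byte order does not fail; it silently returns wrong numbers, which is why the endianness is recorded in the meta data.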
as_list()
Method to coerce (convert) an LPJmLMetaData object into a list. See also as_list().
LPJmLMetaData$as_list()
as_header()
Method to coerce (convert) an LPJmLMetaData object into an LPJmL binary file header. More information about file headers at create_header().
LPJmLMetaData$as_header(...)
...
See as_header().
print()
Method to print an LPJmLMetaData object. See also print.
LPJmLMetaData$print(all = TRUE, spaces = "")
all
Logical. Should all attributes be printed or only the most relevant (all = FALSE)?
spaces
Internal parameter. Spaces to be printed at the start.
.__init_grid__()
!Internal method only to be used for package development!
LPJmLMetaData$.__init_grid__()
.__update_subset__()
!Internal method only to be used for package development!
LPJmLMetaData$.__update_subset__( subset, cell_dimnames = NULL, time_dimnames = NULL, year_dimnames = NULL )
subset
List of subset arguments, see also subset.LPJmLData().
cell_dimnames
Optional list of new cell_dimnames of subset data to update meta data. Required if spatial dimensions are subsetted.
time_dimnames
Optional list of new time_dimnames of subset data to update meta data. Required if time dimension is subsetted.
year_dimnames
Optional list of new year_dimnames of subset data to update meta data. Required if year dimension is subsetted.
.__transform_time_format__()
!Internal method only to be used for package development!
LPJmLMetaData$.__transform_time_format__(time_format)
time_format
Character. Choose between "year_month_day" and "time".
.__transform_space_format__()
!Internal method only to be used for package development!
LPJmLMetaData$.__transform_space_format__(space_format)
space_format
Character. Choose between "lon_lat" and "cell".
.__set_attribute__()
!Internal method only to be used for package development!
LPJmLMetaData$.__set_attribute__(key, value)
key
Name of the attribute, e.g. "variable"
value
Value of the attribute, e.g. "grid"
new()
Create a new LPJmLMetaData object.
LPJmLMetaData$new(x, additional_attributes = list(), data_dir = NULL)
x
A list (not nested) with meta data.
additional_attributes
A list of additional attributes to be set that are not included in file header or JSON meta file. These are c("band_names", "variable", "descr", "unit").
data_dir
Directory containing the file this LPJmLMetaData object refers to. Used to "lazy load" grid.
clone()
The objects of this class are cloneable with this method.
LPJmLMetaData$clone(deep = FALSE)
deep
Whether to make a deep clone.
Compiles the LPJmL source code and creates an executable by executing "make all" on the operating system shell.
make_lpjml( model_path = ".", parallel_cores = NULL, make_clean = FALSE, raise_error = TRUE, debug = NULL )
model_path |
Character string providing the path to LPJmL
(equal to |
parallel_cores |
Numeric defining the number of available CPU cores for parallelization. |
make_clean |
Logical. If set to |
raise_error |
Logical. Whether to raise an error if sub-process has
non-zero exit status, hence if compilation fails. Defaults to |
debug |
NULL or Logical. Whether to compile LPJmL with "-debug" flag.
Defaults to |
A list with process status, see run.
## Not run:
model_path <- "./LPJmL_internal"
make_lpjml(model_path = model_path)
## End(Not run)
Function to plot a time-series or raster map of an LPJmLData object.
## S3 method for class 'LPJmLData'
plot(x, subset = NULL, aggregate = NULL, raster_extent = NULL, ...)
x |
LPJmLData object |
subset |
List of array dimension(s) as name/key and
corresponding subset vector as value, e.g. |
aggregate |
List of array dimension(s) as name/key and
corresponding aggregation function as value, e.g. |
raster_extent |
Optional parameter to crop map display of spatial data.
An extent or any object from which an Extent object can be
extracted. Not relevant if |
... |
Depending on the dimensions of the LPJmLData object's internal data array, the plot will be one of the following:
single map plot: more than 8 "cell"s or "lat" & "lon" dimensions available
multiple maps plot: length of one time (e.g. "time", "year", "month") or "band" dimension > 1
time series plot: less than 9 "cell"s
lat/lon plot: a subsetted/aggregated "lat" or "lon" dimension
The plot can only handle 2-3 dimensions. Use arguments subset and aggregate to modify x$data to the desired plot type. If more than three dimensions have length > 1, plot will return an error and suggest to reduce the number of dimensions.
Note that the plot function aims to provide a quick overview of the data rather than create publication-ready graphs.
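How subsetting and aggregating reduce the number of dimensions with length > 1 can be sketched with a plain base R array. The array `x` and its dimension names are hypothetical, and apply() is used here only as an illustration; it is not claimed to be what the plot method does internally.

```r
# Hypothetical data array with dimensions cell x time x band.
x <- array(
  seq_len(4 * 3 * 2),
  dim = c(4, 3, 2),
  dimnames = list(
    cell = as.character(0:3),
    time = c("2001", "2002", "2003"),
    band = c("wheat", "rice")
  )
)

# Subset one band; drop = FALSE keeps all three dimensions.
wheat <- x[, , "wheat", drop = FALSE]

# Aggregate over time by keeping the "cell" and "band" margins.
time_mean <- apply(wheat, c(1, 3), mean)
```

After these two steps only the cell dimension has length > 1, which corresponds to the situation reached with the subset and aggregate arguments.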
No return value; called for side effects.
## Not run:
vegc <- read_io(filename = "./vegc.bin.json")

# Plot first 9 years starting from 1901 as a raster plot
plot(vegc)

# Plot raster with mean over the whole time series
plot(vegc, aggregate = list(time = mean))

# Plot only year 2010 as a raster
plot(vegc, subset = list(time = "2010"))

# Plot first 10 time steps as global mean time series. Note: Aggregation
# across cells is not area-weighted.
plot(vegc, subset = list(time = 1:10), aggregate = list(cell = mean))

# Plot time series for cells with LPJmL index 27410 - 27415 (C indices start
# at 0 in contrast to R indices starting at 1).
plot(vegc, subset = list(cell = 27411:27416))
## End(Not run)
Reads a configuration (config) file (compilable cjson/js file or json file) and turns it into a nested list object.
read_config(filename, from_restart = FALSE, macro = "")
filename |
Character string representing path (if different from current working directory) and filename. |
from_restart |
Logical defining whether config files should be read as
from_restart (transient run) or without (spinup run). Defaults to |
macro |
Optional character string to pass one or several macros to the pre-compiler, e.g. ("-DFROM_RESTART"). Used only if file is not pre-compiled (no json). |
A nested list object representing the LPJmL configuration read from filename.
## Not run:
config <- read_config(filename = "config_spinup.json")

config[["version"]]
# [1] "5.3"

config[["pftpar"]][[1]][["name"]]
# [1] "tropical broadleaved evergreen tree"

config[["input"]][["coord"]][["name"]]
# [1] "input_VERSION2/grid.bin"

# visualize configuration as tree view
View(config)
## End(Not run)
Generic function to read LPJmL input & output files in different formats. Depending on the format, arguments can be automatically detected or have to be passed as individual arguments.
read_grid(...)
... |
See read_io for further arguments. |
See read_io for more details.
An LPJmLGridData object.
## Not run:
my_grid <- read_grid("grid.bin.json")
## End(Not run)
Reads a header from an LPJmL clm file. CLM is the default format used for LPJmL input files and can also be used for output files.
read_header(filename, force_version = NULL, verbose = TRUE)
filename |
Filename to read header from. |
force_version |
Manually set clm version. The default value |
verbose |
If |
The function returns a list with 3 components:
name: Header name, e.g. "LPJGRID"; describes the type of data in the file.
header: Vector of header values ('version', 'order', 'firstyear', 'nyear', 'firstcell', 'ncell', 'nbands', 'cellsize_lon', 'scalar', 'cellsize_lat', 'datatype', 'nstep', 'timestep') describing the file structure. If header version is <4, the header is partially filled with default values.
endian: Endianness of file ("little" or "big").
create_header() for a more detailed description of the LPJmL header format.
write_header() for writing headers to files.
## Not run:
header <- read_header("filename.clm")
## End(Not run)
Generic function to read LPJmL input & output files in different formats. Depending on the format, arguments can be automatically detected or have to be passed as individual arguments.
read_io( filename, subset = list(), band_names = NULL, dim_order = c("cell", "time", "band"), file_type = NULL, version = NULL, order = NULL, firstyear = NULL, nyear = NULL, firstcell = NULL, ncell = NULL, nbands = NULL, cellsize_lon = NULL, scalar = NULL, cellsize_lat = NULL, datatype = NULL, nstep = NULL, timestep = NULL, endian = NULL, variable = NULL, descr = NULL, unit = NULL, name = NULL, silent = FALSE )
filename |
Mandatory character string giving the file name to read, including its path and extension. |
subset |
Optional list allowing to subset data read from the file along one or several of its dimensions. See details for more information. |
band_names |
Optional vector of character strings providing the band
names or |
dim_order |
Order of dimensions in returned LPJmLData object. Must be
a character vector containing all of the following in any order:
|
file_type |
Optional character string giving the file type. This is normally detected automatically but can be prescribed if automatic detection is incorrect. Valid options:
|
version |
Integer indicating the clm file header version, currently
supports one of |
order |
Integer value or character string describing the order of data
items in the file (default in input file: 1; in output file: 4). Valid
values for LPJmL input/output files are |
firstyear |
Integer providing the first year of data in the file. |
nyear |
Integer providing the number of years of data included in the
file. These are not consecutive in case of |
firstcell |
Integer providing the cell index of the first data item.
|
ncell |
Integer providing the number of data items per band. |
nbands |
Integer providing the number of bands per time step of data. |
cellsize_lon |
Numeric value providing the longitude cell size in degrees. |
scalar |
Numeric value providing a conversion factor that needs to be applied to raw data when reading it from file to derive final values. |
cellsize_lat |
Numeric value providing the latitude cell size in degrees. |
datatype |
Integer value or character string describing the LPJmL data
type stored in the file. Supported options: |
nstep |
Integer value defining the number of within-year time steps of
the file. Valid values are |
timestep |
Integer value providing the interval in years between years
represented in the file data. Normally |
endian |
Endianness to use for file (either |
variable |
Optional character string providing the name of the variable
contained in the file. Included in some JSON meta files. Important: If
|
descr |
Optional character string providing a more detailed description of the variable contained in the file. Included in some JSON meta files. |
unit |
Optional character string providing the unit of the data in the file. Included in some JSON meta files. |
name |
Optional character string specifying the header name. This is
usually read from clm headers for |
silent |
If set to |
The file_type determines which arguments are mandatory or optional. filename must always be provided. file_type is usually detected automatically. Supply only if detected file_type is incorrect.
In case of file_type = "meta", if any of the function arguments not listed as "mandatory" are provided and are already set in the JSON file, a warning is given, but they are still overwritten. Normally, you would only set meta attributes not set in the JSON file.
In case of file_type = "clm", function arguments not listed as "optional" are usually determined automatically from the file header included in the clm file. Users may still provide any of these arguments to overwrite values read from the file header, e.g. when they know that the values in the file header are wrong. Also, clm headers with versions < 4 do not contain all header attributes, with missing attributes filled with default values that may not be correct for all files.
In case of file_type = "raw", files do not contain any information about their structure. Users should provide all arguments not listed as "optional". Otherwise, default values valid for LPJmL standard outputs are used for arguments not supplied by the user. For example, the default firstyear is 1901, the default for nyear, nbands, nstep, and timestep is 1.
subset can be a list containing one or several named elements. Allowed names are "band", "cell", and "year".
"year" can be used to return data for a subset of one or several years included in the file. Integer indices can be between 1 and nyear. If subsetting by actual calendar years (starting at firstyear) a character vector has to be supplied.
"band" can be used to return data for a subset of one or several bands included in the file. These can be specified either as integer indices or as a character vector if bands are named.
"cell" can be used to return data for a subset of cells. Note that integer indices start counting at 1, whereas character indices start counting at the value of firstcell (usually 0).
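The difference between integer and character indices for cells can be sketched with a plain base R matrix. The object `cells`, its values and the `firstcell` value of 27410 (borrowed from the examples in this manual) are assumptions made for illustration only.

```r
firstcell <- 27410  # assumed index of the first cell in the data
ncell <- 5

# Hypothetical cell x time matrix whose rows are named by LPJmL cell index.
cells <- matrix(
  seq(10, 50, by = 10),
  nrow = ncell,
  ncol = 1,
  dimnames = list(
    cell = as.character(firstcell + seq_len(ncell) - 1),
    time = "2001"
  )
)

# Integer index: position in the array, counting from 1.
by_position <- cells[1, "2001"]
# Character index: LPJmL cell index, counting from firstcell.
by_name <- cells["27410", "2001"]
```

Both expressions select the same element, because position 1 holds the cell named after firstcell.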
An LPJmLData object.
## Not run:
# First case: meta file. Reads meta information from "my_file.json" and
# data from binary file linked in "my_file.json". Normally does not require
# any additional arguments.
my_data <- read_io("my_file.json")

# Suppose that file data has two bands named "wheat" and "rice". `band_names`
# are included in the JSON meta file. Select only the "wheat" band during
# reading and discard the "rice" band. Also, read only data for years
# 1910-1920.
my_data_wheat <- read_io(
  "my_file.json",
  subset = list(band = "wheat", year = as.character(seq(1910, 1920)))
)

# Read data from clm file. This includes a header describing the file
# structure.
my_data_clm <- read_io("my_file.clm")

# Suppose that "my_file.clm" has two bands containing data for "wheat" and
# "rice". Assign names to them manually since the header does not include a
# `band_names` attribute.
my_data_clm <- read_io("my_file.clm", band_names = c("wheat", "rice"))

# Once `band_names` are set, subsetting by name is possible also for
# file_type = "clm"
my_data_wheat <- read_io(
  "my_file.clm",
  band_names = c("wheat", "rice"),
  subset = list(band = "wheat", year = as.character(seq(1910, 1920)))
)

# Read data from raw binary file. All information about file structure needs
# to be supplied. Use default values except for nyear (1 by default), and
# nbands (also 1 by default).
my_data <- read_io("my_file.bin", nyear = 100, nbands = 2)

# Supply band_names to be able to subset by name
my_data_wheat <- read_io(
  "my_file.bin",
  band_names = c("wheat", "rice"), # length needs to correspond to `nbands`
  subset = list(band = "wheat", year = as.character(seq(1910, 1920))),
  nyear = 100,
  nbands = 2
)
## End(Not run)
Reads a meta JSON file or the header of a binary LPJmL input or output file.
read_meta(filename, ...)
filename |
Character string representing path (if different from current working directory) and filename. |
... |
Additional arguments passed to |
An LPJmLMetaData object.
## Not run:
meta <- read_meta(filename = "mpft_npp.bin.json")

meta$sim_name
# [1] "LPJmL Run"

meta$firstcell
# [1] 27410

meta$band_names[1]
# [1] "tropical broadleaved evergreen tree"
## End(Not run)
Runs LPJmL using "config_*.json" files written by write_config(). write_config() returns a tibble that can be used as an input (see x). It contains the details to run single or multiple (dependent/subsequent) model runs.
run_lpjml( x, model_path = ".", sim_path = NULL, run_cmd = "srun --propagate", parallel_cores = 1, write_stdout = FALSE, raise_error = TRUE, output_path = NULL )
x |
A tibble with at least one column named |
model_path |
Character string providing the path to LPJmL
(equal to |
sim_path |
Character string defining path where all simulation data are
written, including output, restart and configuration files. If |
run_cmd |
Character string defining the command used to execute lpjml (see details). Defaults to "srun --propagate" (compute nodes of old cluster at PIK). Change to "mpirun" for HPC2024 at PIK. |
parallel_cores |
Integer defining the number of available CPU
cores/nodes for parallelization. Defaults to |
write_stdout |
Logical. If |
raise_error |
Logical. Whether to raise an error if sub-process has
non-zero exit status. Defaults to |
output_path |
Argument is deprecated as of version 1.0; use sim_path instead. |
x:
A tibble generated by write_config() can be supplied as x. It can look like the following example:
sim_name |
scen1_spinup |
scen2_transient |
To perform subsequent or rather dependent runs the optional run parameter "dependency" needs to be provided within the initial tibble supplied as param to write_config().
sim_name | order | dependency |
scen1_spinup | 1 | NA |
scen2_transient | 2 | scen1_spinup |
As a shortcut it is also possible to provide the config file "config_*.json" as a character string or multiple config files as a character string vector directly as the x argument to run_lpjml.
Also be aware that the order of the supplied config files is important (e.g. make sure the spin-up run is run before the transient one).
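The ordering requirement can be checked before submitting runs. The following is a minimal base R sketch, assuming a run table with the columns shown above; a data.frame is used in place of a tibble to avoid extra dependencies, and the check itself is illustrative rather than part of run_lpjml.

```r
# Hypothetical run table, deliberately listed in the wrong order.
runs <- data.frame(
  sim_name = c("scen1_transient", "scen1_spinup"),
  order = c(2, 1),
  dependency = c("scen1_spinup", NA),
  stringsAsFactors = FALSE
)

# Sort so that runs with a lower order value come first.
runs <- runs[order(runs$order), ]

# Verify that every dependency is listed before the run that needs it.
dep_position <- match(runs$dependency, runs$sim_name)
dependencies_ok <- all(
  is.na(runs$dependency) | dep_position < seq_len(nrow(runs))
)
```

If `dependencies_ok` is FALSE, at least one run is listed before the run it depends on.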
run_cmd:
The run_cmd argument is used to define the command to execute LPJmL. This is needed because the LPJmL executable cannot be used directly on all machines. Which command has to be used depends on the software installed. Further information on this can be found in the INSTALL file of LPJmL. To determine the correct command, check the lpj_submit.sh file in the bin directory of LPJmL. Using PIK infrastructure, the command is srun for the hpc2015 and mpirun for the hpc2024. To facilitate usage on the interactive (login) nodes, no command is needed for hpc2015. For the hpc2024 the command remains mpirun (in these cases run_lpjml adjusts run_cmd accordingly).
See x, extended by columns "type", "job_id" and "status".
## Not run:
library(tibble)

model_path <- "./LPJmL_internal"
sim_path <- "./my_runs"

# Basic usage
my_params1 <- tibble(
  sim_name = c("scen1", "scen2"),
  startgrid = c(27410, 27410),
  river_routing = c(FALSE, FALSE),
  random_seed = c(42, 404),
  `pftpar[[1]]$name` = c("first_tree", NA),
  `param$k_temp` = c(NA, 0.03),
  new_phenology = c(TRUE, FALSE)
)

config_details1 <- write_config(my_params1, model_path, sim_path)

run_details1 <- run_lpjml(
  x = config_details1,
  model_path = model_path,
  sim_path = sim_path
)

run_details1
#   sim_name  job_id  status
#   <chr>     <int>   <chr>
# 1 scen1     NA      run
# 2 scen2     NA      run

# With run parameters dependency and order being set (also fewer other
# parameters than in previous example)
my_params2 <- tibble(
  sim_name = c("scen1_spinup", "scen1_transient"),
  startgrid = c(27410, 27410),
  river_routing = c(FALSE, FALSE),
  random_seed = c(42, 404),
  dependency = c(NA, "scen1_spinup")
)

config_details2 <- write_config(my_params2, model_path, sim_path)

run_details2 <- run_lpjml(config_details2, model_path, sim_path)

run_details2
#   sim_name         order  dependency    type        job_id  status
#   <chr>            <dbl>  <chr>         <chr>       <chr>   <chr>
# 1 scen1_spinup     1      NA            simulation  NA      run
# 2 scen1_transient  2      scen1_spinup  simulation  NA      run

# Same but by using the pipe operator
library(magrittr)

run_details2 <- tibble(
  sim_name = c("scen1_spinup", "scen1_transient"),
  random_seed = as.integer(c(1, 42)),
  dependency = c(NA, "scen1_spinup")
) %>%
  write_config(model_path, sim_path) %>%
  run_lpjml(model_path, sim_path)

# Shortcut approaches
run_details3 <- run_lpjml(
  x = "./config_scen1_transient.json",
  model_path = model_path,
  sim_path = sim_path
)

run_details4 <- run_lpjml(
  c("./config_scen1_spinup.json", "./config_scen1_transient.json"),
  model_path,
  sim_path
)
## End(Not run)
Convenience function to set information in a header object as
returned by read_header()
or create_header()
. One or several
set_header_item(header, ...)
header | An LPJmL file header as returned by read_header() or create_header(). |
... | Named header items to set. Can be one or several of 'name', 'version', 'order', 'firstyear', 'nyear', 'firstcell', 'ncell', 'nbands', 'cellsize_lon', 'scalar', 'cellsize_lat', 'datatype', 'nstep', 'timestep', 'endian'. Parameter 'verbose' can be used to control verbosity, as in create_header(). |
The header passed as header, in which the header items supplied through the ellipsis have been changed.
create_header() for creating headers from scratch and for a more detailed description of the LPJmL header format.
read_header() for reading headers from files.
    header <- create_header(
      name = "LPJGRID",
      version = 3,
      order = 1,
      firstyear = 1901,
      nyear = 1,
      firstcell = 0,
      ncell = 67420,
      nbands = 2,
      cellsize_lon = 0.5,
      scalar = 1.0,
      cellsize_lat = 0.5,
      datatype = 3,
      nstep = 1,
      timestep = 1,
      endian = .Platform$endian,
      verbose = TRUE
    )

    header
    # $name
    # [1] "LPJGRID"
    #
    # $header
    #      version        order    firstyear        nyear    firstcell        ncell
    #          3.0          1.0       1901.0          1.0          0.0      67420.0
    #       nbands cellsize_lon       scalar cellsize_lat     datatype        nstep
    #          2.0          0.5          1.0          0.5          3.0          1.0
    #     timestep
    #          1.0
    #
    # $endian
    # [1] "little"

    # Change number of cells to 1
    set_header_item(header = header, ncell = 1)
    # $name
    # [1] "LPJGRID"
    #
    # $header
    #      version        order    firstyear        nyear    firstcell        ncell
    #          3.0          1.0       1901.0          1.0          0.0          1.0
    #       nbands cellsize_lon       scalar cellsize_lat     datatype        nstep
    #          2.0          0.5          1.0          0.5          3.0          1.0
    #     timestep
    #          1.0
    #
    # $endian
    # [1] "little"
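The printed header above is a list with a character element name, a named numeric vector header and a character element endian. As a rough illustration of what changing items in such a structure involves, the following base R sketch updates a hand-built list of the same shape. The helper set_item_sketch() is hypothetical and is not the package's implementation; it performs none of the validation that set_header_item() may do.

```r
# Hand-built stand-in for a header list with the structure printed above
# (only a few of the numeric items are included to keep the sketch short).
header <- list(
  name = "LPJGRID",
  header = c(version = 3, order = 1, firstyear = 1901, nyear = 1,
             firstcell = 0, ncell = 67420, nbands = 2),
  endian = "little"
)

# set_item_sketch() is an illustrative helper, not part of lpjmlkit.
set_item_sketch <- function(header, ...) {
  items <- list(...)
  for (item in names(items)) {
    if (item %in% c("name", "endian")) {
      # Top-level list elements
      header[[item]] <- items[[item]]
    } else if (item %in% names(header$header)) {
      # Numeric items stored in the named vector
      header$header[[item]] <- items[[item]]
    } else {
      stop("Unknown header item: ", item)
    }
  }
  header
}

new_header <- set_item_sketch(header, ncell = 1, nyear = 10)
new_header$header[c("ncell", "nyear")]
```

Because R copies lists on modification, the original header is left untouched and the changed header is returned, which matches the return value described for set_header_item().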
LPJmL simulations are submitted to SLURM using the "config_*.json" files written by write_config(). write_config() returns a tibble that can be used as input (see x). The tibble provides the details needed to submit single or multiple (dependent/subsequent) model simulations.
    submit_lpjml(
      x,
      model_path,
      sim_path = NULL,
      group = "",
      sclass = "short",
      ntasks = 256,
      wtime = "",
      blocking = "",
      constraint = "",
      slurm_options = list(),
      no_submit = FALSE,
      output_path = NULL
    )
x | A tibble with at least one column named "sim_name". |
model_path | Character string providing the path to LPJmL (equal to |
sim_path | Character string defining path where all simulation data are written, including output, restart and configuration files. If |
group | Character string defining the user group for which the job is submitted. |
sclass | Character string defining the job classification. Available options at PIK: |
ntasks | Integer defining the number of tasks/threads. More information at https://www.pik-potsdam.de/en and https://slurm.schedmd.com/sbatch.html. Defaults to 256. |
wtime | Character string defining the time limit. Setting a lower time limit than the maximum runtime for |
blocking | Integer defining the number of cores to be blocked. More information at https://www.pik-potsdam.de/en and https://slurm.schedmd.com/sbatch.html. |
constraint | Character string defining constraints for node selection. Use |
slurm_options | A named list of further arguments to be passed to sbatch. E.g. list( |
no_submit | Logical. Set to |
output_path | Argument is deprecated as of version 1.0; use sim_path instead. |
A tibble generated by write_config() can be supplied for x. It can look like the following examples:
sim_name |
scen1_spinup |
scen2_transient |
To perform subsequent (dependent) simulations, the optional run parameter "dependency" needs to be provided within the initial tibble supplied as x to write_config().
sim_name | dependency |
scen1_spinup | NA |
scen2_transient | scen1_spinup |
To use different SLURM settings for each run, the optional SLURM options "sclass", "ntasks", "wtime", "blocking" or "constraint" can also be added to the initial tibble supplied as x to write_config(). These overwrite the (default) SLURM arguments (sclass, ntasks, wtime, blocking or constraint) supplied to submit_lpjml.
sim_name | dependency | wtime |
scen1_spinup | NA | "8:00:00" |
scen2_transient | scen1_spinup | "2:00:00" |
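The precedence described above (per-run values in the tibble win over the function arguments) can be sketched in base R as follows. This is only an illustration under stated assumptions: resolve_options() is a hypothetical helper, a plain data.frame stands in for the tibble, and treating NA as "fall back to the function argument" is an assumption rather than documented behaviour.

```r
# Per-run options as they might appear in the tibble passed to write_config()
runs <- data.frame(
  sim_name = c("scen1_spinup", "scen2_transient"),
  dependency = c(NA, "scen1_spinup"),
  wtime = c("8:00:00", "2:00:00"),
  stringsAsFactors = FALSE
)

# Function-level arguments, using the defaults from the usage section
defaults <- list(sclass = "short", ntasks = 256, wtime = "")

# resolve_options() is hypothetical: values found in the row replace the
# defaults; NA in the row is assumed to mean "keep the default".
resolve_options <- function(row, defaults) {
  resolved <- defaults
  for (opt in intersect(names(row), names(defaults))) {
    if (!is.na(row[[opt]])) resolved[[opt]] <- row[[opt]]
  }
  resolved
}

resolved <- lapply(seq_len(nrow(runs)), function(i) {
  resolve_options(runs[i, , drop = FALSE], defaults)
})
names(resolved) <- runs$sim_name

str(resolved)
```

Each run ends up with its own wtime, while sclass and ntasks, which are absent from the table, keep the values passed to the function.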
As a shortcut it is also possible to provide the config file "config_*.json" as a character string, or multiple config files as a character string vector, directly as the x argument to submit_lpjml. With this approach, run parameters or SLURM options cannot be taken into account.
See x, extended by columns "type", "job_id" and "status".
    ## Not run: 
    library(tibble)

    model_path <- "./LPJmL_internal"
    sim_path <- "./my_runs"

    # Basic usage
    my_params <- tibble(
      sim_name = c("scen1", "scen2"),
      random_seed = as.integer(c(42, 404)),
      `pftpar[[1]]$name` = c("first_tree", NA),
      `param$k_temp` = c(NA, 0.03),
      new_phenology = c(TRUE, FALSE)
    )

    config_details <- write_config(my_params, model_path, sim_path)

    run_details <- submit_lpjml(
      x = config_details,
      model_path = model_path,
      sim_path = sim_path
    )

    run_details
    #   sim_name  job_id    status
    #   <chr>     <int>     <chr>
    # 1 scen1     21235215  submitted
    # 2 scen2     21235216  submitted

    # With run parameter dependency and SLURM option wtime being
    # set (also fewer other parameters than in the previous example)
    my_params2 <- tibble(
      sim_name = c("scen1_spinup", "scen1_transient"),
      random_seed = as.integer(c(42, 404)),
      dependency = c(NA, "scen1_spinup"),
      wtime = c("8:00:00", "4:00:00")
    )

    config_details2 <- write_config(my_params2, model_path, sim_path)

    run_details2 <- submit_lpjml(config_details2, model_path, sim_path)

    run_details2
    #   sim_name         order  dependency    wtime    type        job_id    status
    #   <chr>            <dbl>  <chr>         <chr>    <chr>       <chr>     <chr>
    # 1 scen1_spinup     1      NA            8:00:00  simulation  22910240  submitted
    # 2 scen1_transient  2      scen1_spinup  4:00:00  simulation  22910241  submitted

    # Same but by using the pipe operator
    library(magrittr)

    run_details <- tibble(
      sim_name = c("scen1_spinup", "scen1_transient"),
      random_seed = as.integer(c(1, 42)),
      dependency = c(NA, "scen1_spinup"),
      wtime = c("8:00:00", "4:00:00")
    ) %>%
      write_config(model_path, sim_path) %>%
      submit_lpjml(model_path, sim_path)

    # Shortcut approach
    run_details <- submit_lpjml(
      x = "./config_scen1_transient.json",
      model_path = model_path,
      sim_path = sim_path
    )

    run_details <- submit_lpjml(
      c("./config_scen1_spinup.json", "./config_scen1_transient.json"),
      model_path,
      sim_path
    )

    ## End(Not run)
Function to extract a subset of the full data in an LPJmLData object by applying selections along one or several of its dimensions.
    ## S3 method for class 'LPJmLData'
    subset(x, ...)
x | An LPJmLData object |
... | One or several key-value combinations where keys represent the dimension names and values represent the requested elements along these dimensions. Subsets may either specify integer indices, e.g. |
An LPJmLData object with dimensions resulting from the selection in subset. Meta data are updated as well.
    ## Not run: 
    vegc <- read_io(filename = "./vegc.bin.json")

    # Subset cells by index
    subset(vegc, cell = seq(27410, 27415))
    # [...]
    # $data |>
    #   dimnames() |>
    #     .$cell  "27409" "27410" "27411" "27412" "27413" "27414"
    #     .$time  "1901-12-31" "1902-12-31" "1903-12-31" "1904-12-31" ...
    #     .$band  "1"
    # [...]

    # Subset time by character vector
    subset(vegc, time = c("2001-12-31", "2002-12-31", "2003-12-31"))
    # [...]
    # $data |>
    #   dimnames() |>
    #     .$cell  "0" "1" "2" "3" ... "67419"
    #     .$time  "2001-12-31" "2002-12-31" "2003-12-31"
    #     .$band  "1"
    # [...]

    ## End(Not run)
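The example output above shows that integer indices select by position (index 27410 returns the cell named "27409", since cell names start at "0"), whereas character vectors select by dimension name. The same distinction can be reproduced on a plain base R array. This is only a toy analogue; it does not use an LPJmLData object, and the array contents are made up.

```r
# Toy array with zero-based cell names, mimicking the dimnames shown above
toy <- array(
  seq_len(4 * 3),
  dim = c(4, 3),
  dimnames = list(
    cell = as.character(0:3),
    time = c("2001-12-31", "2002-12-31", "2003-12-31")
  )
)

# Integer indices select by position: positions 2 and 3 are the cells
# named "1" and "2"
by_index <- toy[2:3, , drop = FALSE]
dimnames(by_index)$cell

# Character vectors select by dimension name
by_name <- toy[, c("2001-12-31", "2003-12-31"), drop = FALSE]
dimnames(by_name)$time
```

Using drop = FALSE keeps all dimensions in place even when a selection has length one, which is useful when later code relies on the dimension names.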
Function to get the summary of the data array of an LPJmLData object. See also summary.
    ## S3 method for class 'LPJmLData'
    summary(object, ...)
object | LPJmLData object |
... | Further arguments: |
A summary for an object of class matrix (see summary) for the selected dimension(s) and, if defined, the subset.
Function to transform an LPJmLData object into another space format or another time format. Combinations of space and time formats are also possible.
transform(x, to)
x | An LPJmLData object. |
to | A character vector defining space and/or time format into which the corresponding data dimensions should be transformed. Choose from space formats |
An LPJmLData object in the selected format.
    ## Not run: 
    runoff <- read_io(
      filename = "runoff.bin.json",
      subset = list(year = as.character(1991:2000))
    )

    # Transform into space format "lon_lat". This assumes a "grid.bin.json" file
    # is present in the same directory as "runoff.bin.json".
    transform(runoff, to = "lon_lat")
    # [...]
    # $data |>
    #   dimnames() |>
    #     .$lat  "-55.75" "-55.25" "-54.75" "-54.25" ... "83.75"
    #     .$lon  "-179.75" "-179.25" "-178.75" "-178.25" ... "179.75"
    #     .$time  "1991-01-31" "1991-02-28" "1991-03-31" "1991-04-30" ...
    #     .$band  "1"
    # [...]

    # Transform time format from a single time dimension into separate dimensions
    # for years, months, and days. Dimensions for time steps not present in the
    # data are omitted, i.e. no "day" dimension for monthly data.
    transform(runoff, to = "year_month_day")
    # [...]
    # $data |>
    #   dimnames() |>
    #     .$lat  "-55.75" "-55.25" "-54.75" "-54.25" ... "83.75"
    #     .$lon  "-179.75" "-179.25" "-178.75" "-178.25" ... "179.75"
    #     .$month  "1" "2" "3" "4" ... "12"
    #     .$year  "1991" "1992" "1993" "1994" ... "2000"
    #     .$band  "1"
    # [...]

    ## End(Not run)
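To make the time transformation in the example more concrete, the following base R sketch splits a single time dimension with month-end date labels into separate month and year dimensions. It works on a made-up toy array and is not how transform() is implemented; it only illustrates the reshaping idea, assuming complete years of monthly data in chronological order.

```r
# Toy monthly data: 2 cells x 24 time steps (1991-1992) with month-end labels
dates <- seq(as.Date("1991-02-01"), by = "month", length.out = 24) - 1
toy <- array(
  seq_len(2 * 24),
  dim = c(2, 24),
  dimnames = list(cell = c("0", "1"), time = format(dates))
)

# Derive the labels of the new dimensions from the date labels
years <- unique(format(dates, "%Y"))
months <- unique(as.character(as.integer(format(dates, "%m"))))

# Time runs month-within-year and R arrays are column-major, so the time
# dimension can be split into (month, year) by re-declaring the dimensions.
reshaped <- array(
  toy,
  dim = c(2, length(months), length(years)),
  dimnames = list(cell = dimnames(toy)$cell, month = months, year = years)
)

# The same value is reachable through both layouts
reshaped["1", "3", "1992"] == toy["1", "1992-03-31"]
```

No "day" dimension appears because the toy data are monthly, in line with the comment in the example that time steps not present in the data are omitted.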
Requires a tibble (modern data.frame class) in a specific format (see details and examples) to write the model configuration file "config_*.json". Each row in the tibble corresponds to a model run. The generated "config_*.json" is based on a cjson file (e.g. "lpjml_config.cjson").
    write_config(
      x,
      model_path,
      sim_path = NULL,
      output_list = c(),
      output_list_timestep = "annual",
      output_format = NULL,
      cjson_filename = "lpjml_config.cjson",
      parallel_cores = 4,
      debug = FALSE,
      params = NULL,
      output_path = NULL,
      js_filename = NULL
    )
x | A tibble in a defined format (see details). |
model_path | Character string providing the path to LPJmL (equal to |
sim_path | Character string defining path where all simulation data are written. Also an output, a restart and a configuration folder are created in |
output_list | Character vector containing the |
output_list_timestep | Single character string or character vector defining what temporal resolution the defined outputs from |
output_format | Character string defining the format of the output. Defaults to NULL. |
cjson_filename | Character string providing the name of the main LPJmL configuration file to be parsed. Defaults to "lpjml_config.cjson". |
parallel_cores | Integer defining the number of available CPU cores for parallelization. Defaults to 4. |
debug | logical If |
params | Argument is deprecated as of version 1.0; use x instead. |
output_path | Argument is deprecated as of version 1.0; use sim_path instead. |
js_filename | Argument is deprecated as of version 1.3; use cjson_filename instead. |
Supply a tibble for x in which each row represents a configuration (config) for an LPJmL simulation. Here a config refers to a precompiled "lpjml_config.cjson" file (or the file name provided as the cjson_filename argument) which already contains all the information from the mandatory cjson files. The precompilation is done internally by write_config().
write_config() uses the column names of x as keys for the config json using the same syntax as lists, e.g. "k_temp" from "param.js" can be accessed with "param$k_temp" or "param[["k_temp"]]" as the column name. (The former point-style syntax, "param.k_temp", is still valid but deprecated.)
For each run, and thus each row, a value has to be specified for every column of the tibble. To keep the original value instead, insert NA.
Each run is identified by its "sim_name", which is mandatory.
    my_params1 <- tibble(
      sim_name = c("scenario1", "scenario2"),
      random_seed = c(42, 404),
      `pftpar[[1]]$name` = c("first_tree", NA),
      `param$k_temp` = c(NA, 0.03),
      new_phenology = c(TRUE, FALSE)
    )

    my_params1
    # A tibble: 2 x 5
    #   sim_name   random_seed  `pftpar[[1]]$name`  `param$k_temp`  new_phenology
    #   <chr>            <dbl>  <chr>                        <dbl>  <lgl>
    # 1 scenario1           42  first_tree                   NA     TRUE
    # 2 scenario2          404  NA                            0.03  FALSE
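How column names written in list syntax can address entries of a nested config is easiest to see on a small example. The sketch below applies one run (one row of the tibble) to a toy nested list. Everything in it is an assumption for illustration: the config values are made up, apply_run_sketch() is a hypothetical helper, and evaluating the key as R code is just one possible way to realise the list syntax, not necessarily what write_config() does internally.

```r
# Toy config list standing in for a parsed config json (values are made up)
config <- list(
  random_seed = 2,
  new_phenology = FALSE,
  param = list(k_temp = 0.02),
  pftpar = list(list(name = "some_tree"))
)

# One run as a named list; NA means "keep the original value"
run <- list(
  random_seed = 42,
  `pftpar[[1]]$name` = "first_tree",
  `param$k_temp` = NA,
  new_phenology = TRUE
)

# apply_run_sketch() is hypothetical, not the package's implementation.
apply_run_sketch <- function(config, run) {
  for (key in names(run)) {
    value <- run[[key]]
    if (is.na(value)) next
    # Build and evaluate e.g. `config$pftpar[[1]]$name <- value`
    eval(parse(text = paste0("config$", key, " <- value")))
  }
  config
}

updated <- apply_run_sketch(config, run)
str(updated)
```

The NA entry for `param$k_temp` leaves the original 0.02 in place, which corresponds to the rule that NA keeps the original value. Evaluating strings as code is only acceptable here because the keys come from the user's own tibble.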
To set up spin-up and transient runs, where transient runs depend on the spin-up(s), a parameter "dependency" has to be defined as a column in the tibble. It links simulations with each other using the "sim_name".
Do not manually set "-DFROM_RESTART" when using "dependency". The same applies to the LPJmL config settings "restart", "write_restart", "write_restart_filename" and "restart_filename", which are set automatically by this function.
This way multiple runs can be performed in succession and build a chain or tree of, in principle, unlimited length.
    # With dependent runs.
    my_params3 <- tibble(
      sim_name = c("scen1_spinup", "scen1_transient"),
      random_seed = c(42, 404),
      dependency = c(NA, "scen1_spinup")
    )

    my_params3
    # A tibble: 2 x 3
    #   sim_name         random_seed  dependency
    #   <chr>                  <dbl>  <chr>
    # 1 scen1_spinup              42  NA
    # 2 scen1_transient          404  scen1_spinup
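The examples in this documentation show an "order" column of 1 for a spin-up run and 2 for the transient run that depends on it. A minimal sketch of how such an order could be derived from the "dependency" column is given below. It is an assumption about the logic, not the package's code: order_sketch() is a hypothetical helper, a plain data.frame stands in for the tibble, and a third run is added to show a longer chain.

```r
runs <- data.frame(
  sim_name = c("scen1_spinup", "scen1_transient", "scen1_future"),
  dependency = c(NA, "scen1_spinup", "scen1_transient"),
  stringsAsFactors = FALSE
)

# order_sketch() is hypothetical: runs without a dependency get order 1,
# every other run gets the order of the run it depends on plus one.
order_sketch <- function(runs) {
  ord <- rep(NA_integer_, nrow(runs))
  ord[is.na(runs$dependency)] <- 1L
  while (anyNA(ord)) {
    parent_ord <- ord[match(runs$dependency, runs$sim_name)]
    ready <- is.na(ord) & !is.na(parent_ord)
    if (!any(ready)) stop("Unresolvable or circular dependency")
    ord[ready] <- parent_ord[ready] + 1L
  }
  ord
}

runs$order <- order_sketch(runs)
runs
```

The loop stops with an error if a dependency names a run that does not exist or if runs depend on each other in a cycle, two situations in which no valid order can be assigned.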
Another feature is to define SLURM options for each simulation (row) separately. For example, users may want to set a lower wall clock limit (wtime) for the transient run than for the spin-up run to get a higher priority in the SLURM queue. This can be achieved by supplying this option as a column of x.
Six options are available, namely sclass, ntasks, wtime, blocking, constraint and slurm_options. Use them like the corresponding arguments of submit_lpjml(). If specified in x, they overwrite the corresponding function arguments in submit_lpjml().
    my_params4 <- tibble(
      sim_name = c("scen1_spinup", "scen1_transient"),
      random_seed = c(42, 404),
      dependency = c(NA, "scen1_spinup"),
      wtime = c("8:00:00", "2:00:00")
    )

    my_params4
    # A tibble: 2 x 4
    #   sim_name         random_seed  dependency    wtime
    #   <chr>                  <dbl>  <chr>         <chr>
    # 1 scen1_spinup              42  NA            8:00:00
    # 2 scen1_transient          404  scen1_spinup  2:00:00
To set a macro (e.g. "MY_MACRO" or "CHECKPOINT"), provide it as a column of the tibble as you would do with a flag in the shell: "-DMY_MACRO" or "-DCHECKPOINT".
Wrap macro names in backticks, otherwise tibble will raise an error, as starting an object name with "-" is not allowed in R.
    my_params2 <- tibble(
      sim_name = c("scen1_spinup", "scen1_transient"),
      random_seed = c(42, 404),
      `-DMY_MACRO` = c(TRUE, FALSE)
    )

    my_params2
    # A tibble: 2 x 3
    #   sim_name         random_seed  `-DMY_MACRO`
    #   <chr>                  <dbl>  <lgl>
    # 1 scen1_spinup              42  TRUE
    # 2 scen1_transient          404  FALSE
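As an illustration of how logical macro columns relate to shell-style flags, the following base R sketch collects, for each run, the macro columns that are set to TRUE. It is a sketch under assumptions: a data.frame with check.names = FALSE stands in for the tibble, the second macro column is added only for illustration, and what write_config() does with the flags afterwards is not shown.

```r
params <- data.frame(
  sim_name = c("scen1_spinup", "scen1_transient"),
  `-DMY_MACRO` = c(TRUE, FALSE),
  `-DCHECKPOINT` = c(TRUE, TRUE),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Macro columns are recognised by their "-D" prefix
macro_cols <- grep("^-D", names(params), value = TRUE)

# For each run, keep the macros that are switched on
flags <- lapply(seq_len(nrow(params)), function(i) {
  active <- vapply(
    macro_cols,
    function(col) isTRUE(params[[col]][i]),
    logical(1)
  )
  macro_cols[active]
})
names(flags) <- params$sim_name

flags
```

Note that check.names = FALSE plays the same role for data.frame() as the backticks do in the tibble() call: without it, base R would rewrite the column names into syntactically valid ones.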
write_config() creates the following subdirectories within the sim_path directory:
"./configurations" to store the config files.
"./output" to store the output within subdirectories for each sim_name.
"./restart" to store the restart files within subdirectories for each sim_name.
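The resulting directory layout can be reproduced with a few lines of base R, which may help when inspecting or cleaning up a simulation folder. The sketch below builds the layout described above inside a temporary directory. The simulation names are taken from the examples, and the code is an illustration rather than what write_config() runs internally.

```r
# Use a temporary directory so the sketch does not touch real simulation data
sim_path <- file.path(tempdir(), "my_runs")
sim_names <- c("scen1_spinup", "scen1_transient")

dirs <- c(
  file.path(sim_path, "configurations"),
  file.path(sim_path, "output", sim_names),
  file.path(sim_path, "restart", sim_names)
)

# recursive = TRUE also creates the parent folders "output" and "restart"
for (d in dirs) dir.create(d, recursive = TRUE, showWarnings = FALSE)

list.dirs(sim_path, full.names = FALSE, recursive = TRUE)
```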
The list syntax (e.g. pftpar[[1]]$name) makes it possible to create column names, and thus keys, for accessing values in the config json.
The column "sim_name" is mandatory (used as an identifier).
The run parameter "dependency" is optional but enables interdependent consecutive runs using submit_lpjml().
SLURM options in x allow different values to be used per run.
If NA is specified as cell value, the original value is used.
R booleans/logical constants TRUE and FALSE are to be used for boolean parameters in the config json.
Value types need to be set correctly, e.g. no strings where numeric values are expected.
A tibble with at least one column named "sim_name". Run parameters "order" and "dependency" are included if defined in x. A tibble in this format is required for submit_lpjml().
    ## Not run: 
    library(tibble)

    model_path <- "./LPJmL_internal"
    sim_path <- "./my_runs"

    # Basic usage
    my_params <- tibble(
      sim_name = c("scen1", "scen2"),
      random_seed = c(12, 404),
      `pftpar[[1]]$name` = c("first_tree", NA),
      `param$k_temp` = c(NA, 0.03),
      new_phenology = c(TRUE, FALSE)
    )

    config_details <- write_config(
      x = my_params,
      model_path = model_path,
      sim_path = sim_path
    )

    config_details
    # A tibble: 2 x 1
    #   sim_name
    #   <chr>
    # 1 scen1
    # 2 scen2

    # Usage with dependency
    my_params <- tibble(
      sim_name = c("scen1_spinup", "scen1_transient"),
      random_seed = c(42, 404),
      dependency = c(NA, "scen1_spinup")
    )

    config_details <- write_config(
      x = my_params,
      model_path = model_path,
      sim_path = sim_path
    )

    config_details
    # A tibble: 2 x 3
    #   sim_name         order  dependency
    #   <chr>            <dbl>  <chr>
    # 1 scen1_spinup         1  NA
    # 2 scen1_transient      2  scen1_spinup

    # Usage with dependency and SLURM option wtime
    my_params <- tibble(
      sim_name = c("scen1_spinup", "scen1_transient"),
      random_seed = c(42, 404),
      dependency = c(NA, "scen1_spinup"),
      wtime = c("8:00:00", "2:00:00")
    )

    config_details <- write_config(
      x = my_params,
      model_path = model_path,
      sim_path = sim_path
    )

    config_details
    # A tibble: 2 x 4
    #   sim_name         order  dependency    wtime
    #   <chr>            <dbl>  <chr>         <chr>
    # 1 scen1_spinup         1  NA            8:00:00
    # 2 scen1_transient      2  scen1_spinup  2:00:00

    ## End(Not run)
Write an LPJmL clm header to a file. The header has to be a list following the structure returned by read_header() or create_header(). The function will fail if the output file exists already unless overwrite is set to TRUE.
write_header(filename, header, overwrite = FALSE)
filename | Filename to write header into. |
header | The header to be written. |
overwrite | Whether to overwrite an existing output file (default FALSE). |
Returns filename invisibly.
create_header() for creating headers from scratch and for a more detailed description of the LPJmL header format.
read_header() for reading headers from files.
    ## Not run: 
    header <- read_header(filename = "old_filename.clm")

    write_header(
      filename = "new_filename.clm",
      header = header,
      overwrite = FALSE
    )

    ## End(Not run)
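The behaviour described for write_header() (fail on an existing file unless overwrite = TRUE, and return the file name invisibly) follows a common pattern that can be sketched in base R. The helper write_guarded() below is hypothetical and writes plain text lines rather than a binary LPJmL header; it only illustrates the overwrite guard and the invisible return value.

```r
# write_guarded() is a hypothetical helper; it writes plain text, not a
# real LPJmL header.
write_guarded <- function(filename, lines, overwrite = FALSE) {
  if (file.exists(filename) && !overwrite) {
    stop("File ", filename, " exists already. ",
         "Set overwrite = TRUE to replace it.")
  }
  writeLines(lines, filename)
  invisible(filename)
}

target <- tempfile(fileext = ".txt")

# First write succeeds because the file does not exist yet
write_guarded(target, "first version")

# Second write without overwrite = TRUE is refused
result <- tryCatch(
  write_guarded(target, "second version"),
  error = function(e) "refused"
)
result

# Explicitly allowing the overwrite replaces the content
write_guarded(target, "second version", overwrite = TRUE)
readLines(target)
```

Returning the file name invisibly keeps the console quiet in interactive use while still allowing the result to be assigned or piped into a follow-up call.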