Title: | Regression analysis based on global datasets |
---|---|
Description: | Model estimates parameters of model functions. |
Authors: | Benjamin Leon Bodirsky [aut, cre], Antonia Walther [aut], Xiaoxi Wang [aut], Abhijeet Mishra [aut], Eleonora Martinelli [aut] |
Maintainer: | Benjamin Leon Bodirsky <[email protected]> |
License: | LGPL-3 | file LICENSE |
Version: | 1.1.5 |
Built: | 2025-01-17 05:40:50 UTC |
Source: | https://github.com/pik-piam/regressionworlddata |
Package contains functions to estimate model parameters
Package: | regressionworlddata |
Type: | Package |
Version: | 0.1 |
Date: | 2016-09-23 |
License: | LGPL-3 |
LazyLoad: | yes |
Benjamin Leon Bodirsky, Antonia Walther
Maintainer: Benjamin Leon Bodirsky <[email protected]>
Collects regression data from unconverted raw data sources and crops the data so that only the years and countries present in all sources are retained.
calcCollectRegressionData(datasources)
datasources |
All datasources that shall be returned. Due to the cropping of data which is not present in all datasources, reducing the number of datasources will increase the number of observations. |
List of magpie objects with results on country level, weight on country level, unit and description.
Benjamin Leon Bodirsky, Eleonora Martinelli, Abhijeet Mishra, Xiaoxi Wang
## Not run:
calcOutput("CollectRegressionData", aggregate = FALSE)
## End(Not run)
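The cropping behaviour described above can be illustrated with a minimal base R sketch. It assumes plain named matrices (countries by years) instead of magpie objects; the toy sources `gdp`, `pop`, `diet` and the helper `crop_to_joint` are hypothetical and not part of the package. It also shows why dropping a data source can increase the number of observations.

```r
# Hypothetical toy sources: named matrices (countries x years), not magpie objects.
gdp <- matrix(1:12, nrow = 3,
              dimnames = list(c("DEU", "FRA", "IND"),
                              c("y2000", "y2005", "y2010", "y2015")))
pop <- matrix(101:106, nrow = 2,
              dimnames = list(c("FRA", "IND"),
                              c("y2005", "y2010", "y2015")))
diet <- matrix(201:209, nrow = 3,
               dimnames = list(c("DEU", "FRA", "IND"),
                               c("y2000", "y2005", "y2010")))

# Hypothetical helper: keep only countries and years present in every source.
crop_to_joint <- function(sources) {
  countries <- Reduce(intersect, lapply(sources, rownames))
  years <- Reduce(intersect, lapply(sources, colnames))
  lapply(sources, function(m) m[countries, years, drop = FALSE])
}

all_three <- crop_to_joint(list(gdp = gdp, pop = pop, diet = diet))
two_only <- crop_to_joint(list(gdp = gdp, diet = diet))

dim(all_three$gdp)  # joint coverage of all three sources
dim(two_only$gdp)   # larger joint coverage once pop is left out
```

In this toy setup, `pop` lacks one country and one year, so leaving it out enlarges the joint set of countries and years.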
Adds lines for specific countries to a plot created by nlsregression. The nlsregression call has to be based on magpie objects for x, y and weight.
nlsAddLines( y, x, countries = 1:5, weight = NULL, x_log10 = FALSE, colors = "black", labels = TRUE )
y |
magpie object with y values |
x |
magpie object with x values |
countries |
Choice of countries |
weight |
magpie object with weight |
x_log10 |
same as in nlsregression |
colors |
colors of the lines |
labels |
If TRUE, the region, start year and end year will be plotted next to each line. |
vector with ISO-countrycodes
Benjamin Leon Bodirsky
## Not run:
data(population_magpie)
nlsregression(y=population_magpie[,,1],x=population_magpie[,,2],
  weight = population_magpie[,,1],func = y~a*x+b)
nlsAddLines(y=population_magpie[,,1],x=population_magpie[,,2],
  weight = population_magpie[,,1],countries=1:3,colors=1:3)
## End(Not run)
Estimates regression parameters and plots the fit for any user-supplied function with no more than two independent variables.
nlsregression( func, y, x, z = NULL, startvalues = NULL, weight = NULL, weighting = TRUE, xlab = NULL, ylab = "y", header = NULL, z_plot_lines = NULL, weightcolorpoints = TRUE, x_log10 = FALSE, toPlot = "all", plot_x_function = "ignore", regressioncolor = "blue", weight_threshold = NULL, crossvalid = NULL, ... )
func |
function that shall be fitted. The function should contain the dependent variable y and the independent variable x, and optionally a second independent variable z. All other unknowns are treated as parameters to be estimated. |
y |
dependent variable, vector |
x |
independent variable, vector |
z |
optional independent variable, vector |
startvalues |
The optimization algorithm may require starting values for the fitting procedure. Provide them in a list with the parameter names, e.g. list(a=3,b=2) |
weight |
optional weight, vector |
weighting |
if weighting is TRUE, the fit will minimize the weighted residuals |
xlab |
name of x axis in plot |
ylab |
name of y axis in plot |
header |
plot function main argument |
z_plot_lines |
vector with more than one value of z for which regression lines shall be plotted into the graph |
weightcolorpoints |
if TRUE, the points are clustered into three quantiles according to their weight and coloured lighter for low weights. |
x_log10 |
allows log10 scale for X axis if set to TRUE. Only changes the picture, not the regression! |
toPlot |
"all", "frame" (axis etc), "observations" (points), "regressionline" (line), "infos" (parameters, R2) |
plot_x_function |
deprecated, please do not use in the function call. |
regressioncolor |
color of regression line and parameter text |
weight_threshold |
if numeric, all countries below this threshold will be excluded (e.g. to exclude minor islands) |
crossvalid |
vector with boolean values, indicating which data should be excluded from sampling and rather be used for validation |
... |
will be passed on to function nls |
A plot and the estimated regression parameters, or possibly an error if the fit fails.
Benjamin Leon Bodirsky, Susanne Rolinski, Xiaoxi Wang
## Not run:
x=1:10
y=(1:10)^2+1
z=c(10:1)
# one independent variable
nlsregression(func=y~a*x+b,y=y,x=x,startvalues=list(a=1,b=1))
# two independent variables
nlsregression(func=y~a*x^1.1+b*z+c*x,y=y,x=x,z=z,startvalues=list(a=1,b=1,c=0))
# no fit because residuals are zero (excluded by the nls authors for statistical reasons)
nlsregression(func=y~x^a+b,y=y,x=x,z=z,startvalues=list(a=1,b=1,c=0))
DNase1 <- subset(DNase, Run == 1)
DNase1$sets<- c(rep(1,8),rep(2,8))
nlsregression(func=y~a*x+b,y=DNase1$density,x=DNase1$conc,startvalues=list(a=1,b=1))
nlsregression(func=y~a*x+b*z,y=DNase1$density,x=DNase1$conc,z=DNase1$sets,
  startvalues=list(a=0.1344,b=0.2597))
nlsregression(func=y~a*x+b*z,y=DNase1$density,x=DNase1$conc,z=DNase1$sets,
  startvalues=list(a=0.1344,b=0.2597),plot_x_function=log)
## End(Not run)
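As a rough illustration of what the weight and crossvalid arguments describe, the following sketch fits a weighted model directly with stats::nls and evaluates it on held-out points. It is not the internal code of nlsregression; the synthetic data, the weights and the `holdout` mask are assumptions made for the example.

```r
set.seed(1)
x <- seq(1, 20, length.out = 40)
y <- 2 * x + 5 + rnorm(40, sd = 1)            # synthetic data, true a = 2, b = 5
w <- runif(40, min = 1, max = 100)            # hypothetical weights, e.g. population
holdout <- rep(c(FALSE, FALSE, FALSE, TRUE), 10)  # every 4th point kept for validation

# Fit only on the points that are not held out, minimizing weighted residuals.
train <- data.frame(x = x, y = y, w = w)[!holdout, ]
fit <- nls(y ~ a * x + b, data = train,
           start = list(a = 1, b = 1), weights = w)

# Validate on the held-out points.
pred <- predict(fit, newdata = data.frame(x = x[holdout]))
rmse_holdout <- sqrt(mean((y[holdout] - pred)^2))

coef(fit)
rmse_holdout
```

Noise is added on purpose: as the examples above note, nls refuses to fit artificial zero-residual data.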
Returns a robust estimate of the variance-covariance matrix.
robust_vce(x)
x |
regression model |
a robust estimate of the variance-covariance matrix and the corresponding t-values and p-values for the estimated coefficients
Xiaoxi Wang
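The documentation does not state which robust estimator robust_vce uses. As a hedged illustration of the general idea only, the sketch below computes a White (HC0) sandwich estimate for a linear model in base R, together with t-values and p-values. The helper name `hc0_vcov` and the synthetic data are hypothetical, and the package function may differ, for example by operating on nls fits.

```r
set.seed(42)
n <- 200
x <- runif(n, 1, 10)
y <- 1 + 0.5 * x + rnorm(n, sd = 0.3 * x)   # heteroskedastic noise

model <- lm(y ~ x)

# Hypothetical helper: HC0 sandwich estimator (X'X)^-1 X' diag(e^2) X (X'X)^-1.
hc0_vcov <- function(model) {
  X <- model.matrix(model)
  e <- residuals(model)
  bread <- solve(crossprod(X))   # (X'X)^-1
  meat <- crossprod(X * e)       # X' diag(e^2) X, row i of X scaled by e_i
  bread %*% meat %*% bread
}

V <- hc0_vcov(model)
se <- sqrt(diag(V))
t_value <- coef(model) / se
p_value <- 2 * pt(abs(t_value), df = df.residual(model), lower.tail = FALSE)

cbind(estimate = coef(model), robust_se = se, t_value, p_value)
```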
todo
toolCollectRegressionVariables(indicators)
indicators |
todo |
todo
Benjamin Leon Bodirsky
Regression model for the correlation of a denominator and quotient to the GDP, allowing for an additional driver z next to income.
toolRegression( denominator, quotient = NULL, func = y ~ (a * x)/(b + x), x = "IHME_USD05_PPP_pc", z = NULL, ylab = NULL, xlab = NULL, data = NULL, countries_nlsAddLines = NULL, weight = "pop", x_log10 = FALSE, crossvalid_sample = NULL, crossvalid_drawing = 1, ... )
denominator |
denominator of the dependent variable that shall be estimated using the regression |
quotient |
quotient of the dependent variable that shall be estimated using the regression |
func |
functional relation for the regression, shall be in the format y~f(x,...) with x being GDP, y being denominator/quotient, and f() being any type of functional relationship. ... can include either z or parameters to be estimated. |
x |
independent variable, by default income |
z |
additional independent variable |
ylab |
name of y axis |
xlab |
name of x axis |
data |
data can be provided if it shall not be derived by mrcommons:::calcCollectFoodDemandRegressionData() |
countries_nlsAddLines |
the number of countries with the highest weight, or the names of the countries, that shall be plotted as lines in the plot |
weight |
the weight |
x_log10 |
passed on to nlsregression() |
crossvalid_sample |
sample name from madrat used for cross-validation. The name is built as crossvalid_seedX_kY, where X is the random seed and Y is the number of drawings. The combination of all drawings is the full sample. |
crossvalid_drawing |
selected drawing out of the k drawings in crossvalid_sample |
... |
further arguments that will be passed on to nlsregression(): an additional explanatory variable z can be added, a regression model has to be chosen, and start values can be predetermined. |
regression plot and the parameters from nlsregression
Antonia Walther, Benjamin Leon Bodirsky
## Not run:
toolRegression(denominator=livestock,
  func=y~(a*x)/(b+x),
  z=NULL,
  startvalues=list(a=1100,b=7770)
)
toolRegression(denominator=findset("kap"),
  quotient=findset("kfo"),
  func=y~(a*x)/(b+x),
  z=NULL,
  startvalues=list(a=0.5,b=7770)
)
## End(Not run)
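The default functional form y ~ (a * x)/(b + x) and the seed/drawing naming of crossvalid_sample can be sketched with base R on synthetic data. This is only an illustration under assumptions: the variables `income`, `share` and `pop` are invented, and whether the selected drawing serves as the validation set (as assumed here) is not stated in the documentation.

```r
set.seed(7)     # plays the role of X in crossvalid_seedX_kY
k <- 5          # plays the role of Y, the number of drawings
n <- 100

income <- runif(n, 500, 50000)   # hypothetical per-capita income
share <- (0.6 * income) / (8000 + income) + rnorm(n, sd = 0.02)
pop <- runif(n, 1, 1000)         # hypothetical weights

# Assign every observation to exactly one of the k drawings.
drawing <- sample(rep(seq_len(k), length.out = n))

selected <- 1   # analogous to crossvalid_drawing (assumed to be the validation set)
d <- data.frame(x = income, y = share, w = pop)

# Saturation curve: y approaches a for large x, b is the x value where y = a / 2.
fit <- nls(y ~ (a * x) / (b + x), data = d[drawing != selected, ],
           start = list(a = 0.5, b = 7770), weights = w)

pred <- predict(fit, newdata = d[drawing == selected, ])
rmse <- sqrt(mean((d$y[drawing == selected] - pred)^2))

coef(fit)
rmse
```

Because all drawings together cover every observation once, repeating the fit for each value of `selected` gives a k-fold validation over the full sample.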
Creates a regression for the selected options and saves the calculated parameters in the table.
toolRegressionTable( scenario = "SSP2", x = "IHME_USD05_PPP_pc", denominator = NA, z = NA, regression_database_file = "scenario_database_regressionworlddata.csv", quotient = "pop", start_1 = NA, start_2 = NA, start_3 = NA, start_4 = NA, start_5 = NA, start_6 = NA, return_value = FALSE )
scenario |
vector. Default "SSP2". Can be "SSP1", "SSP2", "SSP3", "SSP4", "SSP5" or "mix" and describes the overall scenario of the projection. |
x |
independent variable |
denominator |
vector. Default NA. Specific fooddenominator share to make projection for. |
z |
other independent variables |
regression_database_file |
file with regressions to calculate |
quotient |
vector. Default is population ("pop") |
start_1 |
Default NA. Startvalue for 1st parameter. |
start_2 |
Default NA. Startvalue for 2nd parameter. |
start_3 |
Default NA. Startvalue for 3rd parameter. |
start_4 |
Default NA. Startvalue for 4th parameter. |
start_5 |
Default NA. Startvalue for 5th parameter. |
start_6 |
Default NA. Startvalue for 6th parameter. |
return_value |
Defaults to FALSE, which prevents the updated dataset from being printed to the console. If you would like to keep the updated dataset as an object, set this to TRUE. |
data frame with additional rows containing parameters of newly calculated regression.
Abhijeet Mishra, Eleonora Martinelli