Title: | Distribution Fitting and Evaluation |
---|---|
Description: | A library of density, distribution function, quantile function, (bounded) raw moments and random generation for a collection of distributions relevant for the firm size literature. Additionally, the package contains tools to fit these distributions using maximum likelihood and evaluate these distributions based on (i) log-likelihood ratio and (ii) deviations between the empirical and parametrically implied moments of the distributions. We add flexibility by allowing the considered distributions to be combined into piecewise composite or finite mixture distributions, as well as to be used when truncated. See Dewitte (2020) <https://hdl.handle.net/1854/LU-8644700> for a description and application of methods available in this package. |
Authors: | Ruben Dewitte [aut, cre] |
Maintainer: | Ruben Dewitte <[email protected]> |
License: | GPL-3 |
Version: | 0.0.6 |
Built: | 2024-11-13 03:43:58 UTC |
Source: | https://github.com/cran/distributionsrd |
Density, distribution function, quantile function, raw moments and random generation for the Burr distribution, also known as the Burr Type XII distribution or the Singh-Maddala distribution.
dburr(x, shape1 = 2, shape2 = 1, scale = 0.5, log = FALSE) pburr(q, shape1 = 2, shape2 = 1, scale = 0.5, log.p = FALSE, lower.tail = TRUE) qburr(p, shape1 = 2, shape2 = 1, scale = 0.5, log.p = FALSE, lower.tail = TRUE) mburr( r = 0, truncation = 0, shape1 = 2, shape2 = 1, scale = 0.5, lower.tail = TRUE ) rburr(n, shape1 = 2, shape2 = 1, scale = 0.5)
dburr(x, shape1 = 2, shape2 = 1, scale = 0.5, log = FALSE) pburr(q, shape1 = 2, shape2 = 1, scale = 0.5, log.p = FALSE, lower.tail = TRUE) qburr(p, shape1 = 2, shape2 = 1, scale = 0.5, log.p = FALSE, lower.tail = TRUE) mburr( r = 0, truncation = 0, shape1 = 2, shape2 = 1, scale = 0.5, lower.tail = TRUE ) rburr(n, shape1 = 2, shape2 = 1, scale = 0.5)
x , q
|
vector of quantiles |
shape1 , shape2 , scale
|
Shape1, shape2 and scale of the Burr distribution, defaults to 2, 1 and 0.5. |
log , log.p
|
logical; if TRUE, probabilities p are given as log(p). |
lower.tail |
logical; if TRUE (default), probabilities (moments) are |
p |
vector of probabilities |
r |
rth raw moment of the distribution |
truncation |
lower truncation parameter |
n |
number of observations |
Probability and Cumulative Distribution Function:
The y-bounded r-th raw moment of the Fréchet distribution equals:
dburr returns the density, pburr the distribution function, qburr the quantile function, mburr the rth moment of the distribution and rburr generates random deviates.
The length of the result is determined by n for rburr, and is the maximum of the lengths of the numerical arguments for the other functions.
## Burr density plot(x = seq(0, 5, length.out = 100), y = dburr(x = seq(0, 5, length.out = 100))) plot(x = seq(0, 5, length.out = 100), y = dburr(x = seq(0, 5, length.out = 100), shape2 = 3)) ## Demonstration of log functionality for probability and quantile function qburr(pburr(2, log.p = TRUE), log.p = TRUE) ## The zeroth truncated moment is equivalent to the probability function pburr(2) mburr(truncation = 2) ## The (truncated) first moment is equivalent to the mean of a #(truncated) random sample, for large enough samples. x <- rburr(1e5, shape2 = 3) mean(x) mburr(r = 1, shape2 = 3, lower.tail = FALSE) sum(x[x > quantile(x, 0.1)]) / length(x) mburr(r = 1, shape2 = 3, truncation = quantile(x, 0.1), lower.tail = FALSE)
## Burr density plot(x = seq(0, 5, length.out = 100), y = dburr(x = seq(0, 5, length.out = 100))) plot(x = seq(0, 5, length.out = 100), y = dburr(x = seq(0, 5, length.out = 100), shape2 = 3)) ## Demonstration of log functionality for probability and quantile function qburr(pburr(2, log.p = TRUE), log.p = TRUE) ## The zeroth truncated moment is equivalent to the probability function pburr(2) mburr(truncation = 2) ## The (truncated) first moment is equivalent to the mean of a #(truncated) random sample, for large enough samples. x <- rburr(1e5, shape2 = 3) mean(x) mburr(r = 1, shape2 = 3, lower.tail = FALSE) sum(x[x > quantile(x, 0.1)]) / length(x) mburr(r = 1, shape2 = 3, truncation = quantile(x, 0.1), lower.tail = FALSE)
Coefficients of a power-law transformed Burr distribution
burr_plt(shape1 = 2, shape2 = 1, scale = 0.5, a = 1, b = 1, inv = FALSE)
burr_plt(shape1 = 2, shape2 = 1, scale = 0.5, a = 1, b = 1, inv = FALSE)
shape1 , shape2 , scale
|
Shape1, shape2 and scale of the Burr distribution, defaults to 2, 1 and 1 respectively. |
a , b
|
constant and power of power-law transformation, defaults to 1 and 1 respectively. |
inv |
logical indicating whether coefficients of the outcome variable of the power-law transformation should be returned (FALSE) or whether coefficients of the input variable being power-law transformed should be returned (TRUE). Defaults to FALSE. |
If the random variable x is Burr distributed with scale shape and shape scale, then the power-law transformed variable
is Burr distributed with shape1 , shape2
and scale
.
Returns a named list containing
Named vector of coefficients
## Comparing probabilites of power-law transformed transformed variables pburr(3,shape1=2,shape2=3,scale=1) coeff = burr_plt(shape1=2,shape2=3,scale=1,a=5,b=7)$coefficients pburr(5*3^7,shape1=coeff[["shape1"]],shape2=coeff[["shape2"]],scale=coeff[["scale"]])
pburr(5*0.9^7,shape1=2,shape2=3,scale=1) coeff = burr_plt(shape1=2,shape2=3,scale=1,a=5,b=7, inv=TRUE)$coefficients pburr(0.9,shape1=coeff[["shape1"]],shape2=coeff[["shape2"]],scale=coeff[["scale"]])
## Comparing the first moments and sample means of power-law transformed variables for large enough samples x = rburr(1e5,shape1=2,shape2=3,scale=1) coeff = burr_plt(shape1=2,shape2=3,scale=1,a=2,b=0.5)$coefficients y = rburr(1e5,shape1=coeff[["shape1"]],shape2=coeff[["shape2"]],scale=coeff[["scale"]]) mean(2*x^0.5) mean(y) mburr(r=1,shape1=coeff[["shape1"]],shape2=coeff[["shape2"]],scale=coeff[["scale"]],lower.tail=FALSE)
This method determines the optimal scale parameter of the Inverse Pareto distribution using the iterative method (Clauset et al. 2009) that minimizes the Kolmogorov-Smirnov distance.
clauset.xmax(x, q = 1)
clauset.xmax(x, q = 1)
x |
data vector |
q |
Percentage of data to search over (starting from the smallest values) |
Returns a named list containing a
Named vector of coefficients
Minimum Kolmogorov-Smirnov distance
Number of observations in the Inverse Pareto tail
Evolution of the Inverse Pareto shape parameter over the iterations
Clauset A, Shalizi CR, Newman ME (2009). “Power-law distributions in empirical data.” SIAM review, 51(4), 661–703.
## Determine cuttof from compostie InvPareto-Lognormal distribution using Clauset's method dist <- c("invpareto", "lnorm") coeff <- c(coeff1.k = 1.5, coeff2.meanlog = 1, coeff2.sdlog = 0.5) x <- rcomposite(1e3, dist = dist, coeff = coeff) out <- clauset.xmax(x = x) out$coefficients coeffcomposite(dist = dist, coeff = coeff, startc = c(1, 1))$coeff1 ## Speed up method by considering values above certain quantile only dist <- c("invpareto", "lnorm") coeff <- c(coeff1.k = 1.5, coeff2.meanlog = 1, coeff2.sdlog = 0.5) x <- rcomposite(1e3, dist = dist, coeff = coeff) out <- clauset.xmax(x = x, q = 0.5) out$coefficients coeffcomposite(dist = dist, coeff = coeff, startc = c(1, 1))$coeff1
## Determine cuttof from compostie InvPareto-Lognormal distribution using Clauset's method dist <- c("invpareto", "lnorm") coeff <- c(coeff1.k = 1.5, coeff2.meanlog = 1, coeff2.sdlog = 0.5) x <- rcomposite(1e3, dist = dist, coeff = coeff) out <- clauset.xmax(x = x) out$coefficients coeffcomposite(dist = dist, coeff = coeff, startc = c(1, 1))$coeff1 ## Speed up method by considering values above certain quantile only dist <- c("invpareto", "lnorm") coeff <- c(coeff1.k = 1.5, coeff2.meanlog = 1, coeff2.sdlog = 0.5) x <- rcomposite(1e3, dist = dist, coeff = coeff) out <- clauset.xmax(x = x, q = 0.5) out$coefficients coeffcomposite(dist = dist, coeff = coeff, startc = c(1, 1))$coeff1
This method determines the optimal scale parameter of the Pareto distribution using the iterative method (Clauset et al. 2009)that minimizes the Kolmogorov-Smirnov distance.
clauset.xmin(x, q = 0)
clauset.xmin(x, q = 0)
x |
data vector |
q |
Percentage of data to search over (starting from the largest values) |
Returns a named list containing a
Named vector of coefficients
Minimum Kolmogorov-Smirnov distance
Number of observations in the Pareto tail
Evolution of the Pareto shape parameter over the iterations
Clauset A, Shalizi CR, Newman ME (2009). “Power-law distributions in empirical data.” SIAM review, 51(4), 661–703.
## Determine cuttof from compostie lognormal-Pareto distribution using Clauset's method dist <- c("lnorm", "pareto") coeff <- c(coeff1.meanlog = -0.5, coeff1.sdlog = 0.5, coeff2.k = 1.5) x <- rcomposite(1e3, dist = dist, coeff = coeff) out <- clauset.xmin(x = x) out$coefficients coeffcomposite(dist = dist, coeff = coeff, startc = c(1, 1))$coeff2 ## Speed up method by considering values above certain quantile only dist <- c("lnorm", "pareto") coeff <- c(coeff1.meanlog = -0.5, coeff1.sdlog = 0.5, coeff2.k = 1.5) x <- rcomposite(1e3, dist = dist, coeff = coeff) out <- clauset.xmin(x = x, q = 0.5) out$coefficients coeffcomposite(dist = dist, coeff = coeff, startc = c(1, 1))$coeff2
## Determine cuttof from compostie lognormal-Pareto distribution using Clauset's method dist <- c("lnorm", "pareto") coeff <- c(coeff1.meanlog = -0.5, coeff1.sdlog = 0.5, coeff2.k = 1.5) x <- rcomposite(1e3, dist = dist, coeff = coeff) out <- clauset.xmin(x = x) out$coefficients coeffcomposite(dist = dist, coeff = coeff, startc = c(1, 1))$coeff2 ## Speed up method by considering values above certain quantile only dist <- c("lnorm", "pareto") coeff <- c(coeff1.meanlog = -0.5, coeff1.sdlog = 0.5, coeff2.k = 1.5) x <- rcomposite(1e3, dist = dist, coeff = coeff) out <- clauset.xmin(x = x, q = 0.5) out$coefficients coeffcomposite(dist = dist, coeff = coeff, startc = c(1, 1))$coeff2
Determines the weights and cutoffs of the three-composite distribution numerically applying te continuity- and differentiability condition.
coeffcomposite(dist, coeff, startc = c(1, 1))
coeffcomposite(dist, coeff, startc = c(1, 1))
dist |
character vector denoting the distribution of the first-, second- (and third) component respectively. If only two components are provided, the distribution reduces to the two-component distribution. |
coeff |
named numeric vector holding the coefficients of the first-, second- (and third) component, predeced by coeff1., coeff2. (and coeff3.), respectively. Coefficients for the last component do not have to be provided for the two-component distribution and will be disregarded. |
startc |
starting values for the lower and upper cutoff, defaults to c(1,1). |
The continuity condition implies
The differentiability condition implies
Returns a named list containing a the separate distributions and their respective coefficients, as well as the cutoffs and weights of the composite distribution
# Three-composite distribution dist <- c("invpareto", "lnorm", "pareto") coeff <- c(coeff1.k = 1, coeff2.meanlog = -0.5, coeff2.sdlog = 0.5, coeff3.k = 1) coeffcomposite(dist = dist, coeff = coeff, startc = c(1, 1)) # Two-composite distribution dist <- c("lnorm", "pareto") coeff <- c(coeff1.meanlog = -0.5, coeff1.sdlog = 0.5, coeff2.k = 1.5) coeffcomposite(dist = dist, coeff = coeff, startc = c(1, 1)) dist <- c("invpareto", "lnorm") coeff <- c(coeff1.k = 1.5, coeff2.meanlog = 2, coeff2.sdlog = 0.5) coeffcomposite(dist = dist, coeff = coeff, startc = c(1, 1)) #'
# Three-composite distribution dist <- c("invpareto", "lnorm", "pareto") coeff <- c(coeff1.k = 1, coeff2.meanlog = -0.5, coeff2.sdlog = 0.5, coeff3.k = 1) coeffcomposite(dist = dist, coeff = coeff, startc = c(1, 1)) # Two-composite distribution dist <- c("lnorm", "pareto") coeff <- c(coeff1.meanlog = -0.5, coeff1.sdlog = 0.5, coeff2.k = 1.5) coeffcomposite(dist = dist, coeff = coeff, startc = c(1, 1)) dist <- c("invpareto", "lnorm") coeff <- c(coeff1.k = 1.5, coeff2.meanlog = 2, coeff2.sdlog = 0.5) coeffcomposite(dist = dist, coeff = coeff, startc = c(1, 1)) #'
Density, distribution function, quantile function, raw moments and random generation for combined (empirical, single, composite and finite mixture) truncated or complete distributions.
dcombdist( x, dist, prior = c(1), coeff, log = FALSE, compress = TRUE, lowertrunc = 0, uppertrunc = Inf ) pcombdist( q, dist, prior = 1, coeff, log.p = FALSE, lower.tail = TRUE, compress = TRUE, lowertrunc = NULL, uppertrunc = NULL ) qcombdist(p, dist, prior, coeff, log.p = FALSE, lower.tail = TRUE) mcombdist( r, truncation = NULL, dist, prior = 1, coeff, lower.tail = TRUE, compress = TRUE, uppertrunc = 0, lowertrunc = Inf ) rcombdist(n, dist, prior, coeff, uppertrunc = NULL, lowertrunc = NULL)
dcombdist( x, dist, prior = c(1), coeff, log = FALSE, compress = TRUE, lowertrunc = 0, uppertrunc = Inf ) pcombdist( q, dist, prior = 1, coeff, log.p = FALSE, lower.tail = TRUE, compress = TRUE, lowertrunc = NULL, uppertrunc = NULL ) qcombdist(p, dist, prior, coeff, log.p = FALSE, lower.tail = TRUE) mcombdist( r, truncation = NULL, dist, prior = 1, coeff, lower.tail = TRUE, compress = TRUE, uppertrunc = 0, lowertrunc = Inf ) rcombdist(n, dist, prior, coeff, uppertrunc = NULL, lowertrunc = NULL)
x , q
|
vector of quantiles |
dist |
character vector denoting the distribution(s). |
prior |
Numeric vector of prior coefficients, defaults to single vector with value one. |
coeff |
list of parameters for the distribution(s). |
log , log.p
|
logical; if TRUE, probabilities p are given as log(p). |
compress |
Logical indicating whether return values from individual densities of finite mixtures should be gathered or not, defaults to TRUE. |
lowertrunc , uppertrunc
|
lowertrunc- and uppertrunc truncation points, defaults to 0 and Inf respectively |
lower.tail |
logical; if TRUE (default), probabilities (moments) are |
p |
vector of probabilities |
r |
rth raw moment of the Pareto distribution |
truncation |
lower truncation parameter |
n |
number of observations |
dcombdist gives the density, pcombdist gives the distribution function, qcombdist gives the quantile function, mcombdist gives the rth moment of the distribution and rcombdist generates random deviates.
The length of the result is determined by n for rcombdist, and is the maximum of the lengths of the numerical arguments for the other functions.
# Load necessary tools data("fit_US_cities") library(tidyverse) x <- rcombdist( n = 25359, dist = "lnorm", prior = subset(fit_US_cities, (dist == "lnorm" & components == 5))$prior[[1]], coeff = subset(fit_US_cities, (dist == "lnorm" & components == 5))$coefficients[[1]] ) # Generate data from one of the fitted functions # Evaluate functioning of dcomdist by calculating log likelihood for all distributions loglike <- fit_US_cities %>% group_by(dist, components, np, n) %>% do(loglike = sum(dcombdist(dist = .[["dist"]], x = sort(x), prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]], log = TRUE))) %>% unnest(cols = loglike) %>% mutate(NLL = -loglike, AIC = 2 * np - 2 * (loglike), BIC = log(n) * np - 2 * (loglike)) %>% arrange(NLL) # Evaluate functioning of mcombdist and pcombdist by calculating NMAD #(equivalent to the Kolmogorov-Smirnov test statistic for the zeroth moment #of the distribution) for all distributions nmad <- fit_US_cities %>% group_by(dist, components, np, n) %>% do( KS = max(abs(pempirical(q = sort(x), data = x) - pcombdist(dist = .[["dist"]], q = sort(x), prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]]))), nmad_0 = nmad_test(r = 0, dist = .[["dist"]], x = sort(x), prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]], stat = "max"), nmad_1 = nmad_test(r = 1, dist = .[["dist"]], x = sort(x), prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]], stat = "max") ) %>% unnest(cols = c(KS, nmad_0, nmad_1)) %>% arrange(nmad_0) # Evaluate functioning of qcombdist pcombdist by calculating NMAD (equivalent to the Kolmogorov- #Smirnov test statistic for the zeroth moment of the distribution) for all distributions test <- fit_US_cities %>% group_by(dist, components, np, n) %>% do(out = qcombdist(pcombdist(2, dist = .[["dist"]], prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]], log.p = TRUE), dist = .[["dist"]], prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]], log.p = TRUE )) %>% unnest(cols = c(out))
# Load necessary tools data("fit_US_cities") library(tidyverse) x <- rcombdist( n = 25359, dist = "lnorm", prior = subset(fit_US_cities, (dist == "lnorm" & components == 5))$prior[[1]], coeff = subset(fit_US_cities, (dist == "lnorm" & components == 5))$coefficients[[1]] ) # Generate data from one of the fitted functions # Evaluate functioning of dcomdist by calculating log likelihood for all distributions loglike <- fit_US_cities %>% group_by(dist, components, np, n) %>% do(loglike = sum(dcombdist(dist = .[["dist"]], x = sort(x), prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]], log = TRUE))) %>% unnest(cols = loglike) %>% mutate(NLL = -loglike, AIC = 2 * np - 2 * (loglike), BIC = log(n) * np - 2 * (loglike)) %>% arrange(NLL) # Evaluate functioning of mcombdist and pcombdist by calculating NMAD #(equivalent to the Kolmogorov-Smirnov test statistic for the zeroth moment #of the distribution) for all distributions nmad <- fit_US_cities %>% group_by(dist, components, np, n) %>% do( KS = max(abs(pempirical(q = sort(x), data = x) - pcombdist(dist = .[["dist"]], q = sort(x), prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]]))), nmad_0 = nmad_test(r = 0, dist = .[["dist"]], x = sort(x), prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]], stat = "max"), nmad_1 = nmad_test(r = 1, dist = .[["dist"]], x = sort(x), prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]], stat = "max") ) %>% unnest(cols = c(KS, nmad_0, nmad_1)) %>% arrange(nmad_0) # Evaluate functioning of qcombdist pcombdist by calculating NMAD (equivalent to the Kolmogorov- #Smirnov test statistic for the zeroth moment of the distribution) for all distributions test <- fit_US_cities %>% group_by(dist, components, np, n) %>% do(out = qcombdist(pcombdist(2, dist = .[["dist"]], prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]], log.p = TRUE), dist = .[["dist"]], prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]], log.p = TRUE )) %>% unnest(cols = c(out))
Coefficients of a power-law transformed combined distribution
combdist_plt( dist, prior = NULL, coeff, a = 1, b = 1, inv = FALSE, nested = FALSE )
combdist_plt( dist, prior = NULL, coeff, a = 1, b = 1, inv = FALSE, nested = FALSE )
dist |
character vector denoting the distribution(s). |
prior |
Numeric vector of prior coefficients, defaults to single vector with value one. |
coeff |
list of parameters for the distribution(s). |
a , b
|
constant and power of power-law transformation, defaults to 1 and 1 respectively. |
inv |
logical indicating whether coefficients of the outcome variable of the power-law transformation should be returned (FALSE) or whether coefficients of the input variable being power-law transformed should be returned (TRUE). Defaults to FALSE. |
nested |
logical indicating whether results should be returned in a nested list or flat list, defaults to FALSE. |
Returns a nested or flat list containing
Named vector of coefficients
# Load necessary tools data("fit_US_cities") library(tidyverse) ## Comparing probabilites of power-law transformed transformed variables prob <- fit_US_cities %>% filter(!(dist %in% c( "exp", "invpareto_exp_pareto", "exp_pareto", "invpareto_exp", "gamma", "invpareto_gamma_pareto", "gamma_pareto", "invpareto_gamma" ))) %>% group_by(dist, components, np, n) %>% do(prob = pcombdist(q = 1.1, dist = .[["dist"]], prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]])) %>% unnest(cols = c(prob)) fit_US_cities_plt <- fit_US_cities %>% filter(!(dist %in% c( "exp", "invpareto_exp_pareto", "exp_pareto", "invpareto_exp", "gamma", "invpareto_gamma_pareto", "gamma_pareto", "invpareto_gamma" ))) %>% group_by(dist, components, np, n, convergence) %>% do(results = as_tibble(combdist_plt(dist = .[["dist"]], prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]], a = 2, b = 0.5, nested = TRUE))) %>% unnest(cols = c(results)) prob$prob_plt <- fit_US_cities_plt %>% group_by(dist, components, np, n) %>% do(prob_plt = pcombdist(q = 2 * 1.1^0.5, dist = .[["dist"]], prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]])) %>% unnest(cols = c(prob_plt)) %>% .$prob_plt prob <- prob %>% mutate(check = abs(prob - prob_plt)) prob <- fit_US_cities %>% filter(!(dist %in% c( "exp", "invpareto_exp_pareto", "exp_pareto", "invpareto_exp", "gamma", "invpareto_gamma_pareto", "gamma_pareto", "invpareto_gamma" ))) %>% group_by(dist, components, np, n) %>% do(prob = pcombdist(q = 2 * 1.1^0.5, dist = .[["dist"]], prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]])) %>% unnest(cols = c(prob)) fit_US_cities_plt <- fit_US_cities %>% filter(!(dist %in% c( "exp", "invpareto_exp_pareto", "exp_pareto", "invpareto_exp", "gamma", "invpareto_gamma_pareto", "gamma_pareto", "invpareto_gamma" ))) %>% group_by(dist, components, np, n, convergence) %>% do(results = as_tibble(combdist_plt(dist = .[["dist"]], prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]], a = 2, b = 0.5, nested = TRUE, inv = TRUE))) %>% unnest(cols = c(results)) prob$prob_plt <- fit_US_cities_plt %>% group_by(dist, components, np, n) %>% do(prob_plt = pcombdist(q = 1.1, dist = .[["dist"]], prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]])) %>% unnest(cols = c(prob_plt)) %>% .$prob_plt prob <- prob %>% mutate(check = abs(prob - prob_plt))
# Load necessary tools data("fit_US_cities") library(tidyverse) ## Comparing probabilites of power-law transformed transformed variables prob <- fit_US_cities %>% filter(!(dist %in% c( "exp", "invpareto_exp_pareto", "exp_pareto", "invpareto_exp", "gamma", "invpareto_gamma_pareto", "gamma_pareto", "invpareto_gamma" ))) %>% group_by(dist, components, np, n) %>% do(prob = pcombdist(q = 1.1, dist = .[["dist"]], prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]])) %>% unnest(cols = c(prob)) fit_US_cities_plt <- fit_US_cities %>% filter(!(dist %in% c( "exp", "invpareto_exp_pareto", "exp_pareto", "invpareto_exp", "gamma", "invpareto_gamma_pareto", "gamma_pareto", "invpareto_gamma" ))) %>% group_by(dist, components, np, n, convergence) %>% do(results = as_tibble(combdist_plt(dist = .[["dist"]], prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]], a = 2, b = 0.5, nested = TRUE))) %>% unnest(cols = c(results)) prob$prob_plt <- fit_US_cities_plt %>% group_by(dist, components, np, n) %>% do(prob_plt = pcombdist(q = 2 * 1.1^0.5, dist = .[["dist"]], prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]])) %>% unnest(cols = c(prob_plt)) %>% .$prob_plt prob <- prob %>% mutate(check = abs(prob - prob_plt)) prob <- fit_US_cities %>% filter(!(dist %in% c( "exp", "invpareto_exp_pareto", "exp_pareto", "invpareto_exp", "gamma", "invpareto_gamma_pareto", "gamma_pareto", "invpareto_gamma" ))) %>% group_by(dist, components, np, n) %>% do(prob = pcombdist(q = 2 * 1.1^0.5, dist = .[["dist"]], prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]])) %>% unnest(cols = c(prob)) fit_US_cities_plt <- fit_US_cities %>% filter(!(dist %in% c( "exp", "invpareto_exp_pareto", "exp_pareto", "invpareto_exp", "gamma", "invpareto_gamma_pareto", "gamma_pareto", "invpareto_gamma" ))) %>% group_by(dist, components, np, n, convergence) %>% do(results = as_tibble(combdist_plt(dist = .[["dist"]], prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]], a = 2, b = 0.5, nested = TRUE, inv = TRUE))) %>% unnest(cols = c(results)) prob$prob_plt <- fit_US_cities_plt %>% group_by(dist, components, np, n) %>% do(prob_plt = pcombdist(q = 1.1, dist = .[["dist"]], prior = .[["prior"]][[1]], coeff = .[["coefficients"]][[1]])) %>% unnest(cols = c(prob_plt)) %>% .$prob_plt prob <- prob %>% mutate(check = abs(prob - prob_plt))
Maximum Likelihood estimation for combined ( single, composite and finite mixture) truncated or complete distributions.
combdist.mle( x, dist, start = NULL, lower = NULL, upper = NULL, components = 1, nested = FALSE, steps = 1, lowertrunc = 0, uppertrunc = Inf, ... )
combdist.mle( x, dist, start = NULL, lower = NULL, upper = NULL, components = 1, nested = FALSE, steps = 1, lowertrunc = 0, uppertrunc = Inf, ... )
x |
data vector |
dist |
character vector denoting the distribution(s). |
start |
named numeric vector holding the starting values for the coefficients. |
lower , upper
|
Lower and upper bounds to the estimated coefficients, defaults to -Inf and Inf respectively. |
components |
number of components for a mixture distribution. |
nested |
logical indicating whether results should be returned in a nested list or a flat list form, defaults to FALSE. |
steps |
number of steps taken in stepflexmix, defaults to 1. |
lowertrunc , uppertrunc
|
lowertrunc- and uppertrunc truncation points, defaults to 0 and Inf respectively |
... |
Additional arguments. |
Returns a named list containing a
Character vector denoting the distributions, separated by an underscore
Nr. of combined distributions
Weights assigned to the respective component distributions
Named vector of coefficients
logical indicator of convergence
Length of the fitted data vector
Nr. of coefficients
x <- rdoubleparetolognormal(1e3) combdist.mle(x = x, dist = "doubleparetolognormal") # Double-Pareto Lognormal combdist.mle(x = x, components = 2, dist = "lnorm", steps = 20) # FMM with 2 components combdist.mle( x = x, dist = c("invpareto", "lnorm", "pareto"), start = c(coeff1.k = 1, coeff2.meanlog = mean(log(x)), coeff2.sdlog = sd(log(x)), coeff3.k = 1), lower = c(1e-10, -Inf, 1e-10, 1e-10), upper = c(Inf, Inf, Inf, Inf), nested = TRUE) # composite distribution
x <- rdoubleparetolognormal(1e3) combdist.mle(x = x, dist = "doubleparetolognormal") # Double-Pareto Lognormal combdist.mle(x = x, components = 2, dist = "lnorm", steps = 20) # FMM with 2 components combdist.mle( x = x, dist = c("invpareto", "lnorm", "pareto"), start = c(coeff1.k = 1, coeff2.meanlog = mean(log(x)), coeff2.sdlog = sd(log(x)), coeff3.k = 1), lower = c(1e-10, -Inf, 1e-10, 1e-10), upper = c(Inf, Inf, Inf, Inf), nested = TRUE) # composite distribution
Density, distribution function, quantile function, raw moments and random generation for the two- or three-composite distribution.
dcomposite(x, dist, coeff, startc = c(1, 1), log = FALSE) pcomposite(q, dist, coeff, startc = c(1, 1), log.p = FALSE, lower.tail = TRUE) qcomposite(p, dist, coeff, startc = c(1, 1), log.p = FALSE, lower.tail = TRUE) mcomposite( r = 0, truncation = 0, dist, coeff, startc = c(1, 1), lower.tail = TRUE ) rcomposite(n, dist, coeff, startc = c(1, 1))
dcomposite(x, dist, coeff, startc = c(1, 1), log = FALSE) pcomposite(q, dist, coeff, startc = c(1, 1), log.p = FALSE, lower.tail = TRUE) qcomposite(p, dist, coeff, startc = c(1, 1), log.p = FALSE, lower.tail = TRUE) mcomposite( r = 0, truncation = 0, dist, coeff, startc = c(1, 1), lower.tail = TRUE ) rcomposite(n, dist, coeff, startc = c(1, 1))
x , q
|
vector of quantiles |
dist |
character vector denoting the distribution of the first-, second- (and third) component respectively. If only two components are provided, the distribution reduces to the two-component distribution. |
coeff |
named numeric vector holding the coefficients of the first-, second- (and third) component, predeced by coeff1., coeff2. (and coeff3.), respectively. Coefficients for the last component do not have to be provided for the two-component distribution and will be disregarded. |
startc |
starting values for the lower and upper cutoff, defaults to c(1,1). |
log , log.p
|
logical; if TRUE, probabilities p are given as log(p). |
lower.tail |
logical; if TRUE (default), probabilities (moments) are |
p |
vector of probabilities |
r |
rth raw moment of the Pareto distribution |
truncation |
lower truncation parameter |
n |
number of observations |
These derivations are based on the two-composite distribution proposed by (Bakar et al. 2015). Probability Distribution Function:
Cumulative Distribution Function:
Quantile function
The lower y-bounded r-th raw moment of the distribution equals
dcomposite returns the density, pcomposite the distribution function, qcomposite the quantile function, mcomposite the rth moment of the distribution and rcomposite generates random deviates.
The length of the result is determined by n for rcomposite, and is the maximum of the lengths of the numerical arguments for the other functions.
Bakar SA, Hamzah NA, Maghsoudi M, Nadarajah S (2015). “Modeling loss data using composite models.” Insurance: Mathematics and Economics, 61, 146–154.
#' ## Three-component distribution dist <- c("invpareto", "lnorm", "pareto") coeff <- c(coeff2.meanlog = -0.5, coeff2.sdlog = 0.5, coeff3.k = 1.5, coeff1.k = 1.5) # Compare density with the Double-Pareto Lognormal distribution plot(x = seq(0, 5, length.out = 1e3), y = dcomposite(x = seq(0, 5, length.out = 1e3), dist = dist, coeff = coeff)) lines(x = seq(0, 5, length.out = 1e3), y = ddoubleparetolognormal(x = seq(0, 5, length.out = 1e3))) # Demonstration of log functionality for probability and quantile function qcomposite(pcomposite(2, dist = dist, coeff = coeff, log.p = TRUE), dist = dist, coeff = coeff, log.p = TRUE) # The zeroth truncated moment is equivalent to the probability function pcomposite(2, dist = dist, coeff = coeff) mcomposite(truncation = 2, dist = dist, coeff = coeff) # The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. coeff <- c(coeff2.meanlog = -0.5, coeff2.sdlog = 0.5, coeff3.k = 3, coeff1.k = 1.5) x <- rcomposite(1e5, dist = dist, coeff = coeff) mean(x) mcomposite(r = 1, lower.tail = FALSE, dist = dist, coeff = coeff) sum(x[x > quantile(x, 0.1)]) / length(x) mcomposite(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE, dist = dist, coeff = coeff) ## Two-component distribution dist <- c("lnorm", "pareto") coeff <- coeff <- c(coeff2.k = 1.5, coeff1.meanlog = -0.5, coeff1.sdlog = 0.5) # Compare density with the Right-Pareto Lognormal distribution plot(x = seq(0, 5, length.out = 1e3), y = dcomposite(x = seq(0, 5, length.out = 1e3), dist = dist, coeff = coeff)) lines(x = seq(0, 5, length.out = 1e3), y = drightparetolognormal(x = seq(0, 5, length.out = 1e3))) # Demonstration of log functionality for probability and quantile function qcomposite(pcomposite(2, dist = dist, coeff = coeff, log.p = TRUE), dist = dist, coeff = coeff, log.p = TRUE) # The zeroth truncated moment is equivalent to the probability function pcomposite(2, dist = dist, coeff = coeff) mcomposite(truncation = 2, dist = dist, coeff = coeff) # The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. coeff <- c(coeff1.meanlog = -0.5, coeff1.sdlog = 0.5, coeff2.k = 3) x <- rcomposite(1e5, dist = dist, coeff = coeff) mean(x) mcomposite(r = 1, lower.tail = FALSE, dist = dist, coeff = coeff) sum(x[x > quantile(x, 0.1)]) / length(x) mcomposite(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE, dist = dist, coeff = coeff)
#' ## Three-component distribution dist <- c("invpareto", "lnorm", "pareto") coeff <- c(coeff2.meanlog = -0.5, coeff2.sdlog = 0.5, coeff3.k = 1.5, coeff1.k = 1.5) # Compare density with the Double-Pareto Lognormal distribution plot(x = seq(0, 5, length.out = 1e3), y = dcomposite(x = seq(0, 5, length.out = 1e3), dist = dist, coeff = coeff)) lines(x = seq(0, 5, length.out = 1e3), y = ddoubleparetolognormal(x = seq(0, 5, length.out = 1e3))) # Demonstration of log functionality for probability and quantile function qcomposite(pcomposite(2, dist = dist, coeff = coeff, log.p = TRUE), dist = dist, coeff = coeff, log.p = TRUE) # The zeroth truncated moment is equivalent to the probability function pcomposite(2, dist = dist, coeff = coeff) mcomposite(truncation = 2, dist = dist, coeff = coeff) # The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. coeff <- c(coeff2.meanlog = -0.5, coeff2.sdlog = 0.5, coeff3.k = 3, coeff1.k = 1.5) x <- rcomposite(1e5, dist = dist, coeff = coeff) mean(x) mcomposite(r = 1, lower.tail = FALSE, dist = dist, coeff = coeff) sum(x[x > quantile(x, 0.1)]) / length(x) mcomposite(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE, dist = dist, coeff = coeff) ## Two-component distribution dist <- c("lnorm", "pareto") coeff <- coeff <- c(coeff2.k = 1.5, coeff1.meanlog = -0.5, coeff1.sdlog = 0.5) # Compare density with the Right-Pareto Lognormal distribution plot(x = seq(0, 5, length.out = 1e3), y = dcomposite(x = seq(0, 5, length.out = 1e3), dist = dist, coeff = coeff)) lines(x = seq(0, 5, length.out = 1e3), y = drightparetolognormal(x = seq(0, 5, length.out = 1e3))) # Demonstration of log functionality for probability and quantile function qcomposite(pcomposite(2, dist = dist, coeff = coeff, log.p = TRUE), dist = dist, coeff = coeff, log.p = TRUE) # The zeroth truncated moment is equivalent to the probability function pcomposite(2, dist = dist, coeff = coeff) mcomposite(truncation = 2, dist = dist, coeff = coeff) # The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. coeff <- c(coeff1.meanlog = -0.5, coeff1.sdlog = 0.5, coeff2.k = 3) x <- rcomposite(1e5, dist = dist, coeff = coeff) mean(x) mcomposite(r = 1, lower.tail = FALSE, dist = dist, coeff = coeff) sum(x[x > quantile(x, 0.1)]) / length(x) mcomposite(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE, dist = dist, coeff = coeff)
Coefficients of a power-law transformed composite distribution
composite_plt(dist, coeff, a = 1, b = 1, inv = FALSE)
composite_plt(dist, coeff, a = 1, b = 1, inv = FALSE)
dist |
character vector denoting the distribution of the first-, second- (and third) component respectively. If only two components are provided, the distribution reduces to the two-component distribution. |
coeff |
named numeric vector holding the coefficients of the first-, second- (and third) component, predeced by coeff1., coeff2. (and coeff3.), respectively. Coefficients for the last component do not have to be provided for the two-component distribution and will be disregarded. |
a , b
|
constant and power of power-law transformation, defaults to 1 and 1 respectively. |
inv |
logical indicating whether coefficients of the outcome variable of the power-law transformation should be returned (FALSE) or whether coefficients of the input variable being power-law transformed should be returned (TRUE). Defaults to FALSE. |
Returns a named list containing
Named vector of coefficients
## Comparing probabilites of power-law transformed transformed variables dist <- c("invpareto", "lnorm", "pareto") coeff <- c(coeff2.meanlog = -0.5, coeff2.sdlog = 0.5, coeff3.k = 1.5, coeff1.k = 1.5)
pcomposite(3,dist=dist,coeff=coeff) newcoeff = composite_plt(dist=dist,coeff=coeff,a=5,b=7)$coefficients pcomposite(5*3^7,dist=dist,coeff=newcoeff)
pcomposite(5*0.9^3,dist=dist,coeff=coeff) newcoeff = composite_plt(dist=dist,coeff=coeff,a=5,b=3,inv=TRUE)$coefficients pcomposite(0.9,dist=dist,coeff=newcoeff)
Maximum likelihood estimation of the parameters of the two-/three- composite distribution
composite.mle(x, dist, start, lower = NULL, upper = NULL)
composite.mle(x, dist, start, lower = NULL, upper = NULL)
x |
data vector |
dist |
character vector denoting the distribution of the first-, second- (and third) component respectively. If only two components are provided, the distribution reduces to the two-component distribution. |
start |
named numeric vector holding the coefficients of the first-, second- (and third) component, predeced by coeff1., coeff2. (and coeff3.), respectively. Coefficients for the last component do not have to be provided for the two-component distribution and will be disregarded. |
lower , upper
|
Lower and upper bounds to the estimated coefficients, defaults to -Inf and Inf respectively. |
Returns a named list containing a
Named vector of coefficients
logical indicator of convergence
Cutoffs of the composite distribution
Length of the fitted data vector
Nr. of coefficients
Nr. of components
dist <- c("invpareto", "lnorm", "pareto") coeff <- c( coeff1.k = 1.5, coeff2.meanlog = -0.5, coeff2.sdlog = 0.5, coeff3.k = 1.5 ) lower <- c(1e-10, -Inf, 1e-10, 1e-10) upper <- c(Inf, Inf, Inf, Inf) x <- rcomposite(1e3, dist = dist, coeff = coeff) composite.mle(x = x, dist = dist, start = coeff + 0.2, lower = lower, upper = upper) #'
dist <- c("invpareto", "lnorm", "pareto") coeff <- c( coeff1.k = 1.5, coeff2.meanlog = -0.5, coeff2.sdlog = 0.5, coeff3.k = 1.5 ) lower <- c(1e-10, -Inf, 1e-10, 1e-10) upper <- c(Inf, Inf, Inf, Inf) x <- rcomposite(1e3, dist = dist, coeff = coeff) composite.mle(x = x, dist = dist, start = coeff + 0.2, lower = lower, upper = upper) #'
Density, distribution function, quantile function and random generation for the Double-Pareto Lognormal distribution.
ddoubleparetolognormal( x, shape1 = 1.5, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5, log = FALSE ) pdoubleparetolognormal( q, shape1 = 1.5, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE, log.p = FALSE ) qdoubleparetolognormal( p, shape1 = 1.5, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE, log.p = FALSE ) mdoubleparetolognormal( r = 0, truncation = 0, shape1 = 1.5, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE ) rdoubleparetolognormal( n, shape1 = 1.5, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5 )
ddoubleparetolognormal( x, shape1 = 1.5, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5, log = FALSE ) pdoubleparetolognormal( q, shape1 = 1.5, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE, log.p = FALSE ) qdoubleparetolognormal( p, shape1 = 1.5, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE, log.p = FALSE ) mdoubleparetolognormal( r = 0, truncation = 0, shape1 = 1.5, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE ) rdoubleparetolognormal( n, shape1 = 1.5, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5 )
x , q
|
vector of quantiles |
shape1 , shape2 , meanlog , sdlog
|
Shapes, mean and variance of the Double-Pareto Lognormal distribution respectively, defaults to shape1=1.5, shape2=1.5, meanlog=-0.5, sdlog=0.5. |
log , log.p
|
logical; if TRUE, probabilities p are given as log(p). |
lower.tail |
logical; if TRUE (default), probabilities (moments) are |
p |
vector of probabilities |
r |
rth raw moment of the Pareto distribution |
truncation |
lower truncation parameter, defaults to xmin |
n |
number of observations |
Probability and Cumulative Distribution Function as provided by (Reed and Jorgensen 2004):
The y-bounded r-th raw moment of the Double-Pareto Lognormal distribution equals:
ddoubleparetolognormal returns the density, pdoubleparetolognormal the distribution function, qdoubleparetolognormal the quantile function, mdoubleparetolognormal the rth moment of the distribution and rdoubleparetolognormal generates random deviates.
The length of the result is determined by n for rdoubleparetolognormal, and is the maximum of the lengths of the numerical arguments for the other functions.
Reed WJ, Jorgensen M (2004). “The Double Pareto-Lognormal Distribution–A New Parametric Model for Size Distributions.” Communications in Statistics - Theory and Methods, 33(8), 1733–1753.
## Double-Pareto Lognormal density plot(x = seq(0, 5, length.out = 100), y = ddoubleparetolognormal(x = seq(0, 5, length.out = 100))) plot(x = seq(0, 5, length.out = 100), y = ddoubleparetolognormal(x = seq(0, 5, length.out = 100), shape2 = 1)) ## Double-Pareto Lognormal relates to the right-pareto Lognormal distribution if #shape1 goes to infinity pdoubleparetolognormal(q = 6, shape1 = 1e20, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5) prightparetolognormal(q = 6, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5) ## Double-Pareto Lognormal relates to the left-pareto Lognormal distribution if # shape2 goes to infinity pdoubleparetolognormal(q = 6, shape1 = 1.5, shape2 = 1e20, meanlog = -0.5, sdlog = 0.5) pleftparetolognormal(q = 6, shape1 = 1.5, meanlog = -0.5, sdlog = 0.5) ## Double-Pareto Lognormal relates to the Lognormal if both shape parameters go to infinity pdoubleparetolognormal(q = 6, shape1 = 1e20, shape2 = 1e20, meanlog = -0.5, sdlog = 0.5) plnorm(q = 6, meanlog = -0.5, sdlog = 0.5) ## Demonstration of log functionality for probability and quantile function qdoubleparetolognormal(pdoubleparetolognormal(2, log.p = TRUE), log.p = TRUE) ## The zeroth truncated moment is equivalent to the probability function pdoubleparetolognormal(2) mdoubleparetolognormal(truncation = 2) ## The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. x <- rdoubleparetolognormal(1e5, shape2 = 3) mean(x) mdoubleparetolognormal(r = 1, shape2 = 3, lower.tail = FALSE) sum(x[x > quantile(x, 0.1)]) / length(x) mdoubleparetolognormal(r = 1, shape2 = 3, truncation = quantile(x, 0.1), lower.tail = FALSE)
## Double-Pareto Lognormal density plot(x = seq(0, 5, length.out = 100), y = ddoubleparetolognormal(x = seq(0, 5, length.out = 100))) plot(x = seq(0, 5, length.out = 100), y = ddoubleparetolognormal(x = seq(0, 5, length.out = 100), shape2 = 1)) ## Double-Pareto Lognormal relates to the right-pareto Lognormal distribution if #shape1 goes to infinity pdoubleparetolognormal(q = 6, shape1 = 1e20, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5) prightparetolognormal(q = 6, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5) ## Double-Pareto Lognormal relates to the left-pareto Lognormal distribution if # shape2 goes to infinity pdoubleparetolognormal(q = 6, shape1 = 1.5, shape2 = 1e20, meanlog = -0.5, sdlog = 0.5) pleftparetolognormal(q = 6, shape1 = 1.5, meanlog = -0.5, sdlog = 0.5) ## Double-Pareto Lognormal relates to the Lognormal if both shape parameters go to infinity pdoubleparetolognormal(q = 6, shape1 = 1e20, shape2 = 1e20, meanlog = -0.5, sdlog = 0.5) plnorm(q = 6, meanlog = -0.5, sdlog = 0.5) ## Demonstration of log functionality for probability and quantile function qdoubleparetolognormal(pdoubleparetolognormal(2, log.p = TRUE), log.p = TRUE) ## The zeroth truncated moment is equivalent to the probability function pdoubleparetolognormal(2) mdoubleparetolognormal(truncation = 2) ## The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. x <- rdoubleparetolognormal(1e5, shape2 = 3) mean(x) mdoubleparetolognormal(r = 1, shape2 = 3, lower.tail = FALSE) sum(x[x > quantile(x, 0.1)]) / length(x) mdoubleparetolognormal(r = 1, shape2 = 3, truncation = quantile(x, 0.1), lower.tail = FALSE)
Coefficients of a power-law transformed Double-Pareto Lognormal distribution
doubleparetolognormal_plt( shape1 = 1.5, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5, a = 1, b = 1, inv = FALSE )
doubleparetolognormal_plt( shape1 = 1.5, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5, a = 1, b = 1, inv = FALSE )
shape1 , shape2 , meanlog , sdlog
|
Shapes, mean and variance of the Double-Pareto Lognormal distribution respectively. |
a , b
|
constant and power of power-law transformation, defaults to 1 and 1 respectively. |
inv |
logical indicating whether coefficients of the outcome variable of the power-law transformation should be returned (FALSE) or whether coefficients of the input variable being power-law transformed should be returned (TRUE). Defaults to FALSE. |
If the random variable y is Double-Pareto Lognormal distributed with mean meanlog and standard deviation sdlog, then the power-law transformed variable
is Double-Pareto Lognormal distributed with .
Returns a named list containing
Named vector of coefficients
## Comparing probabilites of power-law transformed transformed variables pdoubleparetolognormal(3,shape1 = 1.5, shape2 = 3, meanlog = -0.5, sdlog = 0.5) coeff = doubleparetolognormal_plt(shape1 = 1.5, shape2 = 3, meanlog = -0.5, sdlog = 0.5,a=5,b=7)$coefficients pdoubleparetolognormal(5*3^7,shape1=coeff[["shape1"]],shape2=coeff[["shape2"]],meanlog=coeff[["meanlog"]],sdlog=coeff[["sdlog"]])
pdoubleparetolognormal(5*0.9^7,shape1 = 1.5, shape2 = 3, meanlog = -0.5, sdlog = 0.5) coeff = doubleparetolognormal_plt(shape1 = 1.5, shape2 = 3, meanlog = -0.5, sdlog = 0.5,a=5,b=7, inv=TRUE)$coefficients pdoubleparetolognormal(0.9,shape1=coeff[["shape1"]],shape2=coeff[["shape2"]],meanlog=coeff[["meanlog"]],sdlog=coeff[["sdlog"]])
Maximum likelihood estimation of the parameters of the Double-Pareto Lognormal distribution.
doubleparetolognormal.mle( x, lower = c(1e-10, 1e-10, 1e-10), upper = c(Inf, Inf, Inf), start = NULL )
doubleparetolognormal.mle( x, lower = c(1e-10, 1e-10, 1e-10), upper = c(Inf, Inf, Inf), start = NULL )
x |
data vector |
lower , upper
|
Upper and lower bounds for the estimation procedure on the parameters c(shape2,shape1,sdlog), defaults to c(1e-10,1e-10,1e-10) and c(Inf,Inf,Inf) respectively. |
start |
named vector with starting values, default to c(shape2=2,shape1=2,sdlog=sd(log(x))) |
Returns a named list containing a
Named vector of coefficients
logical indicator of convergence
Length of the fitted data vector
Nr. of coefficients
x <- rdoubleparetolognormal(1e3) ## Pareto fit with xmin set to the minium of the sample doubleparetolognormal.mle(x = x)
x <- rdoubleparetolognormal(1e3) ## Pareto fit with xmin set to the minium of the sample doubleparetolognormal.mle(x = x)
Density, distribution function, quantile function, and raw moments for the empirical distribution.
dempirical(x, data, log = FALSE) pempirical(q, data, log.p = FALSE, lower.tail = TRUE) qempirical(p, data, lower.tail = TRUE, log.p = FALSE) mempirical(r = 0, data, truncation = NULL, lower.tail = TRUE)
dempirical(x, data, log = FALSE) pempirical(q, data, log.p = FALSE, lower.tail = TRUE) qempirical(p, data, lower.tail = TRUE, log.p = FALSE) mempirical(r = 0, data, truncation = NULL, lower.tail = TRUE)
x , q
|
vector of quantiles |
data |
data vector |
log , log.p
|
logical; if TRUE, probabilities p are given as log(p). |
lower.tail |
logical; if TRUE (default), moments are |
p |
vector of probabilities |
r |
rth raw moment of the Pareto distribution |
truncation |
lower truncation parameter, defaults to NULL. |
The density function is a standard Kernel density estimation for 1e6 equally spaced points. The cumulative Distribution Function:
The y-bounded r-th raw moment of the empirical distribution equals:
dempirical returns the density, pempirical the distribution function, qempirical the quantile function, mempirical gives the rth moment of the distribution or a function that allows to evaluate the rth moment of the distribution if truncation is NULL..
#' ## Generate random sample to work with x <- rlnorm(1e5, meanlog = -0.5, sdlog = 0.5) ## Empirical density plot(x = seq(0, 5, length.out = 100), y = dempirical(x = seq(0, 5, length.out = 100), data = x)) # Compare empirical and parametric quantities dlnorm(0.5, meanlog = -0.5, sdlog = 0.5) dempirical(0.5, data = x) plnorm(0.5, meanlog = -0.5, sdlog = 0.5) pempirical(0.5, data = x) qlnorm(0.5, meanlog = -0.5, sdlog = 0.5) qempirical(0.5, data = x) mlnorm(r = 0, truncation = 0.5, meanlog = -0.5, sdlog = 0.5) mempirical(r = 0, truncation = 0.5, data = x) mlnorm(r = 1, truncation = 0.5, meanlog = -0.5, sdlog = 0.5) mempirical(r = 1, truncation = 0.5, data = x) ## Demonstration of log functionailty for probability and quantile function quantile(x, 0.5, type = 1) qempirical(p = pempirical(q = quantile(x, 0.5, type = 1), data = x, log.p = TRUE), data = x, log.p = TRUE) ## The zeroth truncated moment is equivalent to the probability function pempirical(q = quantile(x, 0.5, type = 1), data = x) mempirical(truncation = quantile(x, 0.5, type = 1), data = x) ## The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. mean(x) mempirical(r = 1, data = x, truncation = 0, lower.tail = FALSE) sum(x[x > quantile(x, 0.1)]) / length(x) mempirical(r = 1, data = x, truncation = quantile(x, 0.1), lower.tail = FALSE) #'
#' ## Generate random sample to work with x <- rlnorm(1e5, meanlog = -0.5, sdlog = 0.5) ## Empirical density plot(x = seq(0, 5, length.out = 100), y = dempirical(x = seq(0, 5, length.out = 100), data = x)) # Compare empirical and parametric quantities dlnorm(0.5, meanlog = -0.5, sdlog = 0.5) dempirical(0.5, data = x) plnorm(0.5, meanlog = -0.5, sdlog = 0.5) pempirical(0.5, data = x) qlnorm(0.5, meanlog = -0.5, sdlog = 0.5) qempirical(0.5, data = x) mlnorm(r = 0, truncation = 0.5, meanlog = -0.5, sdlog = 0.5) mempirical(r = 0, truncation = 0.5, data = x) mlnorm(r = 1, truncation = 0.5, meanlog = -0.5, sdlog = 0.5) mempirical(r = 1, truncation = 0.5, data = x) ## Demonstration of log functionailty for probability and quantile function quantile(x, 0.5, type = 1) qempirical(p = pempirical(q = quantile(x, 0.5, type = 1), data = x, log.p = TRUE), data = x, log.p = TRUE) ## The zeroth truncated moment is equivalent to the probability function pempirical(q = quantile(x, 0.5, type = 1), data = x) mempirical(truncation = quantile(x, 0.5, type = 1), data = x) ## The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. mean(x) mempirical(r = 1, data = x, truncation = 0, lower.tail = FALSE) sum(x[x > quantile(x, 0.1)]) / length(x) mempirical(r = 1, data = x, truncation = quantile(x, 0.1), lower.tail = FALSE) #'
Raw moments for the exponential distribution.
mexp(r = 0, truncation = 0, rate = 1, lower.tail = TRUE)
mexp(r = 0, truncation = 0, rate = 1, lower.tail = TRUE)
r |
rth raw moment of the distribution, defaults to 1. |
truncation |
lower truncation parameter, defaults to 0. |
rate |
rate of the distribution with default values of 1. |
lower.tail |
logical; if TRUE (default), moments are |
Probability and Cumulative Distribution Function:
The y-bounded r-th raw moment of the distribution equals:
where denotes the upper incomplete gamma function.
Returns the truncated rth raw moment of the distribution.
## The zeroth truncated moment is equivalent to the probability function pexp(2, rate = 1) mexp(truncation = 2) ## The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. x <- rexp(1e5, rate = 1) mean(x) mexp(r = 1, lower.tail = FALSE) sum(x[x > quantile(x, 0.1)]) / length(x) mexp(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE)
## The zeroth truncated moment is equivalent to the probability function pexp(2, rate = 1) mexp(truncation = 2) ## The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. x <- rexp(1e5, rate = 1) mean(x) mexp(r = 1, lower.tail = FALSE) sum(x[x > quantile(x, 0.1)]) / length(x) mexp(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE)
A dataset containing 52 distribution fits to the US Census 2000 city size distributions
fit_US_cities
fit_US_cities
A data frame with 52 rows and 7 variables:
distribution
number of components
list of prior weights for the individual distribution components of FMM
list of coefficients for the distributions
Number of paramters
Number of observations
Logical indicating whether the fitting procedure converged
http://doi.org/10.3886/E113328V1
Density, distribution function, quantile function, raw moments and random generation for the Fréchet distribution.
dfrechet(x, shape = 1.5, scale = 0.5, log = FALSE) pfrechet(q, shape = 1.5, scale = 0.5, log.p = FALSE, lower.tail = TRUE) qfrechet(p, shape = 1.5, scale = 0.5, log.p = FALSE, lower.tail = TRUE) mfrechet(r = 0, truncation = 0, shape = 1.5, scale = 0.5, lower.tail = TRUE) rfrechet(n, shape = 1.5, scale = 0.5)
dfrechet(x, shape = 1.5, scale = 0.5, log = FALSE) pfrechet(q, shape = 1.5, scale = 0.5, log.p = FALSE, lower.tail = TRUE) qfrechet(p, shape = 1.5, scale = 0.5, log.p = FALSE, lower.tail = TRUE) mfrechet(r = 0, truncation = 0, shape = 1.5, scale = 0.5, lower.tail = TRUE) rfrechet(n, shape = 1.5, scale = 0.5)
x , q
|
vector of quantiles |
shape , scale
|
Shape and scale of the Fréchet distribution, defaults to 1.5 and 0.5 respectively. |
log , log.p
|
logical; if TRUE, probabilities p are given as log(p). |
lower.tail |
logical; if TRUE (default), probabilities (moments) are |
p |
vector of probabilities |
r |
rth raw moment of the distribution |
truncation |
lower truncation parameter |
n |
number of observations |
Probability and Cumulative Distribution Function:
The y-bounded r-th raw moment of the Fréchet distribution equals:
dfrechet returns the density, pfrechet the distribution function, qfrechet the quantile function, mfrechet the rth moment of the distribution and rfrechet generates random deviates.
The length of the result is determined by n for rfrechet, and is the maximum of the lengths of the numerical arguments for the other functions.
## Frechet density plot(x = seq(0, 5, length.out = 100), y = dfrechet(x = seq(0, 5, length.out = 100), shape = 1, scale = 1)) plot(x = seq(0, 5, length.out = 100), y = dfrechet(x = seq(0, 5, length.out = 100), shape = 2, scale = 1)) plot(x = seq(0, 5, length.out = 100), y = dfrechet(x = seq(0, 5, length.out = 100), shape = 3, scale = 1)) plot(x = seq(0, 5, length.out = 100), y = dfrechet(x = seq(0, 5, length.out = 100), shape = 3, scale = 2)) ## frechet is also called the inverse weibull distribution, which is available in the stats package pfrechet(q = 5, shape = 2, scale = 1.5) 1 - pweibull(q = 1 / 5, shape = 2, scale = 1 / 1.5) ## Demonstration of log functionality for probability and quantile function qfrechet(pfrechet(2, log.p = TRUE), log.p = TRUE) ## The zeroth truncated moment is equivalent to the probability function pfrechet(2) mfrechet(truncation = 2) ## The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. x <- rfrechet(1e5, scale = 1) mean(x) mfrechet(r = 1, lower.tail = FALSE, scale = 1) sum(x[x > quantile(x, 0.1)]) / length(x) mfrechet(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE, scale = 1)
## Frechet density plot(x = seq(0, 5, length.out = 100), y = dfrechet(x = seq(0, 5, length.out = 100), shape = 1, scale = 1)) plot(x = seq(0, 5, length.out = 100), y = dfrechet(x = seq(0, 5, length.out = 100), shape = 2, scale = 1)) plot(x = seq(0, 5, length.out = 100), y = dfrechet(x = seq(0, 5, length.out = 100), shape = 3, scale = 1)) plot(x = seq(0, 5, length.out = 100), y = dfrechet(x = seq(0, 5, length.out = 100), shape = 3, scale = 2)) ## frechet is also called the inverse weibull distribution, which is available in the stats package pfrechet(q = 5, shape = 2, scale = 1.5) 1 - pweibull(q = 1 / 5, shape = 2, scale = 1 / 1.5) ## Demonstration of log functionality for probability and quantile function qfrechet(pfrechet(2, log.p = TRUE), log.p = TRUE) ## The zeroth truncated moment is equivalent to the probability function pfrechet(2) mfrechet(truncation = 2) ## The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. x <- rfrechet(1e5, scale = 1) mean(x) mfrechet(r = 1, lower.tail = FALSE, scale = 1) sum(x[x > quantile(x, 0.1)]) / length(x) mfrechet(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE, scale = 1)
Coefficients of a power-law transformed Fréchet distribution
frechet_plt(shape = 1.5, scale = 0.5, a = 1, b = 1, inv = FALSE)
frechet_plt(shape = 1.5, scale = 0.5, a = 1, b = 1, inv = FALSE)
shape , scale
|
Scale and shape of the Fréchet distribution, defaults to 1.5 and 0.5 respectively. |
a , b
|
constant and power of power-law transformation, defaults to 1 and 1 respectively. |
inv |
logical indicating whether coefficients of the outcome variable of the power-law transformation should be returned (FALSE) or whether coefficients of the input variable being power-law transformed should be returned (TRUE). Defaults to FALSE. |
If the random variable x is Fréchet distributed with scale shape and shape scale, then the power-law transformed variable
is Fréchet distributed with scale and shape
.
Returns a named list containing
Named vector of coefficients
## Comparing probabilites of power-law transformed transformed variables pfrechet(3,shape=2,scale=1) coeff = frechet_plt(shape=2,scale=1,a=5,b=7)$coefficients pfrechet(5*3^7,shape=coeff[["shape"]],scale=coeff[["scale"]])
pfrechet(5*0.8^7,shape=2,scale=1) coeff = frechet_plt(shape=2,scale=1,a=5,b=7,inv=TRUE)$coefficients pfrechet(0.8,shape=coeff[["shape"]],scale=coeff[["scale"]])
Maximum likelihood estimation of the coefficients of the Fréchet distribution
frechet.mle( x, weights = NULL, start = c(shape = 1.5, scale = 0.5), lower = c(1e-10, 1e-10), upper = c(Inf, Inf) )
frechet.mle( x, weights = NULL, start = c(shape = 1.5, scale = 0.5), lower = c(1e-10, 1e-10), upper = c(Inf, Inf) )
x |
data vector |
weights |
numeric vector for weighted MLE, should have the same length as data vector x |
start |
named vector with starting values, default to c(shape=1.5,scale=0.5) |
lower , upper
|
Lower and upper bounds to the estimated shape parameter, defaults to 1e-10 and Inf respectively |
Returns a named list containing a
Named vector of coefficients
logical indicator of convergence
Length of the fitted data vector
Nr. of coefficients
x = rfrechet(1e3)
## Pareto fit with xmin set to the minium of the sample frechet.mle(x=x)
Raw moments for the Gamma distribution.
mgamma( r = 0, truncation = 0, shape = 2, rate = 1, scale = 1/rate, lower.tail = TRUE )
mgamma( r = 0, truncation = 0, shape = 2, rate = 1, scale = 1/rate, lower.tail = TRUE )
r |
rth raw moment of the distribution, defaults to 1. |
truncation |
lower truncation parameter, defaults to 0. |
shape , rate , scale
|
shape, rate and scale of the distribution with default values of 2 and 1 respectively. |
lower.tail |
logical; if TRUE (default), moments are |
Probability and Cumulative Distribution Function:
,
where stands for the upper incomplete gamma function function, while
stands for the lower incomplete Gamma function with upper bound
.
The y-bounded r-th raw moment of the distribution equals:
Provides the truncated rth raw moment of the distribution.
## The zeroth truncated moment is equivalent to the probability function pgamma(2,shape=2,rate=1) mgamma(truncation=2)
## The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. x = rgamma(1e5,shape=2,rate=1) mean(x) mgamma(r=1,lower.tail=FALSE)
sum(x[x>quantile(x,0.1)])/length(x) mgamma(r=1,truncation=quantile(x,0.1),lower.tail=FALSE)
Density, distribution function, quantile function, raw moments and random generation for the Pareto distribution.
dinvpareto(x, k = 1.5, xmax = 5, log = FALSE, na.rm = FALSE) pinvpareto( q, k = 1.5, xmax = 5, lower.tail = TRUE, log.p = FALSE, log = FALSE, na.rm = FALSE ) qinvpareto(p, k = 1.5, xmax = 5, lower.tail = TRUE, log.p = FALSE) minvpareto(r = 0, truncation = 0, k = 1.5, xmax = 5, lower.tail = TRUE) rinvpareto(n, k = 1.5, xmax = 5)
dinvpareto(x, k = 1.5, xmax = 5, log = FALSE, na.rm = FALSE) pinvpareto( q, k = 1.5, xmax = 5, lower.tail = TRUE, log.p = FALSE, log = FALSE, na.rm = FALSE ) qinvpareto(p, k = 1.5, xmax = 5, lower.tail = TRUE, log.p = FALSE) minvpareto(r = 0, truncation = 0, k = 1.5, xmax = 5, lower.tail = TRUE) rinvpareto(n, k = 1.5, xmax = 5)
x , q
|
vector of quantiles |
xmax , k
|
Scale and shape of the Inverse Pareto distribution, defaults to 5 and 1.5 respectively. |
log , log.p
|
logical; if TRUE, probabilities p are given as log(p). |
na.rm |
Removes values that fall outside the support of the distribution |
lower.tail |
logical; if TRUE (default), probabilities (moments) are |
p |
vector of probabilities |
r |
rth raw moment of the Inverse Pareto distribution |
truncation |
lower truncation parameter, defaults to xmin |
n |
number of observations |
Probability and Cumulative Distribution Function:
The y-bounded r-th raw moment of the Inverse Pareto distribution equals:
dinvpareto returns the density, pinvpareto the distribution function, qinvpareto the quantile function, minvpareto the rth moment of the distribution and rinvpareto generates random deviates.
The length of the result is determined by n for rinvpareto, and is the maximum of the lengths of the numerical arguments for the other functions.
## Inverse invpareto density plot(x = seq(0, 5, length.out = 100), y = dinvpareto(x = seq(0, 5, length.out = 100))) ## Demonstration of log functionality for probability and quantile function qinvpareto(pinvpareto(2, log.p = TRUE), log.p = TRUE) ## The zeroth truncated moment is equivalent to the probability function pinvpareto(2) minvpareto(truncation = 2) ## The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. x <- rinvpareto(1e5) mean(x) minvpareto(r = 1, lower.tail = FALSE) sum(x[x > quantile(x, 0.1)]) / length(x) minvpareto(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE)
## Inverse invpareto density plot(x = seq(0, 5, length.out = 100), y = dinvpareto(x = seq(0, 5, length.out = 100))) ## Demonstration of log functionality for probability and quantile function qinvpareto(pinvpareto(2, log.p = TRUE), log.p = TRUE) ## The zeroth truncated moment is equivalent to the probability function pinvpareto(2) minvpareto(truncation = 2) ## The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. x <- rinvpareto(1e5) mean(x) minvpareto(r = 1, lower.tail = FALSE) sum(x[x > quantile(x, 0.1)]) / length(x) minvpareto(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE)
Coefficients of a power-law transformed Inverse Pareto distribution
invpareto_plt(xmax = 5, k = 1.5, a = 1, b = 1, inv = FALSE)
invpareto_plt(xmax = 5, k = 1.5, a = 1, b = 1, inv = FALSE)
xmax , k
|
Scale and shape of the Inverse Pareto distribution, defaults to 5 and 1.5 respectively. |
a , b
|
constant and power of power-law transformation, defaults to 1 and 1 respectively. |
inv |
logical indicating whether coefficients of the outcome variable of the power-law transformation should be returned (FALSE) or whether coefficients of the input variable being power-law transformed should be returned (TRUE). Defaults to FALSE. |
If the random variable x is Inverse Pareto-distributed with scale xmin and shape k, then the power-law transformed variable
is Inverse Pareto distributed with scale and shape
.
Returns a named list containing
Named vector of coefficients
## Comparing probabilites of power-law transformed transformed variables pinvpareto(3,k=2,xmax=5) coeff = invpareto_plt(xmax=5,k=2,a=5,b=7)$coefficients pinvpareto(5*3^7,k=coeff[["k"]],xmax=coeff[["xmax"]])
pinvpareto(5*0.9^7,k=2,xmax=5) coeff = invpareto_plt(xmax=5,k=2,a=5,b=7, inv=TRUE)$coefficients pinvpareto(0.9,k=coeff[["k"]],xmax=coeff[["xmax"]])
Maximum likelihood estimation of the Inverse Pareto shape parameter using the Hill estimator.
invpareto.mle(x, xmax = NULL, clauset = FALSE, q = 1)
invpareto.mle(x, xmax = NULL, clauset = FALSE, q = 1)
x |
data vector |
xmax |
scale parameter of the Inverse Pareto distribution, set to max(x) if not provided |
clauset |
Indicator variable for calculating the scale parameter using the clauset method, overrides provided xmax |
q |
Percentage of data to search over (starting from the smallest values), dafults to 1. |
The Hill estimator equals
Returns a named list containing a
Named vector of coefficients
logical indicator of convergence
Length of the fitted data vector
Nr. of coefficients
x <- rinvpareto(1e3, k = 1.5, xmax = 5) ## Pareto fit with xmin set to the minium of the sample invpareto.mle(x = x) ## Pareto fit with xmin set to its real value invpareto.mle(x = x, xmax = 5) ## Pareto fit with xmin determined by the Clauset method invpareto.mle(x = x, clauset = TRUE)
x <- rinvpareto(1e3, k = 1.5, xmax = 5) ## Pareto fit with xmin set to the minium of the sample invpareto.mle(x = x) ## Pareto fit with xmin set to its real value invpareto.mle(x = x, xmax = 5) ## Pareto fit with xmin determined by the Clauset method invpareto.mle(x = x, clauset = TRUE)
Density, distribution function, quantile function and random generation for the Left-Pareto Lognormal distribution.
dleftparetolognormal(x, shape1 = 1.5, meanlog = -0.5, sdlog = 0.5, log = FALSE) pleftparetolognormal( q, shape1 = 1.5, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE, log.p = FALSE ) qleftparetolognormal( p, shape1 = 1.5, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE, log.p = FALSE ) mleftparetolognormal( r = 0, truncation = 0, shape1 = 1.5, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE ) rleftparetolognormal(n, shape1 = 1.5, meanlog = -0.5, sdlog = 0.5)
dleftparetolognormal(x, shape1 = 1.5, meanlog = -0.5, sdlog = 0.5, log = FALSE) pleftparetolognormal( q, shape1 = 1.5, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE, log.p = FALSE ) qleftparetolognormal( p, shape1 = 1.5, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE, log.p = FALSE ) mleftparetolognormal( r = 0, truncation = 0, shape1 = 1.5, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE ) rleftparetolognormal(n, shape1 = 1.5, meanlog = -0.5, sdlog = 0.5)
x , q
|
vector of quantiles |
shape1 , meanlog , sdlog
|
Shape, mean and variance of the Left-Pareto Lognormal distribution respectively. |
log , log.p
|
logical; if TRUE, probabilities p are given as log(p). |
lower.tail |
logical; if TRUE (default), probabilities (moments) are |
p |
vector of probabilities |
r |
rth raw moment of the Pareto distribution |
truncation |
lower truncation parameter, defaults to xmin |
n |
number of observations |
Probability and Cumulative Distribution Function as provided by (Reed and Jorgensen 2004):
The y-bounded r-th raw moment of the Let-Pareto Lognormal distribution equals:
dleftparetolognormal gives the density, pleftparetolognormal gives the distribution function, qleftparetolognormal gives the quantile function, mleftparetolognormal gives the rth moment of the distribution and rleftparetolognormal generates random deviates.
The length of the result is determined by n for rleftparetolognormal, and is the maximum of the lengths of the numerical arguments for the other functions.
Reed WJ, Jorgensen M (2004). “The Double Pareto-Lognormal Distribution–A New Parametric Model for Size Distributions.” Communications in Statistics - Theory and Methods, 33(8), 1733–1753.
## Left-Pareto Lognormal density plot(x=seq(0,5,length.out=100),y=dleftparetolognormal(x=seq(0,5,length.out=100))) plot(x=seq(0,5,length.out=100),y=dleftparetolognormal(x=seq(0,5,length.out=100),shape1=1))
## Left-Pareto Lognormal relates to the Lognormal if the shape parameter goes to infinity pleftparetolognormal(q=6,shape1=1e20,meanlog=-0.5,sdlog=0.5) plnorm(q=6,meanlog=-0.5,sdlog=0.5)
## Demonstration of log functionality for probability and quantile function qleftparetolognormal(pleftparetolognormal(2,log.p=TRUE),log.p=TRUE)
## The zeroth truncated moment is equivalent to the probability function pleftparetolognormal(2) mleftparetolognormal(truncation=2)
## The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. x = rleftparetolognormal(1e5)
mean(x) mleftparetolognormal(r=1,lower.tail=FALSE)
sum(x[x>quantile(x,0.1)])/length(x) mleftparetolognormal(r=1,truncation=quantile(x,0.1),lower.tail=FALSE)
Coefficients of a power-law transformed Left-Pareto Lognormal distribution
leftparetolognormal_plt( shape1 = 1.5, meanlog = -0.5, sdlog = 0.5, a = 1, b = 1, inv = FALSE )
leftparetolognormal_plt( shape1 = 1.5, meanlog = -0.5, sdlog = 0.5, a = 1, b = 1, inv = FALSE )
shape1 , meanlog , sdlog
|
Shapes, mean and variance of the Left-Pareto Lognormal distribution respectively. |
a , b
|
constant and power of power-law transformation, defaults to 1 and 1 respectively. |
inv |
logical indicating whether coefficients of the outcome variable of the power-law transformation should be returned (FALSE) or whether coefficients of the input variable being power-law transformed should be returned (TRUE). Defaults to FALSE. |
If the random variable y is Left-Pareto Lognormal distributed with mean meanlog and standard deviation sdlog, then the power-law transformed variable
is Left-Pareto Lognormal distributed with .
Returns a named list containing
Named vector of coefficients
## Comparing probabilites of power-law transformed transformed variables pleftparetolognormal(3, shape1 = 1.5, meanlog = -0.5, sdlog = 0.5) coeff <- leftparetolognormal_plt(shape1 = 1.5, meanlog = -0.5, sdlog = 0.5, a = 5, b = 7)$coefficients pleftparetolognormal(5 * 3^7, shape1 = coeff[["shape1"]], meanlog = coeff[["meanlog"]], sdlog = coeff[["sdlog"]]) pleftparetolognormal(5 * 0.9^7, shape1 = 1.5, meanlog = -0.5, sdlog = 0.5) coeff <- leftparetolognormal_plt(shape1 = 1.5, meanlog = -0.5, sdlog = 0.5, a = 5, b = 7, inv = TRUE)$coefficients pleftparetolognormal(0.9, shape1 = coeff[["shape1"]], meanlog = coeff[["meanlog"]], sdlog = coeff[["sdlog"]])
## Comparing probabilites of power-law transformed transformed variables pleftparetolognormal(3, shape1 = 1.5, meanlog = -0.5, sdlog = 0.5) coeff <- leftparetolognormal_plt(shape1 = 1.5, meanlog = -0.5, sdlog = 0.5, a = 5, b = 7)$coefficients pleftparetolognormal(5 * 3^7, shape1 = coeff[["shape1"]], meanlog = coeff[["meanlog"]], sdlog = coeff[["sdlog"]]) pleftparetolognormal(5 * 0.9^7, shape1 = 1.5, meanlog = -0.5, sdlog = 0.5) coeff <- leftparetolognormal_plt(shape1 = 1.5, meanlog = -0.5, sdlog = 0.5, a = 5, b = 7, inv = TRUE)$coefficients pleftparetolognormal(0.9, shape1 = coeff[["shape1"]], meanlog = coeff[["meanlog"]], sdlog = coeff[["sdlog"]])
Maximum likelihood estimation of the parameters of the Left-Pareto Lognormal distribution.
leftparetolognormal.mle( x, lower = c(1e-10, 1e-10), upper = c(Inf, Inf), start = NULL )
leftparetolognormal.mle( x, lower = c(1e-10, 1e-10), upper = c(Inf, Inf), start = NULL )
x |
data vector |
lower , upper
|
Upper and lower bounds for the estimation procedure on the parameters c(shape1,sdlog), defaults to c(1e-10,1e-10) and c(Inf,Inf) respectively. |
start |
named vector with starting values, default to c(shape1=2,sdlog=sd(log(x))) |
Returns a named list containing a
Named vector of coefficients
logical indicator of convergence
Length of the fitted data vector
Nr. of coefficients
x = rleftparetolognormal(1e3)
## Pareto fit with xmin set to the minium of the sample leftparetolognormal.mle(x=x)
Likelihood ratio test for model selection using the Kullback-Leibler information criterion (Vuong 1989)
llr_vuong(x, y, np.x, np.y, corr = c("none", "BIC", "AIC"))
llr_vuong(x, y, np.x, np.y, corr = c("none", "BIC", "AIC"))
x , y
|
vector of log-likelihoods |
np.x , np.y
|
Number of paremeters respectively |
corr |
type of correction for parameters, defaults to none. |
returns data frame with test statistic, p-value and character vector indicating the test outcome.
Vuong QH (1989). “Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses.” Econometrica, 57(2), 307–333.
x <- rlnorm(1e4, meanlog = -0.5, sdlog = 0.5) pareto_fit <- combdist.mle(x = x, dist = "pareto") pareto_loglike <- dcombdist(x = x, dist = "pareto", coeff = pareto_fit$coefficients, log = TRUE) lnorm_fit <- combdist.mle(x = x, dist = "lnorm") lnorm_loglike <- dcombdist(x = x, dist = "lnorm", coeff = lnorm_fit$coefficients, log = TRUE) llr_vuong(x = pareto_loglike, y = lnorm_loglike, np.x = pareto_fit$np, np.y = lnorm_fit$np) # BIC type parameter correction llr_vuong(x = pareto_loglike, y = lnorm_loglike, np.x = pareto_fit$np, np.y = lnorm_fit$np, corr = "BIC") # AIC type parameter correction llr_vuong(x = pareto_loglike, y = lnorm_loglike, np.x = pareto_fit$np, np.y = lnorm_fit$np, corr = "AIC")
x <- rlnorm(1e4, meanlog = -0.5, sdlog = 0.5) pareto_fit <- combdist.mle(x = x, dist = "pareto") pareto_loglike <- dcombdist(x = x, dist = "pareto", coeff = pareto_fit$coefficients, log = TRUE) lnorm_fit <- combdist.mle(x = x, dist = "lnorm") lnorm_loglike <- dcombdist(x = x, dist = "lnorm", coeff = lnorm_fit$coefficients, log = TRUE) llr_vuong(x = pareto_loglike, y = lnorm_loglike, np.x = pareto_fit$np, np.y = lnorm_fit$np) # BIC type parameter correction llr_vuong(x = pareto_loglike, y = lnorm_loglike, np.x = pareto_fit$np, np.y = lnorm_fit$np, corr = "BIC") # AIC type parameter correction llr_vuong(x = pareto_loglike, y = lnorm_loglike, np.x = pareto_fit$np, np.y = lnorm_fit$np, corr = "AIC")
Raw moments for the Lognormal distribution.
mlnorm(r = 0, truncation = 0, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE)
mlnorm(r = 0, truncation = 0, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE)
r |
rth raw moment of the distribution, defaults to 1. |
truncation |
lower truncation parameter, defaults to 0. |
meanlog , sdlog
|
mean and standard deviation of the distribution on the log scale with default values of 0 and 1 respectively. |
lower.tail |
logical; if TRUE (default), moments are |
Probability and Cumulative Distribution Function:
The y-bounded r-th raw moment of the Lognormal distribution equals:
Provides the y-bounded, rth raw moment of the distribution.
## The zeroth truncated moment is equivalent to the probability function plnorm(2, meanlog = -0.5, sdlog = 0.5) mlnorm(truncation = 2) ## The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. x <- rlnorm(1e5, meanlog = -0.5, sdlog = 0.5) mean(x) mlnorm(r = 1, lower.tail = FALSE) sum(x[x > quantile(x, 0.1)]) / length(x) mlnorm(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE)
## The zeroth truncated moment is equivalent to the probability function plnorm(2, meanlog = -0.5, sdlog = 0.5) mlnorm(truncation = 2) ## The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. x <- rlnorm(1e5, meanlog = -0.5, sdlog = 0.5) mean(x) mlnorm(r = 1, lower.tail = FALSE) sum(x[x > quantile(x, 0.1)]) / length(x) mlnorm(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE)
Coefficients of a power-law transformed log normal distribution
lnorm_plt(meanlog = 0, sdlog = 1, a = 1, b = 1, inv = FALSE)
lnorm_plt(meanlog = 0, sdlog = 1, a = 1, b = 1, inv = FALSE)
meanlog , sdlog
|
mean and standard deviation of the log normal distributed variable, defaults to 0 and 1 respectively. |
a , b
|
constant and power of power-law transformation, defaults to 1 and 1 respectively. |
inv |
logical indicating whether coefficients of the outcome variable of the power-law transformation should be returned (FALSE) or whether coefficients of the input variable being power-law transformed should be returned (TRUE). Defaults to FALSE. |
If the random variable y is log normally distributed with mean meanlog and standard deviation sdlog, then the power-law transformed variable
is log normally distributed with mean and standard deviation
.
Returns a named list containing
Named vector of coefficients
## Comparing probabilites of power-law transformed transformed variables plnorm(3,meanlog=-0.5,sdlog=0.5) coeff = lnorm_plt(meanlog=-0.5,sdlog=0.5,a=5,b=7)$coefficients plnorm(5*3^7,meanlog=coeff[["meanlog"]],sdlog=coeff[["sdlog"]])
plnorm(5*0.8^7,meanlog=-0.5,sdlog=0.5) coeff = lnorm_plt(meanlog=-0.5,sdlog=0.5,a=5,b=7,inv=TRUE)$coefficients plnorm(0.8,meanlog=coeff[["meanlog"]],sdlog=coeff[["sdlog"]])
## Comparing the first moments and sample means of power-law transformed variables for large enough samples x = rlnorm(1e5,meanlog=-0.5,sdlog=0.5) coeff = lnorm_plt(meanlog=-0.5,sdlog=0.5,a=2,b=0.5)$coefficients y = rlnorm(1e5,meanlog=coeff[["meanlog"]],sdlog=coeff[["sdlog"]]) mean(2*x^0.5) mean(y) mlnorm(r=1,meanlog=coeff[["meanlog"]],sdlog=coeff[["sdlog"]],lower.tail=FALSE)
Calculates the Normalized Absolute Deviation between the empirical moments and the moments of the provided distribution. Corresponds to the Kolmogorov-Smirnov test statistic for the zeroth moment.
nmad_test( x, r = 0, dist, prior = 1, coeff, stat = c("NULL", "max", "sum"), ... )
nmad_test( x, r = 0, dist, prior = 1, coeff, stat = c("NULL", "max", "sum"), ... )
x |
data vector |
r |
moment parameter |
dist |
character vector containing distribution |
prior |
named list of priors, defaults to 1 |
coeff |
named list of coefficients |
stat |
character vector indicating which statistic should be calculated: none (NULL), the maximum deviation "max" or the sum of deviations "sum". Defaults to NULL. |
... |
Additional arguments can be passed to the parametric moment call. |
x <- rlnorm(1e2, meanlog = -0.5, sdlog = 0.5) nmad_test(x = x, r = 0, dist = "lnorm", coeff = c(meanlog = -0.5, sdlog = 0.5)) nmad_test(x = x, r = 0, dist = "lnorm", coeff = c(meanlog = -0.5, sdlog = 0.5), stat = "max") nmad_test(x = x, r = 0, dist = "lnorm", coeff = c(meanlog = -0.5, sdlog = 0.5), stat = "sum")
x <- rlnorm(1e2, meanlog = -0.5, sdlog = 0.5) nmad_test(x = x, r = 0, dist = "lnorm", coeff = c(meanlog = -0.5, sdlog = 0.5)) nmad_test(x = x, r = 0, dist = "lnorm", coeff = c(meanlog = -0.5, sdlog = 0.5), stat = "max") nmad_test(x = x, r = 0, dist = "lnorm", coeff = c(meanlog = -0.5, sdlog = 0.5), stat = "sum")
Density, distribution function, quantile function, raw moments and random generation for the Pareto distribution.
dpareto(x, k = 2, xmin = 1, log = FALSE, na.rm = FALSE) ppareto(q, k = 2, xmin = 1, lower.tail = TRUE, log.p = FALSE, na.rm = FALSE) qpareto(p, k = 2, xmin = 1, lower.tail = TRUE, log.p = FALSE) mpareto(r = 0, truncation = xmin, k = 2, xmin = 1, lower.tail = TRUE) rpareto(n, k = 2, xmin = 1)
dpareto(x, k = 2, xmin = 1, log = FALSE, na.rm = FALSE) ppareto(q, k = 2, xmin = 1, lower.tail = TRUE, log.p = FALSE, na.rm = FALSE) qpareto(p, k = 2, xmin = 1, lower.tail = TRUE, log.p = FALSE) mpareto(r = 0, truncation = xmin, k = 2, xmin = 1, lower.tail = TRUE) rpareto(n, k = 2, xmin = 1)
x , q
|
vector of quantiles |
xmin , k
|
Scale and shape of the Pareto distribution, defaults to 1 and 2 respectively. |
log , log.p
|
logical; if TRUE, probabilities p are given as log(p). |
na.rm |
Removes values that fall outside the support of the distribution |
lower.tail |
logical; if TRUE (default), probabilities (moments) are |
p |
vector of probabilities |
r |
rth raw moment of the Pareto distribution |
truncation |
lower truncation parameter, defaults to xmin |
n |
number of observations |
Probability and Cumulative Distribution Function:
The y-bounded r-th raw moment of the Pareto distribution equals:
dpareto returns the density, ppareto the distribution function, qpareto the quantile function, mpareto the rth moment of the distribution and rpareto generates random deviates.
The length of the result is determined by n for rpareto, and is the maximum of the lengths of the numerical arguments for the other functions.
## Pareto density plot(x = seq(1, 5, length.out = 100), y = dpareto(x = seq(1, 5, length.out = 100), k = 2, xmin = 1)) ## Pareto relates to the exponential distribution available in the stats package ppareto(q = 5, k = 2, xmin = 3) pexp(q = log(5 / 3), rate = 2) ## Demonstration of log functionality for probability and quantile function qpareto(ppareto(2, log.p = TRUE), log.p = TRUE) ## The zeroth truncated moment is equivalent to the probability function ppareto(2) mpareto(truncation = 2) ## The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. x <- rpareto(1e5) mean(x) mpareto(r = 1, lower.tail = FALSE) sum(x[x > quantile(x, 0.1)]) / length(x) mpareto(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE)
## Pareto density plot(x = seq(1, 5, length.out = 100), y = dpareto(x = seq(1, 5, length.out = 100), k = 2, xmin = 1)) ## Pareto relates to the exponential distribution available in the stats package ppareto(q = 5, k = 2, xmin = 3) pexp(q = log(5 / 3), rate = 2) ## Demonstration of log functionality for probability and quantile function qpareto(ppareto(2, log.p = TRUE), log.p = TRUE) ## The zeroth truncated moment is equivalent to the probability function ppareto(2) mpareto(truncation = 2) ## The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. x <- rpareto(1e5) mean(x) mpareto(r = 1, lower.tail = FALSE) sum(x[x > quantile(x, 0.1)]) / length(x) mpareto(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE)
Coefficients of a power-law transformed Pareto distribution
pareto_plt(xmin = 1, k = 2, a = 1, b = 1, inv = FALSE)
pareto_plt(xmin = 1, k = 2, a = 1, b = 1, inv = FALSE)
xmin , k
|
Scale and shape of the Pareto distribution, defaults to 1 and 2 respectively. |
a , b
|
constant and power of power-law transformation, defaults to 1 and 1 respectively. |
inv |
logical indicating whether coefficients of the outcome variable of the power-law transformation should be returned (FALSE) or whether coefficients of the input variable being power-law transformed should be returned (TRUE). Defaults to FALSE. |
If the random variable x is Pareto-distributed with scale xmin and shape k, then the power-law transformed variable
is Pareto distributed with scale and shape
.
Returns a named list containing
Named vector of coefficients
## Comparing probabilites of power-law transformed transformed variables ppareto(3, k = 2, xmin = 2) coeff <- pareto_plt(xmin = 2, k = 2, a = 5, b = 7)$coefficients ppareto(5 * 3^7, k = coeff[["k"]], xmin = coeff[["xmin"]]) ppareto(5 * 0.9^7, k = 2, xmin = 2) coeff <- pareto_plt(xmin = 2, k = 2, a = 5, b = 7, inv = TRUE)$coefficients ppareto(0.9, k = coeff[["k"]], xmin = coeff[["xmin"]]) ## Comparing the first moments and sample means of power-law transformed variables for #large enough samples x <- rpareto(1e5, k = 2, xmin = 2) coeff <- pareto_plt(xmin = 2, k = 2, a = 2, b = 0.5)$coefficients y <- rpareto(1e5, k = coeff[["k"]], xmin = coeff[["xmin"]]) mean(2 * x^0.5) mean(y) mpareto(r = 1, k = coeff[["k"]], xmin = coeff[["xmin"]], lower.tail = FALSE)
## Comparing probabilites of power-law transformed transformed variables ppareto(3, k = 2, xmin = 2) coeff <- pareto_plt(xmin = 2, k = 2, a = 5, b = 7)$coefficients ppareto(5 * 3^7, k = coeff[["k"]], xmin = coeff[["xmin"]]) ppareto(5 * 0.9^7, k = 2, xmin = 2) coeff <- pareto_plt(xmin = 2, k = 2, a = 5, b = 7, inv = TRUE)$coefficients ppareto(0.9, k = coeff[["k"]], xmin = coeff[["xmin"]]) ## Comparing the first moments and sample means of power-law transformed variables for #large enough samples x <- rpareto(1e5, k = 2, xmin = 2) coeff <- pareto_plt(xmin = 2, k = 2, a = 2, b = 0.5)$coefficients y <- rpareto(1e5, k = coeff[["k"]], xmin = coeff[["xmin"]]) mean(2 * x^0.5) mean(y) mpareto(r = 1, k = coeff[["k"]], xmin = coeff[["xmin"]], lower.tail = FALSE)
Maximum likelihood estimation of the Pareto shape parameter using the Hill estimator.
pareto.mle(x, xmin = NULL, clauset = FALSE, q = 0, lower = 1e-10, upper = Inf)
pareto.mle(x, xmin = NULL, clauset = FALSE, q = 0, lower = 1e-10, upper = Inf)
x |
data vector |
xmin |
scale parameter of the Pareto distribution, set to min(x) if not provided |
clauset |
Indicator variable for calculating the scale parameter using the clauset method, overrides provided xmin |
q |
Percentage of data to search over (starting from the largest values), defaults to 0. |
lower , upper
|
Lower and upper bounds to the estimated shape parameter, defaults to 1e-10 and Inf respectively |
The Hill estimator equals
Returns a named list containing a
Named vector of coefficients
logical indicator of convergence
Length of the fitted data vector
Nr. of coefficients
x <- rpareto(1e3, k = 2, xmin = 2) ## Pareto fit with xmin set to the minium of the sample pareto.mle(x = x) ## Pareto fit with xmin set to its real value pareto.mle(x = x, xmin = 2) ## Pareto fit with xmin determined by the Clauset method pareto.mle(x = x, clauset = TRUE)
x <- rpareto(1e3, k = 2, xmin = 2) ## Pareto fit with xmin set to the minium of the sample pareto.mle(x = x) ## Pareto fit with xmin set to its real value pareto.mle(x = x, xmin = 2) ## Pareto fit with xmin determined by the Clauset method pareto.mle(x = x, clauset = TRUE)
Density, distribution function, quantile function and random generation for the Right-Pareto Lognormal distribution.
drightparetolognormal( x, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5, log = FALSE ) prightparetolognormal( q, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE, log.p = FALSE ) qrightparetolognormal( p, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE, log.p = FALSE ) mrightparetolognormal( r = 0, truncation = 0, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE ) rrightparetolognormal( n, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE )
drightparetolognormal( x, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5, log = FALSE ) prightparetolognormal( q, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE, log.p = FALSE ) qrightparetolognormal( p, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE, log.p = FALSE ) mrightparetolognormal( r = 0, truncation = 0, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE ) rrightparetolognormal( n, shape2 = 1.5, meanlog = -0.5, sdlog = 0.5, lower.tail = TRUE )
x , q
|
vector of quantiles |
shape2 , meanlog , sdlog
|
Shape, mean and variance of the Right-Pareto Lognormal distribution respectively. |
log , log.p
|
logical; if TRUE, probabilities p are given as log(p). |
lower.tail |
logical; if TRUE (default), probabilities (moments) are |
p |
vector of probabilities |
r |
rth raw moment of the Pareto distribution |
truncation |
lower truncation parameter, defaults to xmin |
n |
number of observations |
Probability and Cumulative Distribution Function as provided by (Reed and Jorgensen 2004):
The y-bounded r-th raw moment of the Right-Pareto Lognormal distribution equals:
drightparetolognormal gives the density, prightparetolognormal gives the distribution function, qrightparetolognormal gives the quantile function, mrightparetolognormal gives the rth moment of the distribution and rrightparetolognormal generates random deviates.
The length of the result is determined by n for rrightparetolognormal, and is the maximum of the lengths of the numerical arguments for the other functions.
Reed WJ, Jorgensen M (2004). “The Double Pareto-Lognormal Distribution–A New Parametric Model for Size Distributions.” Communications in Statistics - Theory and Methods, 33(8), 1733–1753.
## Right-Pareto Lognormal density plot(x = seq(0, 5, length.out = 100), y = drightparetolognormal(x = seq(0, 5, length.out = 100))) plot(x = seq(0, 5, length.out = 100), y = drightparetolognormal(x = seq(0, 5, length.out = 100), shape2 = 1)) ## Right-Pareto Lognormal relates to the Lognormal if the shape parameter goes to infinity prightparetolognormal(q = 6, shape2 = 1e20, meanlog = -0.5, sdlog = 0.5) plnorm(q = 6, meanlog = -0.5, sdlog = 0.5) ## Demonstration of log functionality for probability and quantile function qrightparetolognormal(prightparetolognormal(2, log.p = TRUE), log.p = TRUE) ## The zeroth truncated moment is equivalent to the probability function prightparetolognormal(2) mrightparetolognormal(truncation = 2) ## The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. x <- rrightparetolognormal(1e5, shape2 = 3) mean(x) mrightparetolognormal(r = 1, shape2 = 3, lower.tail = FALSE) sum(x[x > quantile(x, 0.1)]) / length(x) mrightparetolognormal(r = 1, shape2 = 3, truncation = quantile(x, 0.1), lower.tail = FALSE)
## Right-Pareto Lognormal density plot(x = seq(0, 5, length.out = 100), y = drightparetolognormal(x = seq(0, 5, length.out = 100))) plot(x = seq(0, 5, length.out = 100), y = drightparetolognormal(x = seq(0, 5, length.out = 100), shape2 = 1)) ## Right-Pareto Lognormal relates to the Lognormal if the shape parameter goes to infinity prightparetolognormal(q = 6, shape2 = 1e20, meanlog = -0.5, sdlog = 0.5) plnorm(q = 6, meanlog = -0.5, sdlog = 0.5) ## Demonstration of log functionality for probability and quantile function qrightparetolognormal(prightparetolognormal(2, log.p = TRUE), log.p = TRUE) ## The zeroth truncated moment is equivalent to the probability function prightparetolognormal(2) mrightparetolognormal(truncation = 2) ## The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. x <- rrightparetolognormal(1e5, shape2 = 3) mean(x) mrightparetolognormal(r = 1, shape2 = 3, lower.tail = FALSE) sum(x[x > quantile(x, 0.1)]) / length(x) mrightparetolognormal(r = 1, shape2 = 3, truncation = quantile(x, 0.1), lower.tail = FALSE)
Coefficients of a power-law transformed Right-Pareto Lognormal distribution
rightparetolognormal_plt( shape2 = 1.5, meanlog = -0.5, sdlog = 0.5, a = 1, b = 1, inv = FALSE )
rightparetolognormal_plt( shape2 = 1.5, meanlog = -0.5, sdlog = 0.5, a = 1, b = 1, inv = FALSE )
shape2 , meanlog , sdlog
|
Shapes, mean and variance of the Right-Pareto Lognormal distribution respectively. |
a , b
|
constant and power of power-law transformation, defaults to 1 and 1 respectively. |
inv |
logical indicating whether coefficients of the outcome variable of the power-law transformation should be returned (FALSE) or whether coefficients of the input variable being power-law transformed should be returned (TRUE). Defaults to FALSE. |
If the random variable y is Right-Pareto Lognormal distributed with mean meanlog and standard deviation sdlog, then the power-law transformed variable
is Right-Pareto Lognormal distributed with .
Returns a named list containing
Named vector of coefficients
## Comparing probabilites of power-law transformed transformed variables prightparetolognormal(3, shape2 = 3, meanlog = -0.5, sdlog = 0.5) coeff = rightparetolognormal_plt( shape2 = 3, meanlog = -0.5, sdlog = 0.5,a=5,b=7)$coefficients prightparetolognormal(5*3^7,shape2=coeff[["shape2"]],meanlog=coeff[["meanlog"]],sdlog=coeff[["sdlog"]])
prightparetolognormal(5*0.9^7,shape2 = 3, meanlog = -0.5, sdlog = 0.5) coeff = rightparetolognormal_plt(shape2 = 3, meanlog = -0.5, sdlog = 0.5,a=5,b=7, inv=TRUE)$coefficients prightparetolognormal(0.9,shape2=coeff[["shape2"]],meanlog=coeff[["meanlog"]],sdlog=coeff[["sdlog"]])
Maximum likelihood estimation of the parameters of the Right-Pareto Lognormal distribution.
rightparetolognormal.mle( x, lower = c(1e-10, 1e-10), upper = c(Inf, Inf), start = NULL )
rightparetolognormal.mle( x, lower = c(1e-10, 1e-10), upper = c(Inf, Inf), start = NULL )
x |
data vector |
lower , upper
|
Upper and lower bounds for the estimation procedure on the parameters c(shape2,sdlog), defaults to c(1e-10,1e-10) and c(Inf,Inf) respectively. |
start |
named vector with starting values, default to c(shape2=2,sdlog=sd(log(x))) |
Returns a named list containing a
Named vector of coefficients
logical indicator of convergence
Length of the fitted data vector
Nr. of coefficients
x <- rrightparetolognormal(1e3) ## Pareto fit with xmin set to the minium of the sample rightparetolognormal.mle(x = x)
x <- rrightparetolognormal(1e3) ## Pareto fit with xmin set to the minium of the sample rightparetolognormal.mle(x = x)
Density, distribution function, quantile function, raw moments and random generation for a truncated distribution.
dtruncdist( x, dist = c("lnormtrunc"), coeff = list(meanlog = 0, sdlog = 1), lowertrunc = 0, uppertrunc = Inf, log = FALSE ) ptruncdist( q, dist = c("lnormtrunc"), coeff = list(meanlog = 0, sdlog = 1), lowertrunc = 0, uppertrunc = Inf, log.p = FALSE, lower.tail = TRUE ) qtruncdist( p, dist = c("lnormtrunc"), coeff = list(meanlog = 0, sdlog = 1), lowertrunc = 0, uppertrunc = Inf, lower.tail = TRUE, log.p = FALSE ) mtruncdist( r, truncation = 0, dist = c("lnormtrunc"), coeff = list(meanlog = 0, sdlog = 1), lowertrunc = 0, uppertrunc = Inf, lower.tail = TRUE ) rtruncdist( n, dist = c("lnormtrunc"), coeff = list(meanlog = 0, sdlog = 1), lowertrunc = 0, uppertrunc = Inf )
dtruncdist( x, dist = c("lnormtrunc"), coeff = list(meanlog = 0, sdlog = 1), lowertrunc = 0, uppertrunc = Inf, log = FALSE ) ptruncdist( q, dist = c("lnormtrunc"), coeff = list(meanlog = 0, sdlog = 1), lowertrunc = 0, uppertrunc = Inf, log.p = FALSE, lower.tail = TRUE ) qtruncdist( p, dist = c("lnormtrunc"), coeff = list(meanlog = 0, sdlog = 1), lowertrunc = 0, uppertrunc = Inf, lower.tail = TRUE, log.p = FALSE ) mtruncdist( r, truncation = 0, dist = c("lnormtrunc"), coeff = list(meanlog = 0, sdlog = 1), lowertrunc = 0, uppertrunc = Inf, lower.tail = TRUE ) rtruncdist( n, dist = c("lnormtrunc"), coeff = list(meanlog = 0, sdlog = 1), lowertrunc = 0, uppertrunc = Inf )
x , q
|
vector of quantiles |
dist |
distribution to be truncated, defaults to lnorm |
coeff |
list of parameters for the truncated distribution, defaults to list(meanlog=0,sdlog=1) |
lowertrunc , uppertrunc
|
lowertrunc- and uppertrunc truncation points, defaults to 0 and Inf respectively |
log , log.p
|
logical; if TRUE, probabilities p are given as log(p). |
lower.tail |
logical; if TRUE (default), probabilities (moments) are |
p |
vector of probabilities |
r |
rth raw moment of the distribution |
truncation |
lowertrunc truncation parameter, defaults to 0. |
n |
number of observations |
Probability and Cumulative Distribution Function:
dtruncdist gives the density, ptruncdist gives the distribution function, qtruncdist gives the quantile function, mtruncdist gives the rth moment of the distribution and rtruncdist generates random deviates.
The length of the result is determined by n for rpareto, and is the maximum of the lengths of the numerical arguments for the other functions.
## Truncated lognormal density plot(x = seq(0.5, 3, length.out = 100), y = dtruncdist(x = seq(0.5, 5, length.out = 100), dist = "lnorm", coeff = list(meanlog = 0.5, sdlog = 0.5), lowertrunc = 0.5, uppertrunc = 5)) lines(x = seq(0, 6, length.out = 100), y = dlnorm(x = seq(0, 6, length.out = 100), meanlog = 0.5, sdlog = 0.5)) # Compare quantities dtruncdist(0.5) dlnorm(0.5) dtruncdist(0.5, lowertrunc = 0.5, uppertrunc = 3) ptruncdist(2) plnorm(2) ptruncdist(2, lowertrunc = 0.5, uppertrunc = 3) qtruncdist(0.25) qlnorm(0.25) qtruncdist(0.25, lowertrunc = 0.5, uppertrunc = 3) mtruncdist(r = 0, truncation = 2) mlnorm(r = 0, truncation = 2, meanlog = 0, sdlog = 1) mtruncdist(r = 0, truncation = 2, lowertrunc = 0.5, uppertrunc = 3) mtruncdist(r = 1, truncation = 2) mlnorm(r = 1, truncation = 2, meanlog = 0, sdlog = 1) mtruncdist(r = 1, truncation = 2, lowertrunc = 0.5, uppertrunc = 3)
## Truncated lognormal density plot(x = seq(0.5, 3, length.out = 100), y = dtruncdist(x = seq(0.5, 5, length.out = 100), dist = "lnorm", coeff = list(meanlog = 0.5, sdlog = 0.5), lowertrunc = 0.5, uppertrunc = 5)) lines(x = seq(0, 6, length.out = 100), y = dlnorm(x = seq(0, 6, length.out = 100), meanlog = 0.5, sdlog = 0.5)) # Compare quantities dtruncdist(0.5) dlnorm(0.5) dtruncdist(0.5, lowertrunc = 0.5, uppertrunc = 3) ptruncdist(2) plnorm(2) ptruncdist(2, lowertrunc = 0.5, uppertrunc = 3) qtruncdist(0.25) qlnorm(0.25) qtruncdist(0.25, lowertrunc = 0.5, uppertrunc = 3) mtruncdist(r = 0, truncation = 2) mlnorm(r = 0, truncation = 2, meanlog = 0, sdlog = 1) mtruncdist(r = 0, truncation = 2, lowertrunc = 0.5, uppertrunc = 3) mtruncdist(r = 1, truncation = 2) mlnorm(r = 1, truncation = 2, meanlog = 0, sdlog = 1) mtruncdist(r = 1, truncation = 2, lowertrunc = 0.5, uppertrunc = 3)
Raw moments for the Weibull distribution.
mweibull(r = 0, truncation = 0, shape = 2, scale = 1, lower.tail = TRUE)
mweibull(r = 0, truncation = 0, shape = 2, scale = 1, lower.tail = TRUE)
r |
rth raw moment of the distribution, defaults to 1. |
truncation |
lower truncation parameter, defaults to 0. |
shape , scale
|
shape and scale of the distribution with default values of 2 and 1 respectively. |
lower.tail |
logical; if TRUE (default), moments are |
Probability and Cumulative Distribution Function:
The y-bounded r-th raw moment of the distribution equals:
where denotes the upper incomplete gamma function.
returns the truncated rth raw moment of the distribution.
## The zeroth truncated moment is equivalent to the probability function pweibull(2, shape = 2, scale = 1) mweibull(truncation = 2) ## The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. x <- rweibull(1e5, shape = 2, scale = 1) mean(x) mweibull(r = 1, lower.tail = FALSE) sum(x[x > quantile(x, 0.1)]) / length(x) mweibull(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE)
## The zeroth truncated moment is equivalent to the probability function pweibull(2, shape = 2, scale = 1) mweibull(truncation = 2) ## The (truncated) first moment is equivalent to the mean of a (truncated) random sample, #for large enough samples. x <- rweibull(1e5, shape = 2, scale = 1) mean(x) mweibull(r = 1, lower.tail = FALSE) sum(x[x > quantile(x, 0.1)]) / length(x) mweibull(r = 1, truncation = quantile(x, 0.1), lower.tail = FALSE)
Coefficients of a power-law transformed Weibull distribution
weibull_plt(scale = 1, shape = 2, a = 1, b = 1, inv = FALSE)
weibull_plt(scale = 1, shape = 2, a = 1, b = 1, inv = FALSE)
shape , scale
|
shape and scale of the distribution with default values of 2 and 1 respectively. |
a , b
|
constant and power of power-law transformation, defaults to 1 and 1 respectively. |
inv |
logical indicating whether coefficients of the outcome variable of the power-law transformation should be returned (FALSE) or whether coefficients of the input variable being power-law transformed should be returned (TRUE). Defaults to FALSE. |
If the random variable y is Weibull distributed with mean meanlog and standard deviation sdlog, then the power-law transformed variable
is Weibull distributed with scale and shape
.
Returns a named list containing
Named vector of coefficients
## Comparing probabilites of power-law transformed transformed variables pweibull(3,shape=2,scale=1) coeff = weibull_plt(shape=2,scale=1,a=5,b=7)$coefficients pweibull(5*3^7,shape=coeff[["shape"]],scale=coeff[["scale"]])
pweibull(5*0.8^7,shape=2,scale=1) coeff = weibull_plt(shape=2,scale=1,a=5,b=7,inv=TRUE)$coefficients pweibull(0.8,shape=coeff[["shape"]],scale=coeff[["scale"]])
## Comparing the first moments and sample means of power-law transformed variables for large enough samples x = rweibull(1e5,shape=2,scale=1) coeff = weibull_plt(shape=2,scale=1,a=2,b=0.5)$coefficients y = rweibull(1e5,shape=coeff[["shape"]],scale=coeff[["scale"]]) mean(2*x^0.5) mean(y) mweibull(r=1,shape=coeff[["shape"]],scale=coeff[["scale"]],lower.tail=FALSE)