52.1 Introduction to distrib

Package distrib contains a set of functions for making probability computations on both discrete and continuous univariate models.

What follows is a short reminder of basic probability definitions.

Let f(x) be the density function of an absolutely continuous random variable X. The distribution function is defined as

                       x
                      /
                      [
               F(x) = I     f(u) du
                      ]
                      /
                       minf

which equals the probability Pr(X <= x).
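
As a quick numerical check of this definition, the following sketch integrates the standard normal density up to a point and compares the result with the package's distribution function. The choice of the normal model, the point x0 = 1, and the use of quad_qagi from Maxima's QUADPACK interface are assumptions made for illustration only.

```maxima
load("distrib")$

/* Illustrative evaluation point (an arbitrary choice). */
x0 : 1$

/* quad_qagi returns a list; its first element is the integral estimate. */
res : quad_qagi(pdf_normal(u, 0, 1), u, minf, x0)$
F_numeric : first(res);

/* Value given by the package's distribution function. */
F_package : float(cdf_normal(x0, 0, 1));
```

The two values should agree up to the integration tolerance.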

The mean value is a location parameter and is defined as

                     inf
                    /
                    [
           E[X]  =  I   x f(x) dx
                    ]
                    /
                     minf

The variance is a measure of variation,

                 inf
                /
                [                    2
         V[X] = I     f(x) (x - E[X])  dx
                ]
                /
                 minf

which is a positive real number. The square root of the variance is the standard deviation, D[X]=sqrt(V[X]), and it is another measure of variation.

The skewness coefficient is a measure of asymmetry,

                 inf
                /
            1   [                    3
  SK[X] = ----- I     f(x) (x - E[X])  dx
              3 ]
          D[X]  /
                 minf

The kurtosis coefficient, defined here with 3 subtracted (the excess kurtosis), is traditionally described as measuring the peakedness of the distribution,

                 inf
                /
            1   [                    4
  KU[X] = ----- I     f(x) (x - E[X])  dx - 3
              4 ]
          D[X]  /
                 minf

If X is Gaussian, KU[X]=0. In fact, both skewness and kurtosis are shape parameters used to measure the non-Gaussianity of a distribution.

If the random variable X is discrete, the density, or probability, function f(x) takes positive values within a certain countable set of numbers x_i, and zero elsewhere. In this case, the distribution function is
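
The definitions above can be tried directly with integrate. The sketch below uses the standard normal density as an illustrative choice and compares the integrals with the corresponding package functions; the variable names f, m, v, d, sk and ku are hypothetical.

```maxima
load("distrib")$

/* Density of the standard normal model, N(0,1). */
f : pdf_normal(x, 0, 1)$

/* Mean and variance from their integral definitions. */
m : ratsimp(integrate(x * f, x, minf, inf));
v : ratsimp(integrate((x - m)^2 * f, x, minf, inf));

/* Standard deviation, then skewness and kurtosis coefficients. */
d  : sqrt(v)$
sk : ratsimp(integrate((x - m)^3 * f, x, minf, inf) / d^3);
ku : ratsimp(integrate((x - m)^4 * f, x, minf, inf) / d^4 - 3);

/* The same quantities as returned by the package. */
[mean_normal(0, 1), var_normal(0, 1), skewness_normal(0, 1), kurtosis_normal(0, 1)];
```

For this model the kurtosis coefficient should come out as zero, in line with the remark about Gaussian variables.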

                       ====
                       \
                F(x) =  >    f(x )
                       /        i
                       ====
                      x <= x
                       i

The mean, variance, standard deviation, skewness coefficient and kurtosis coefficient take the form

                       ====
                       \
                E[X] =  >  x  f(x ) ,
                       /    i    i
                       ====
                        x 
                         i
                ====
                \                     2
        V[X] =   >    f(x ) (x - E[X])  ,
                /        i    i
                ====
                 x
                  i
               D[X] = sqrt(V[X]),
                     ====
              1      \                     3
  SK[X] =  -------    >    f(x ) (x - E[X])  
           D[X]^3    /        i    i
                     ====
                      x
                       i

and

                     ====
              1      \                     4
  KU[X] =  -------    >    f(x ) (x - E[X])   - 3 ,
           D[X]^4    /        i    i
                     ====
                      x
                       i

respectively.
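
In the discrete case the sums can be evaluated term by term. The following sketch does so for a binomial model with n = 4 and p = 1/2 (an arbitrary illustrative choice) and compares the results with mean_binomial and var_binomial; the variable names are hypothetical.

```maxima
load("distrib")$

/* Illustrative parameters of the binomial model. */
n : 4$
p : 1/2$

/* Probabilities f(x_i) on the support 0, 1, ..., n. */
probs : makelist(pdf_binomial(k, n, p), k, 0, n);

/* Mean and variance from the sums in the definitions. */
m : sum(k * pdf_binomial(k, n, p), k, 0, n);
v : sum((k - m)^2 * pdf_binomial(k, n, p), k, 0, n);

/* The same quantities as returned by the package. */
[mean_binomial(n, p), var_binomial(n, p)];
```

Since the binomial mean is n p and its variance is n p (1 - p), both approaches should give 2 and 1 here.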

There is a naming convention in package distrib. Every function name has two parts. The first part refers to the function or parameter to be calculated:

Functions:
   Density function            (pdf_*)
   Distribution function       (cdf_*)
   Quantile                    (quantile_*)
   Mean                        (mean_*)
   Variance                    (var_*)
   Standard deviation          (std_*)
   Skewness coefficient        (skewness_*)
   Kurtosis coefficient        (kurtosis_*)
   Random variate              (random_*)

The second part is an explicit reference to the probabilistic model:

Continuous distributions:
   Normal              (*normal)
   Student             (*student_t)
   Chi^2               (*chi2)
   Noncentral Chi^2    (*noncentral_chi2)
   F                   (*f)
   Exponential         (*exp)
   Lognormal           (*lognormal)
   Gamma               (*gamma)
   Beta                (*beta)
   Continuous uniform  (*continuous_uniform)
   Logistic            (*logistic)
   Pareto              (*pareto)
   Weibull             (*weibull)
   Rayleigh            (*rayleigh)
   Laplace             (*laplace)
   Cauchy              (*cauchy)
   Gumbel              (*gumbel)

Discrete distributions:
   Binomial             (*binomial)
   Poisson              (*poisson)
   Bernoulli            (*bernoulli)
   Geometric            (*geometric)
   Discrete uniform     (*discrete_uniform)
   Hypergeometric       (*hypergeometric)
   Negative binomial    (*negative_binomial)
   Finite discrete      (*general_finite_discrete)

For example, pdf_student_t(x,n) is the density function of the Student distribution with n degrees of freedom, std_pareto(a,b) is the standard deviation of the Pareto distribution with parameters a and b, and kurtosis_poisson(m) is the kurtosis coefficient of the Poisson distribution with mean m.

To use package distrib, first load it by typing
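
A few calls that follow this naming scheme are sketched below. The numeric arguments are arbitrary illustrative values, chosen so that each quantity is defined (for instance, a Pareto shape parameter larger than 2 so that the standard deviation exists).

```maxima
load("distrib")$

/* Density of the Student distribution with 3 degrees of freedom at x = 1/2. */
pdf_student_t(1/2, 3);

/* Standard deviation of the Pareto distribution with parameters a = 3, b = 1. */
std_pareto(3, 1);

/* Kurtosis coefficient of the Poisson distribution with mean 4. */
kurtosis_poisson(4);

/* 0.975 quantile of the standard normal distribution. */
float(quantile_normal(0.975, 0, 1));

/* Five random variates from the standard normal distribution. */
random_normal(0, 1, 5);
```

The last call returns different values on each run, since it draws pseudo-random variates.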

(%i1) load("distrib")$

For comments, bugs or suggestions, please contact the author at ’riotorto AT yahoo DOT com’.

