Generalized Dirichlet distribution

In statistics, the generalized Dirichlet distribution (GD) is a generalization of the Dirichlet distribution with a more general covariance structure and almost twice the number of parameters. Random vectors with a GD distribution are completely neutral.^[1]

The density function of $p_{1},\ldots ,p_{{k-1}}$ is

\left[\prod _{{i=1}}^{{k-1}}B(a_{i},b_{i})\right]^{{-1}}p_{k}^{{b_{{k-1}}-1}}\prod _{{i=1}}^{{k-1}}\left[p_{i}^{{a_{i}-1}}\left(\sum _{{j=i}}^{k}p_{j}\right)^{{b_{{i-1}}-(a_{i}+b_{i})}}\right]

where we define $p_{k}=1-\sum _{{i=1}}^{{k-1}}p_{i}$ . Here $B(x,y)$ denotes the Beta function. This reduces to the standard Dirichlet distribution if $b_{{i-1}}=a_{i}+b_{i}$ for $2\leqslant i\leqslant k-1$ ( $b_{0}$ is arbitrary).

For example, if $k=4$ , then the density function of $p_{1},p_{2},p_{3}$ is

\left[\prod _{{i=1}}^{{3}}B(a_{i},b_{i})\right]^{{-1}}p_{1}^{{a_{1}-1}}p_{2}^{{a_{2}-1}}p_{3}^{{a_{3}-1}}p_{4}^{{b_{3}-1}}\left(p_{2}+p_{3}+p_{4}\right)^{{b_{1}-\left(a_{2}+b_{2}\right)}}\left(p_{3}+p_{4}\right)^{{b_{2}-\left(a_{3}+b_{3}\right)}}

where $p_{1}+p_{2}+p_{3}<1$ and $p_{4}=1-p_{1}-p_{2}-p_{3}$ .
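The density above can be evaluated numerically. The following is a minimal sketch in Python using only the standard library; the function names `log_beta` and `gd_logpdf` are illustrative and do not come from the cited papers. It works on the log scale and skips the $i=1$ factor, whose base $\sum _{j=1}^{k}p_{j}$ equals 1, so that $b_{0}$ is never needed.

```python
import math


def log_beta(x, y):
    """Logarithm of the Beta function B(x, y)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)


def gd_logpdf(p, a, b):
    """Log-density of (p_1, ..., p_{k-1}) in the Connor-Mosimann form.

    p, a and b each hold k-1 values; p_k is implied as 1 - sum(p).
    Returns -inf outside the open simplex.
    """
    if not (len(p) == len(a) == len(b)):
        raise ValueError("p, a and b must all have length k-1")
    k = len(p) + 1
    p_full = list(p) + [1.0 - sum(p)]
    if any(v <= 0.0 for v in p_full):
        return float("-inf")

    # Normalizing constant and the p_k^(b_{k-1} - 1) factor.
    result = -sum(log_beta(ai, bi) for ai, bi in zip(a, b))
    result += (b[-1] - 1.0) * math.log(p_full[-1])

    for i in range(k - 1):  # 0-based i corresponds to index i+1 in the formula
        result += (a[i] - 1.0) * math.log(p_full[i])
        if i >= 1:
            # Tail sum p_i + ... + p_k; for the first index it is 1, so skipped.
            tail = sum(p_full[i:])
            result += (b[i - 1] - (a[i] + b[i])) * math.log(tail)
    return result


if __name__ == "__main__":
    # k = 4 example with arbitrary illustrative parameters.
    print(gd_logpdf([0.1, 0.2, 0.3], a=[2.0, 3.0, 4.0], b=[3.0, 2.0, 5.0]))
```

Choosing `b = [12.0, 9.0, 5.0]` with `a = [2.0, 3.0, 4.0]` satisfies $b_{i-1}=a_{i}+b_{i}$ , and the function then agrees with the log-density of a standard Dirichlet distribution with parameters $(2,3,4,5)$ .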

Connor and Mosimann define the PDF in this way for the following reason. Define random variables $z_{1},\ldots ,z_{{k-1}}$ with $z_{1}=p_{1},z_{2}=p_{2}/\left(1-p_{1}\right),z_{3}=p_{3}/\left(1-(p_{1}+p_{2})\right),\ldots ,z_{i}=p_{i}/\left(1-(p_{1}+\cdots +p_{{i-1}})\right)$ . Then $p_{1},\ldots ,p_{k}$ have the generalized Dirichlet distribution as parametrized above if the $z_{i}$ are independent beta random variables with parameters $a_{i},b_{i}$ , $i=1,\ldots ,k-1$ .
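This construction also gives a direct way to draw samples: generate the independent beta variables and invert the transformation, so that $p_{i}=z_{i}\prod _{m<i}(1-z_{m})$ . The following is a minimal sketch using Python's standard `random` module; the function name `sample_gd` is illustrative.

```python
import random


def sample_gd(a, b, rng=random):
    """Draw one vector (p_1, ..., p_k) from the generalized Dirichlet.

    a and b hold the k-1 beta parameters. Each z_i ~ Beta(a_i, b_i) is the
    share of the not-yet-allocated mass that goes to component i.
    """
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    remaining = 1.0  # equals 1 - (p_1 + ... + p_{i-1})
    p = []
    for ai, bi in zip(a, b):
        z = rng.betavariate(ai, bi)
        p.append(z * remaining)
        remaining *= 1.0 - z
    p.append(remaining)  # p_k takes whatever mass is left
    return p


if __name__ == "__main__":
    rng = random.Random(0)  # seeded generator for reproducibility
    print(sample_gd([2.0, 3.0, 4.0], [5.0, 6.0, 7.0], rng))
```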

Alternative form given by Wong

Wong^[2] gives a slightly more concise form, valid for $x_{1}+\cdots +x_{k}\leqslant 1$ :

\prod _{{i=1}}^{k}{\frac {x_{i}^{{\alpha _{i}-1}}\left(1-x_{1}-\ldots -x_{i}\right)^{{\gamma _{i}}}}{B(\alpha _{i},\beta _{i})}}

where $\gamma _{j}=\beta _{j}-\alpha _{{j+1}}-\beta _{{j+1}}$ for $1\leqslant j\leqslant k-1$ and $\gamma _{k}=\beta _{k}-1$ . Note that Wong defines a distribution over a $k$ -dimensional space (implicitly defining $x_{{k+1}}=1-\sum _{{i=1}}^{k}x_{i}$ ), while Connor and Mosimann use a $k-1$ -dimensional space with $x_{k}=1-\sum _{{i=1}}^{{k-1}}x_{i}$ .
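As a worked instance of Wong's form, take $k=2$ , so that $\gamma _{1}=\beta _{1}-\alpha _{2}-\beta _{2}$ and $\gamma _{2}=\beta _{2}-1$ . The density of $x_{1},x_{2}$ is then

{\frac {x_{1}^{{\alpha _{1}-1}}\left(1-x_{1}\right)^{{\beta _{1}-\alpha _{2}-\beta _{2}}}\,x_{2}^{{\alpha _{2}-1}}\left(1-x_{1}-x_{2}\right)^{{\beta _{2}-1}}}{B(\alpha _{1},\beta _{1})\,B(\alpha _{2},\beta _{2})}}

which coincides with the Connor and Mosimann density for three proportions $p_{1},p_{2},p_{3}$ on identifying $a_{i}=\alpha _{i}$ , $b_{i}=\beta _{i}$ , $p_{1}=x_{1}$ , $p_{2}=x_{2}$ and $p_{3}=1-x_{1}-x_{2}$ , since $p_{2}+p_{3}=1-x_{1}$ .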

General moment function

If $X=\left(X_{1},\ldots ,X_{k}\right)\sim GD_{k}\left(\alpha _{1},\ldots ,\alpha _{k};\beta _{1},\ldots ,\beta _{k}\right)$ , then

E\left[X_{1}^{{r_{1}}}X_{2}^{{r_{2}}}\cdots X_{k}^{{r_{k}}}\right]=\prod _{{j=1}}^{k}{\frac {\Gamma \left(\alpha _{j}+\beta _{j}\right)\Gamma \left(\alpha _{j}+r_{j}\right)\Gamma \left(\beta _{j}+\delta _{j}\right)}{\Gamma \left(\alpha _{j}\right)\Gamma \left(\beta _{j}\right)\Gamma \left(\alpha _{j}+\beta _{j}+r_{j}+\delta _{j}\right)}}

where $\delta _{j}=r_{{j+1}}+r_{{j+2}}+\cdots +r_{k}$ for $j=1,2,\ldots ,k-1$ and $\delta _{k}=0$ . Thus

E\left(X_{j}\right)={\frac {\alpha _{j}}{\alpha _{j}+\beta _{j}}}\prod _{{m=1}}^{{j-1}}{\frac {\beta _{m}}{\alpha _{m}+\beta _{m}}}.
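The general moment function translates directly into code. The sketch below, with the illustrative names `gd_log_moment` and `gd_mean`, evaluates the product on the log scale with `math.lgamma` and accumulates $\delta _{j}$ by iterating from $j=k$ down to $j=1$ . It uses Wong's parametrization with $k$ pairs $(\alpha _{j},\beta _{j})$ .

```python
import math


def gd_log_moment(alpha, beta, r):
    """Log of E[X_1^r_1 * ... * X_k^r_k] for X ~ GD_k(alpha; beta)."""
    if not (len(alpha) == len(beta) == len(r)):
        raise ValueError("alpha, beta and r must all have length k")
    total = 0.0
    delta = 0.0  # delta_k = 0; grows to r_{j+1} + ... + r_k as j decreases
    for j in range(len(alpha) - 1, -1, -1):
        aj, bj, rj = alpha[j], beta[j], r[j]
        total += (
            math.lgamma(aj + bj) + math.lgamma(aj + rj) + math.lgamma(bj + delta)
            - math.lgamma(aj) - math.lgamma(bj)
            - math.lgamma(aj + bj + rj + delta)
        )
        delta += rj
    return total


def gd_mean(alpha, beta, j):
    """E(X_j) from the closed form; j is 1-based as in the text."""
    mean = alpha[j - 1] / (alpha[j - 1] + beta[j - 1])
    for m in range(j - 1):
        mean *= beta[m] / (alpha[m] + beta[m])
    return mean


if __name__ == "__main__":
    alpha, beta = [2.0, 3.0], [4.0, 5.0]
    # Both routes give E(X_2) = (3/8) * (4/6), approximately 0.25.
    print(math.exp(gd_log_moment(alpha, beta, [0, 1])))
    print(gd_mean(alpha, beta, 2))
```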

Reduction to standard Dirichlet distribution

As stated above, if $b_{{i-1}}=a_{i}+b_{i}$ for $2\leqslant i\leqslant k-1$ then the distribution reduces to a standard Dirichlet distribution. This condition differs from the usual pattern, in which setting the additional parameters of a generalized distribution to zero recovers the original distribution. In the case of the GD, however, a parametrization of that kind results in a very complicated density function.
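To see the reduction explicitly, note that under the condition $b_{{i-1}}=a_{i}+b_{i}$ for $2\leqslant i\leqslant k-1$ every exponent $b_{{i-1}}-(a_{i}+b_{i})$ in the Connor and Mosimann density vanishes, leaving $p_{k}^{{b_{{k-1}}-1}}\prod _{{i=1}}^{{k-1}}p_{i}^{{a_{i}-1}}$ . The normalizing constant telescopes, because $\Gamma (a_{i}+b_{i})=\Gamma (b_{{i-1}})$ cancels against the preceding factor:

\prod _{{i=1}}^{{k-1}}{\frac {1}{B(a_{i},b_{i})}}=\prod _{{i=1}}^{{k-1}}{\frac {\Gamma (a_{i}+b_{i})}{\Gamma (a_{i})\Gamma (b_{i})}}={\frac {\Gamma (a_{1}+b_{1})}{\Gamma (b_{{k-1}})\prod _{{i=1}}^{{k-1}}\Gamma (a_{i})}}

Since $a_{1}+b_{1}=a_{1}+\cdots +a_{{k-1}}+b_{{k-1}}$ under the same condition, this is the density of a standard Dirichlet distribution with parameters $(a_{1},\ldots ,a_{{k-1}},b_{{k-1}})$ .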

Bayesian analysis

Suppose $X=\left(X_{1},\ldots ,X_{k}\right)\sim GD_{k}\left(\alpha _{1},\ldots ,\alpha _{k};\beta _{1},\ldots ,\beta _{k}\right)$ is generalized Dirichlet, and that $Y|X$ is multinomial with $n$ trials (here $Y=\left(Y_{1},\ldots ,Y_{k}\right)$ ). Writing $Y_{j}=y_{j}$ for $1\leqslant j\leqslant k$ and $y_{{k+1}}=n-\sum _{{i=1}}^{k}y_{i}$ the joint posterior of $X|Y$ is a generalized Dirichlet distribution with

X|Y\sim GD_{k}\left({\alpha '}_{1},\ldots ,{\alpha '}_{k};{\beta '}_{1},\ldots ,{\beta '}_{k}\right)

where ${\alpha '}_{j}=\alpha _{j}+y_{j}$ and ${\beta '}_{j}=\beta _{j}+\sum _{{i=j+1}}^{{k+1}}y_{i}$ for $1\leqslant j\leqslant k.$
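The conjugate update is a simple bookkeeping step. The following minimal sketch (the function name `gd_posterior` is illustrative) takes the prior parameters and the $k+1$ observed counts $y_{1},\ldots ,y_{{k+1}}$ and returns the posterior parameters.

```python
def gd_posterior(alpha, beta, y):
    """Posterior GD parameters after multinomial counts y_1, ..., y_{k+1}.

    alpha and beta have length k; y has length k+1.
    alpha'_j = alpha_j + y_j
    beta'_j  = beta_j + y_{j+1} + ... + y_{k+1}
    """
    k = len(alpha)
    if len(beta) != k or len(y) != k + 1:
        raise ValueError("expected len(beta) == k and len(y) == k + 1")
    alpha_post = [alpha[j] + y[j] for j in range(k)]
    # y[j + 1:] holds the counts of all categories after category j+1 (1-based).
    beta_post = [beta[j] + sum(y[j + 1:]) for j in range(k)]
    return alpha_post, beta_post


if __name__ == "__main__":
    # k = 2, n = 10 trials with counts (3, 2, 5).
    print(gd_posterior([1, 1], [1, 1], [3, 2, 5]))  # ([4, 3], [8, 6])
```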

Sampling experiment

Wong gives the following system as an example of how the Dirichlet and generalized Dirichlet distributions differ. He posits that a large urn contains balls of $k+1$ different colours. The proportion of each colour is unknown. Write $X=(X_{1},\ldots ,X_{k})$ , where $X_{j}$ is the proportion of balls of colour $j$ in the urn.

Experiment 1. Analyst 1 believes that $X\sim D(\alpha _{1},\ldots ,\alpha _{k},\alpha _{{k+1}})$ (i.e., $X$ is Dirichlet with parameters $\alpha _{i}$ ). The analyst then makes $k+1$ glass boxes and puts $\alpha _{i}$ marbles of colour $i$ in box $i$ (it is assumed that the $\alpha _{i}$ are integers $\geq 1$ ). Analyst 1 then draws a ball from the urn, observes its colour (say colour $j$ ) and puts it in box $j$ . He can identify the correct box because the boxes are transparent and the colours of the marbles within are visible. The process continues until $n$ balls have been drawn. The posterior distribution is then Dirichlet, with parameters equal to the number of marbles and balls in each box.

Experiment 2. Analyst 2 believes that $X$ follows a generalized Dirichlet distribution: $X\sim GD(\alpha _{1},\ldots ,\alpha _{k};\beta _{1},\ldots ,\beta _{k})$ . All parameters are again assumed to be positive integers. The analyst makes $k+1$ wooden boxes. Each box has two areas: one for balls and one for marbles. The balls are coloured but the marbles are not. Then for $j=1,\ldots ,k$ , he puts $\alpha _{j}$ balls of colour $j$ , and $\beta _{j}$ marbles, into box $j$ . He then puts a ball of colour $k+1$ in box $k+1$ . The analyst then draws a ball from the urn. Because the boxes are wooden, the analyst cannot tell which box to put the ball in (as he could in experiment 1 above); he also has a poor memory and cannot remember which box contains which colour of balls. He has to discover which box is the correct one. He does this by opening box 1 and comparing the balls in it to the drawn ball. If the colours differ, the box is the wrong one: the analyst puts a marble (not the ball) in box 1 and proceeds to box 2. He repeats the process until the balls in the box match the drawn ball, at which point he puts the ball (not a marble) in the box with the other balls of matching colour. The analyst then draws another ball from the urn and repeats until $n$ balls have been drawn. The posterior is then generalized Dirichlet, with parameters $\alpha$ being the number of balls, and $\beta$ the number of marbles, in each box.

Note that in experiment 2, changing the order of the boxes has a non-trivial effect, unlike experiment 1.
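The box procedure of experiment 2 can be simulated directly. The sketch below (the function name `run_experiment_2` is illustrative) processes a sequence of drawn colours, labelled $1,\ldots ,k+1$ , and returns the ball and marble counts for boxes $1,\ldots ,k$ . Each drawn ball of colour $c$ adds one marble to every box before box $c$ and one ball to box $c$ , which reproduces the posterior parameters $\alpha '_{j}$ and $\beta '_{j}$ .

```python
def run_experiment_2(alpha, beta, draws):
    """Simulate the wooden-box procedure for draws with colours in 1..k+1.

    Returns (balls, marbles) for boxes 1..k, i.e. the posterior alpha and beta.
    """
    k = len(alpha)
    if len(beta) != k:
        raise ValueError("alpha and beta must both have length k")
    balls = list(alpha) + [1]  # box k+1 starts with one ball of colour k+1
    marbles = list(beta)       # only boxes 1..k ever receive marbles
    for colour in draws:
        if not 1 <= colour <= k + 1:
            raise ValueError("colour labels must lie in 1..k+1")
        # Every box opened before the matching one gets a marble.
        for box in range(1, colour):
            marbles[box - 1] += 1
        # The matching box gets the drawn ball.
        balls[colour - 1] += 1
    return balls[:k], marbles


if __name__ == "__main__":
    # k = 2; three draws of colour 1, two of colour 2, five of colour 3.
    draws = [1, 1, 1, 2, 2, 3, 3, 3, 3, 3]
    print(run_experiment_2([1, 1], [1, 1], draws))  # ([4, 3], [8, 6])
```

Relabelling the colours, which corresponds to reordering the boxes, changes the marble counts and hence the $\beta$ parameters, whereas in experiment 1 it would only permute the $\alpha$ parameters.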

References

1. R. J. Connor and J. E. Mosimann (1969). Concepts of independence for proportions with a generalization of the Dirichlet distribution. Journal of the American Statistical Association, volume 64, pp. 194–206.
2. T.-T. Wong (1998). Generalized Dirichlet distribution in Bayesian analysis. Applied Mathematics and Computation, volume 97, pp. 165–181.

This article is issued from Wikipedia - version of the 2/9/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.