In statistics, the matrix variate Dirichlet distribution is a generalization of the matrix variate beta distribution and of the Dirichlet distribution.
Suppose $U_1,\ldots,U_r$ are $p\times p$ positive definite matrices with $I_p-\sum_{i=1}^{r}U_i$ also positive definite, where $I_p$ is the $p\times p$ identity matrix. Then we say that the $U_i$ have a matrix variate Dirichlet distribution, $\left(U_1,\ldots,U_r\right)\sim D_p\left(a_1,\ldots,a_r;a_{r+1}\right)$, if their joint probability density function is
$$\left\{\beta_p\left(a_1,\ldots,a_r,a_{r+1}\right)\right\}^{-1}\prod_{i=1}^{r}\det\left(U_i\right)^{a_i-(p+1)/2}\det\left(I_p-\sum_{i=1}^{r}U_i\right)^{a_{r+1}-(p+1)/2}$$
where $a_i>(p-1)/2$ for $i=1,\ldots,r+1$, and $\beta_p(\cdots)$ is the multivariate beta function.
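Here $\beta_p$ can be written in terms of the multivariate gamma function $\Gamma_p$; the following display is the standard expression, included for self-containedness:

```latex
% Multivariate beta function in terms of the multivariate gamma function
\[
  \beta_p\left(a_1,\ldots,a_k\right)
  = \frac{\prod_{i=1}^{k}\Gamma_p\!\left(a_i\right)}
         {\Gamma_p\!\left(\sum_{i=1}^{k}a_i\right)}.
\]
```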
If we write $U_{r+1}=I_p-\sum_{i=1}^{r}U_i$, then the PDF takes the simpler form

$$\left\{\beta_p\left(a_1,\ldots,a_{r+1}\right)\right\}^{-1}\prod_{i=1}^{r+1}\det\left(U_i\right)^{a_i-(p+1)/2},$$

on the understanding that $\sum_{i=1}^{r+1}U_i=I_p$.
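As a sanity check (a worked special case, not spelled out above): for $p=1$ each $U_i$ is a scalar $u_i\in(0,1)$, so $\det(U_i)=u_i$ and $(p+1)/2=1$, and the density reduces to the ordinary Dirichlet density:

```latex
% Scalar case p = 1: the matrix variate Dirichlet density collapses to the
% ordinary Dirichlet density with parameters a_1, ..., a_{r+1}.
\[
  \left\{\beta_1\left(a_1,\ldots,a_{r+1}\right)\right\}^{-1}
  \prod_{i=1}^{r+1} u_i^{\,a_i-1},
  \qquad \sum_{i=1}^{r+1} u_i = 1 .
\]
```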
Generalization of the chi-squared–Dirichlet result
Suppose $S_i\sim W_p\left(n_i,\Sigma\right)$, $i=1,\ldots,r+1$, are independently distributed Wishart $p\times p$ positive definite matrices. Then, defining $U_i=S^{-1/2}S_i\left(S^{-1/2}\right)^{T}$, where $S=\sum_{i=1}^{r+1}S_i$ is the sum of the matrices and $S^{1/2}\left(S^{1/2}\right)^{T}=S$ is any reasonable factorization of $S$, we have

$$\left(U_1,\ldots,U_r\right)\sim D_p\left(n_1/2,\ldots,n_{r+1}/2\right).$$
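This construction also gives a direct way to simulate from the distribution. Below is a minimal NumPy/SciPy sketch; the helper name `sample_matrix_dirichlet` is illustrative (not from the literature), and the Cholesky factor of $S$ is taken as the factorization $S=S^{1/2}\left(S^{1/2}\right)^{T}$:

```python
# A minimal sketch of the Wishart construction (assumes SciPy is available).
import numpy as np
from scipy.stats import wishart

def sample_matrix_dirichlet(ns, sigma, seed=None):
    """Draw (U_1, ..., U_r) ~ D_p(n_1/2, ..., n_{r+1}/2) via Wishart ratios.

    ns    : degrees of freedom n_1, ..., n_{r+1} (SciPy needs each n_i >= p)
    sigma : common p x p positive definite scale matrix
    """
    rng = np.random.default_rng(seed)
    # Independent Wishart draws S_1, ..., S_{r+1}
    S_list = [wishart(df=n, scale=sigma).rvs(random_state=rng) for n in ns]
    S = sum(S_list)
    # Cholesky factor L with S = L L^T, so we use S^{-1/2} = L^{-1}
    L_inv = np.linalg.inv(np.linalg.cholesky(S))
    # U_i = S^{-1/2} S_i (S^{-1/2})^T; the last one is I_p minus the rest
    return [L_inv @ Si @ L_inv.T for Si in S_list[:-1]]

# Example: r = 2 components of size p = 3; the eigenvalues of
# I_p - U_1 - U_2 should all be strictly positive.
Us = sample_matrix_dirichlet(ns=[5, 6, 7], sigma=np.eye(3), seed=0)
print(np.linalg.eigvalsh(np.eye(3) - sum(Us)))
```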
Marginal distribution
If $\left(U_1,\ldots,U_r\right)\sim D_p\left(a_1,\ldots,a_{r+1}\right)$ and $s\leq r$, then

$$\left(U_1,\ldots,U_s\right)\sim D_p\left(a_1,\ldots,a_s,\sum_{i=s+1}^{r+1}a_i\right).$$
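In particular, taking $s=1$ exhibits each single component as matrix variate beta distributed (a special case of the statement above, under the usual identification of $D_p$ with one matrix argument as the matrix variate beta type I distribution):

```latex
% Marginal of a single component (s = 1):
\[
  U_1 \sim D_p\!\left(a_1,\; \sum_{i=2}^{r+1} a_i\right),
\]
% a matrix variate beta (type I) distribution, consistent with the opening
% remark that this distribution generalizes the matrix variate beta.
```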
Conditional distribution
Also, with the same notation as above, the density of $\left(U_{s+1},\ldots,U_r\right)\mid\left(U_1,\ldots,U_s\right)$ is given by

$$\frac{\prod_{i=s+1}^{r+1}\det\left(U_i\right)^{a_i-(p+1)/2}}{\beta_p\left(a_{s+1},\ldots,a_{r+1}\right)\det\left(I_p-\sum_{i=1}^{s}U_i\right)^{\sum_{i=s+1}^{r+1}a_i-(p+1)/2}},$$

where we write $U_{r+1}=I_p-\sum_{i=1}^{r}U_i$.
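The normalizing constant here follows from dividing the joint density by the marginal density above; a sketch of the bookkeeping, using the expression of $\beta_p$ via $\Gamma_p$ given earlier:

```latex
% Ratio of multivariate beta functions arising from joint / marginal:
\[
  \frac{\beta_p\left(a_1,\ldots,a_{r+1}\right)}
       {\beta_p\!\left(a_1,\ldots,a_s,\;\sum_{i=s+1}^{r+1}a_i\right)}
  = \frac{\prod_{i=s+1}^{r+1}\Gamma_p\!\left(a_i\right)}
         {\Gamma_p\!\left(\sum_{i=s+1}^{r+1}a_i\right)}
  = \beta_p\!\left(a_{s+1},\ldots,a_{r+1}\right),
\]
% which is exactly the beta factor in the denominator of the conditional density.
```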
Partitioned distribution
Suppose $\left(U_1,\ldots,U_r\right)\sim D_p\left(a_1,\ldots,a_{r+1}\right)$ and suppose that $S_1,\ldots,S_t$ is a partition of $[r+1]=\{1,\ldots,r+1\}$ (that is, $\cup_{i=1}^{t}S_i=[r+1]$ and $S_i\cap S_j=\emptyset$ if $i\neq j$). Then, writing $U_{(j)}=\sum_{i\in S_j}U_i$ and $a_{(j)}=\sum_{i\in S_j}a_i$ (with $U_{r+1}=I_p-\sum_{i=1}^{r}U_i$), we have

$$\left(U_{(1)},\ldots,U_{(t)}\right)\sim D_p\left(a_{(1)},\ldots,a_{(t)}\right).$$
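One way to see why this holds is through the Wishart construction above (a sketch, assuming all $S_i$ share the common scale $\Sigma$): independent Wishart matrices with a common scale sum to a Wishart matrix whose degrees of freedom add, so each aggregated block has the same Wishart-ratio form:

```latex
% Aggregation under the Wishart construction U_i = S^{-1/2} S_i (S^{-1/2})^T:
\[
  U_{(j)} = S^{-1/2}\Bigl(\sum_{i\in S_j} S_i\Bigr)\bigl(S^{-1/2}\bigr)^{T},
  \qquad
  \sum_{i\in S_j} S_i \sim W_p\!\Bigl(\sum_{i\in S_j} n_i,\;\Sigma\Bigr),
\]
% so the aggregated blocks satisfy the same construction with degrees of
% freedom summed within each S_j, matching a_{(j)} = sum of a_i over S_j.
```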
Suppose $\left(U_1,\ldots,U_r\right)\sim D_p\left(a_1,\ldots,a_{r+1}\right)$. Define

$$U_i=\begin{pmatrix}U_{11(i)}&U_{12(i)}\\U_{21(i)}&U_{22(i)}\end{pmatrix},\qquad i=1,\ldots,r,$$

where $U_{11(i)}$ is $p_1\times p_1$ and $U_{22(i)}$ is $p_2\times p_2$. Writing the Schur complement $U_{22\cdot 1(i)}=U_{22(i)}-U_{21(i)}U_{11(i)}^{-1}U_{12(i)}$, we have

$$\left(U_{11(1)},\ldots,U_{11(r)}\right)\sim D_{p_1}\left(a_1,\ldots,a_{r+1}\right)$$

and

$$\left(U_{22\cdot 1(1)},\ldots,U_{22\cdot 1(r)}\right)\sim D_{p_2}\left(a_1-p_1/2,\ldots,a_r-p_1/2,a_{r+1}-p_1/2+p_1r/2\right).$$
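For completeness, a minimal NumPy sketch of the block decomposition and Schur complement used in this statement (the function name `schur_blocks` is illustrative):

```python
# Split a (p1+p2) x (p1+p2) matrix U into blocks and form the Schur
# complement U_{22.1} = U_22 - U_21 U_11^{-1} U_12.
import numpy as np

def schur_blocks(U, p1):
    U11, U12 = U[:p1, :p1], U[:p1, p1:]
    U21, U22 = U[p1:, :p1], U[p1:, p1:]
    # solve(U11, U12) computes U11^{-1} U12 without forming the inverse
    U22_1 = U22 - U21 @ np.linalg.solve(U11, U12)
    return U11, U22_1
```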