In probability theory, the law of total covariance,[1] covariance decomposition formula, or conditional covariance formula states that if X, Y, and Z are random variables on the same probability space, and the covariance of X and Y is finite, then
$$\operatorname{cov}(X,Y)=\operatorname{E}(\operatorname{cov}(X,Y\mid Z))+\operatorname{cov}(\operatorname{E}(X\mid Z),\operatorname{E}(Y\mid Z)).$$
The nomenclature in this article's title parallels the phrase law of total variance. Some writers on probability call this the "conditional covariance formula"[2] or use other names.
Note: The conditional expected values E(X | Z) and E(Y | Z) are random variables whose values depend on the value of Z. The conditional expected value of X given the event Z = z is a function of z: if we write E(X | Z = z) = g(z), then the random variable E(X | Z) is g(Z). Similar comments apply to the conditional covariance.
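The identity can be checked exactly on a small discrete example. The following is a minimal sketch, not taken from the cited sources: the joint distribution `pmf` and the helper names `expect`, `cov` and `conditional` are made up for illustration, and exact rational arithmetic is used so that both sides can be compared without rounding error.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint pmf of (X, Y, Z); the probabilities are illustrative only.
pmf = {
    (0, 0, 0): Fraction(1, 8),
    (1, 2, 0): Fraction(3, 8),
    (1, 0, 1): Fraction(1, 4),
    (3, 1, 1): Fraction(1, 4),
}


def expect(f, dist):
    """Expected value of f(x, y, z) under the distribution dist."""
    return sum(p * f(x, y, z) for (x, y, z), p in dist.items())


def cov(dist):
    """cov(X, Y) = E[XY] - E[X] E[Y] under the distribution dist."""
    ex = expect(lambda x, y, z: x, dist)
    ey = expect(lambda x, y, z: y, dist)
    exy = expect(lambda x, y, z: x * y, dist)
    return exy - ex * ey


# Marginal distribution of Z.
pz = defaultdict(Fraction)
for (_, _, z), p in pmf.items():
    pz[z] += p


def conditional(z0):
    """Joint distribution of (X, Y, Z) given the event Z = z0."""
    return {k: p / pz[z0] for k, p in pmf.items() if k[2] == z0}


# g(z) = E(X | Z = z), h(z) = E(Y | Z = z), c(z) = cov(X, Y | Z = z).
g = {z0: expect(lambda x, y, z: x, conditional(z0)) for z0 in pz}
h = {z0: expect(lambda x, y, z: y, conditional(z0)) for z0 in pz}
c = {z0: cov(conditional(z0)) for z0 in pz}

# E(cov(X, Y | Z)): average the conditional covariances over Z.
expected_cond_cov = sum(pz[z0] * c[z0] for z0 in pz)

# cov(E(X | Z), E(Y | Z)): covariance of the random variables g(Z) and h(Z).
mean_g = sum(pz[z0] * g[z0] for z0 in pz)
mean_h = sum(pz[z0] * h[z0] for z0 in pz)
cov_of_cond_means = sum(pz[z0] * g[z0] * h[z0] for z0 in pz) - mean_g * mean_h

lhs = cov(pmf)
rhs = expected_cond_cov + cov_of_cond_means
print(lhs, rhs)
```

Here the dictionaries `g`, `h` and `c` play the role of the functions of z described in the note, and the two printed values agree.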
The law of total covariance can be proved using the law of total expectation: First,
$$\operatorname{cov}(X,Y)=\operatorname{E}[XY]-\operatorname{E}[X]\operatorname{E}[Y]$$
from a simple standard identity on covariances. Then we apply the law of total expectation by conditioning on the random variable Z:
$$=\operatorname{E}\big[\operatorname{E}[XY\mid Z]\big]-\operatorname{E}\big[\operatorname{E}[X\mid Z]\big]\operatorname{E}\big[\operatorname{E}[Y\mid Z]\big]$$
Now we rewrite the term inside the first expectation using the definition of covariance:
$$=\operatorname{E}\big[\operatorname{cov}(X,Y\mid Z)+\operatorname{E}[X\mid Z]\operatorname{E}[Y\mid Z]\big]-\operatorname{E}\big[\operatorname{E}[X\mid Z]\big]\operatorname{E}\big[\operatorname{E}[Y\mid Z]\big]$$
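The substitution in this step relies on the conditional version of the covariance identity, rearranged to isolate the product term. Written out in the same notation (a restatement for clarity, not an additional result):

$$\operatorname{cov}(X,Y\mid Z)=\operatorname{E}[XY\mid Z]-\operatorname{E}[X\mid Z]\operatorname{E}[Y\mid Z]\quad\Longrightarrow\quad\operatorname{E}[XY\mid Z]=\operatorname{cov}(X,Y\mid Z)+\operatorname{E}[X\mid Z]\operatorname{E}[Y\mid Z]$$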
Since expectation of a sum is the sum of expectations, we can regroup the terms:
$$=\operatorname{E}\big[\operatorname{cov}(X,Y\mid Z)\big]+\operatorname{E}\big[\operatorname{E}[X\mid Z]\operatorname{E}[Y\mid Z]\big]-\operatorname{E}\big[\operatorname{E}[X\mid Z]\big]\operatorname{E}\big[\operatorname{E}[Y\mid Z]\big]$$
Finally, we recognize the final two terms as the covariance of the conditional expectations E[X | Z] and E[Y | Z]:
$$=\operatorname{E}\big[\operatorname{cov}(X,Y\mid Z)\big]+\operatorname{cov}\big(\operatorname{E}[X\mid Z],\operatorname{E}[Y\mid Z]\big)$$
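As a consistency check on the result, taking Y = X and using cov(X, X) = var(X) reduces the identity to the law of total variance, which the name of this law parallels:

$$\operatorname{var}(X)=\operatorname{E}\big[\operatorname{var}(X\mid Z)\big]+\operatorname{var}\big(\operatorname{E}[X\mid Z]\big)$$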
Notes and references
1. Matthew R. Rudary, On Predictive Linear Gaussian Models, ProQuest, 2009, page 121.
2. Sheldon M. Ross, A First Course in Probability, sixth edition, Prentice Hall, 2002, page 392.