# Multivariate Probability Distributions

Multivariate probability theory studies the joint distribution of several random variables.

Random variables are functions that map the outcomes of a random experiment to real numbers. Whenever a collection of random variables is mentioned, they are assumed to be defined on the same sample space.

## Joint Probability Distribution

### Discrete Case

In the discrete case, the joint probability distribution of a pair of random variables $(X, Y)$ can be described by the joint probability function $\{ p_{i,j} \}_{i,j \in \mathbb{N}}$, where

$$p_{i,j} = P(X = x_i, Y = y_j).$$

Note that we should have $p_{i,j} \geq 0$ for all $i, j \in \mathbb{N}$ and that

$$\sum_{i,j} p_{i,j} = 1.$$
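As a concrete sketch with hypothetical values, a finite joint probability function can be stored as a table keyed by support points, which makes both conditions easy to verify:

```python
# Hypothetical joint probability function for (X, Y), with X taking
# values in {0, 1} and Y in {0, 1, 2}.
p = {
    (0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
    (1, 0): 0.15, (1, 1): 0.25, (1, 2): 0.20,
}

# Every p_{i,j} is non-negative, and the entries sum to 1.
assert all(v >= 0 for v in p.values())
total = sum(p.values())
print(total)
```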

### Continuous Case

In the continuous case, the joint probability distribution of a pair of random variables $(X, Y)$ can be described by a non-negative joint probability density function $f_{X,Y}(x,y)$ such that, for any subset $A \subset \mathbb{R}^2$, we have

$$P((X,Y) \in A) = \iint_A f_{X,Y}(x,y) \, dx \, dy.$$

Note that we should have

$$\iint_{\mathbb{R}^2} f_{X,Y}(x,y) \, dx \, dy = 1.$$
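A numeric sanity check of this condition (a sketch, using the simple hypothetical density $f(x,y) = e^{-(x+y)}$ on the positive quadrant): a midpoint Riemann sum over a truncated grid should come out close to 1.

```python
import numpy as np

# f(x, y) = e^{-(x + y)} for x, y > 0 integrates to 1 over the plane.
# Approximate the double integral with a midpoint Riemann sum on
# [0, 20]^2; the tail beyond 20 is negligible (on the order of e^{-20}).
h = 0.01
m = np.arange(h / 2, 20, h)      # cell midpoints
X, Y = np.meshgrid(m, m)
total = float(np.sum(np.exp(-(X + Y))) * h * h)
print(round(total, 3))           # close to 1.0
```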

## Joint Cumulative Distribution Function

The joint cumulative distribution function of a random vector $(X, Y)$ is defined as

$$F_{X,Y}(x,y) = P(X \leq x, Y \leq y)$$

for $x, y \in \mathbb{R}$.

In the discrete case, we have that

$$F_{X,Y}(x,y) = \sum_{x_i \leq x} \sum_{y_j \leq y} p_{i,j}.$$
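In code, the discrete joint CDF is just a filtered sum over the support (hypothetical values; a sketch):

```python
# Hypothetical joint probability function for (X, Y).
p = {
    (0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
    (1, 0): 0.15, (1, 1): 0.25, (1, 2): 0.20,
}

def F(x, y):
    # Sum p_{i,j} over all support points with x_i <= x and y_j <= y.
    return sum(v for (xi, yj), v in p.items() if xi <= x and yj <= y)

print(F(0, 1))   # mass at (0, 0) and (0, 1)
print(F(1, 2))   # all of the mass
```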

In the continuous case, we have that

$$F_{X,Y}(x,y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{X,Y}(u,v) \, dv \, du$$

and

$$\frac{\partial^2 F_{X,Y}(x,y)}{\partial x \, \partial y} = f_{X,Y}(x,y).$$
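A finite-difference check of this identity (a sketch, using the hypothetical pair $F(x,y) = (1 - e^{-x})(1 - e^{-y})$ and $f(x,y) = e^{-(x+y)}$ on the positive quadrant):

```python
import numpy as np

# For F(x, y) = (1 - e^{-x})(1 - e^{-y}), the mixed second partial
# should recover the density f(x, y) = e^{-(x + y)}.
def F(x, y):
    return (1 - np.exp(-x)) * (1 - np.exp(-y))

def f(x, y):
    return np.exp(-(x + y))

# Cross difference quotient approximating d^2 F / (dx dy) at (x0, y0).
h = 1e-4
x0, y0 = 0.7, 1.3
mixed = (F(x0 + h, y0 + h) - F(x0 + h, y0)
         - F(x0, y0 + h) + F(x0, y0)) / h**2
print(round(mixed, 4), round(f(x0, y0), 4))
```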

**Example**

Let $X$ and $Y$ be random variables with joint probability density function

$$f_{X,Y}(x,y) = \begin{cases} c x y e^{-(x + y)}, & \text{if } x, y > 0, \\ 0, & \text{otherwise}. \end{cases}$$

Determine the value of $c$ and compute $P(X + Y \geq 1)$.

**Solution**

We first determine the value of $c$ by integrating the joint probability density function over the whole plane and setting the result equal to $1$:

$$\int_0^\infty \int_0^\infty c x y e^{-(x + y)} \, dx \, dy = 1.$$

The integrand factors, so we can evaluate the inner integral over $x$ first:

$$\int_0^\infty c y e^{-y} \left( \int_0^\infty x e^{-x} \, dx \right) dy = 1.$$

The inner integral is $\Gamma(2) = 1! = 1$, which leaves

$$c \int_0^\infty y e^{-y} \, dy = c \, \Gamma(2) = c = 1.$$

Therefore, $c = 1$.

We now compute $P(X + Y \geq 1)$. The region $\{ x + y \geq 1 \}$ is unbounded, so it is easier to work with the complement:

$$P(X + Y \geq 1) = 1 - P(X + Y < 1) = 1 - \int_0^1 \int_0^{1-y} x y e^{-(x + y)} \, dx \, dy.$$

Using $\int_0^a x e^{-x} \, dx = 1 - (1 + a) e^{-a}$ with $a = 1 - y$, the inner integral is $1 - (2 - y) e^{-(1 - y)}$. Since $y e^{-y} \cdot (2 - y) e^{-(1 - y)} = y (2 - y) e^{-1}$,

$$P(X + Y < 1) = \int_0^1 y e^{-y} \, dy - e^{-1} \int_0^1 y (2 - y) \, dy = \left( 1 - 2 e^{-1} \right) - \tfrac{2}{3} e^{-1} = 1 - \tfrac{8}{3} e^{-1}.$$

Therefore, $P(X + Y \geq 1) = \tfrac{8}{3} e^{-1} \approx 0.981$.
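A Monte Carlo sanity check of this answer (a sketch): the density factors as $(x e^{-x})(y e^{-y})$, so $X$ and $Y$ are independent $\mathrm{Gamma}(2, 1)$ random variables, which NumPy can sample directly.

```python
import numpy as np

# X and Y are independent Gamma(2, 1) draws; estimate P(X + Y >= 1)
# and compare with the exact value (8/3) e^{-1}.
rng = np.random.default_rng(0)
n = 1_000_000
x = rng.gamma(shape=2.0, scale=1.0, size=n)
y = rng.gamma(shape=2.0, scale=1.0, size=n)

estimate = float(np.mean(x + y >= 1))
exact = (8 / 3) * float(np.exp(-1))   # ~ 0.9810
print(round(estimate, 3), round(exact, 3))
```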

## Marginal Probability Distribution

Consider a pair of random variables $(X, Y)$.

### Discrete Case

The marginal probability distribution of $X$ is given by

$$p_i = \sum_{j} P(X = x_i, Y = y_j) = \sum_{j} p_{i,j}.$$
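In code, marginalizing a discrete joint table means summing out the other index (hypothetical values; a sketch):

```python
# Hypothetical joint probability function for (X, Y).
p = {
    (0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
    (1, 0): 0.15, (1, 1): 0.25, (1, 2): 0.20,
}

# p_i = sum over j of p_{i,j}: collapse the Y index.
p_X = {}
for (xi, yj), v in p.items():
    p_X[xi] = p_X.get(xi, 0.0) + v

print(p_X)   # marginal pmf of X; its values sum to 1
```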

### Continuous Case

The marginal probability distribution of $X$ is given by

$$f_X(x) = \int_{\mathbb{R}} f_{X,Y}(x,y) \, dy.$$
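As a numeric illustration (a sketch, using the density $f_{X,Y}(x,y) = x y e^{-(x+y)}$ from the example above): integrating out $y$ should give the marginal $f_X(x) = x e^{-x}$.

```python
import numpy as np

# Marginalize f(x, y) = x y e^{-(x + y)} over y with a midpoint
# Riemann sum; analytically f_X(x) = x e^{-x}.
h = 0.001
y = np.arange(h / 2, 30, h)      # midpoints on [0, 30]

def marginal_x(x):
    return float(np.sum(x * y * np.exp(-(x + y))) * h)

for x0 in (0.5, 1.0, 2.0):
    print(round(marginal_x(x0), 4), round(float(x0 * np.exp(-x0)), 4))
```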

### Marginal Cumulative Distribution Function

The marginal cumulative distribution function of $X$ is given by

$$F_X(x) = P(X \leq x) = \int_{-\infty}^x f_X(u) \, du.$$

The joint distribution determines the marginal distributions, but the converse is not true: different joint distributions can have the same marginals.

## Conditional Probability Distribution

### Discrete Case

The conditional probability distribution of $X$ given $Y = y_j$ is given by

$$p_{i|j} = P(X = x_i \mid Y = y_j) = \frac{P(X = x_i, Y = y_j)}{P(Y = y_j)} = \frac{\text{joint}}{\text{marginal}}.$$

### Continuous Case

The conditional probability distribution of $X$ given $Y = y$ is given by

$$f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x,y)}{f_Y(y)} = \frac{\text{joint}}{\text{marginal}}.$$
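For the example density $f_{X,Y}(x,y) = x y e^{-(x+y)}$ above, the marginal of $Y$ is $y e^{-y}$, and the ratio gives $f_{X|Y}(x \mid y) = x e^{-x}$, which does not depend on $y$ (in fact $X$ and $Y$ are independent). A quick sketch:

```python
import numpy as np

def joint(x, y):
    # Joint density x y e^{-(x + y)} for x, y > 0.
    return x * y * np.exp(-(x + y))

def marginal_y(y):
    # Marginal density of Y: y e^{-y}.
    return y * np.exp(-y)

def conditional(x, y):
    # f_{X|Y}(x | y) = joint / marginal.
    return joint(x, y) / marginal_y(y)

# The conditional density of X given Y = y is the same for every y.
for y0 in (0.5, 1.0, 3.0):
    print(round(float(conditional(1.2, y0)), 4))
```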

It is important to be careful in distinguishing between types of conditional probability distributions. Suppose $X$ and $Y$ are continuous random variables. The conditional probability $P(X \leq x_i \mid Y = y_j)$ is different from $P(X \leq x_i \mid Y \geq y_j)$.

For the latter, we can use the definition of conditional probability to write

$$P(X \leq x_i \mid Y \geq y_j) = \frac{P(X \leq x_i, Y \geq y_j)}{P(Y \geq y_j)}.$$

But for the former, this definition cannot be applied directly, since $P(Y = y_j) = 0$. Instead, we use

$$P(X \leq x_i \mid Y = y_j) = \int_{-\infty}^{x_i} f_{X|Y}(u \mid y_j) \, du.$$
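Continuing the example, $f_{X|Y}(x \mid y) = x e^{-x}$, so $P(X \leq 1 \mid Y = y) = \int_0^1 x e^{-x} \, dx = 1 - 2 e^{-1}$ for every $y$. A numeric sketch:

```python
import numpy as np

# Integrate the conditional density x e^{-x} from 0 to 1 with a
# midpoint Riemann sum and compare with 1 - 2 e^{-1}.
h = 1e-5
x = np.arange(h / 2, 1, h)       # midpoints on [0, 1]
prob = float(np.sum(x * np.exp(-x)) * h)
exact = 1 - 2 * float(np.exp(-1))   # ~ 0.2642
print(round(prob, 4), round(exact, 4))
```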