# Probability and Stats Cheat Sheet (1)

### 0- Conventions

1- r-Vectors in matrix operations are considered as column vectors (matrices):

x\in \mathbb{R}^r\equiv\begin{bmatrix}x_1\\x_2\\x_3\\ \vdots \\ x_r \end{bmatrix}\in\mathbb{R}^{r\times 1} \implies x^{\rm T}=\begin{bmatrix}x_1&x_2&x_3& \dots & x_r \end{bmatrix}

2- Column vector inner/dot product:

x\cdot y \equiv (x)^\text T(y)=(y)^\text T(x)=x^\text Ty=y^\text Tx\ \in \mathbb{R} \quad \forall x,y\in \mathbb{R}^{r\times 1}

### 1- Population expected value and covariance matrix

1- X a random r-vector [X_1, \dots, X_r]^\text{T} :

\mu_X= \text E[X]:=[\text E[X_1],\dots,\text E[X_r]]^\text T=[\mu_1,\dots,\mu_r]^\text T\\ \ \\ \text{E}[X^\text T]:=(\text E[X])^\text T\\ \ \\ \Sigma_{XX}:=\underset{\color{blue}{\text{or Var}(X)}}{\text{Cov}}(X,X):=\text E\big[(X-\mu_X)(X-\mu_X)^\text T\big]=[\sigma_{ij}^2]_{i,j=1}^{r}\ \in\mathbb{R}^{r\times r}

where, \sigma_{ij}^2=\text{var}(X_i)=\text E[(X_i-\mu_i)^2] \ \text { for }i=j, and \sigma_{ij}^2=\text{cov}(X_i,X_j)=\text E[(X_i-\mu_i) (X_j-\mu_j) ] \ \text { for } i\ne j.

\Sigma_{XX}=\text E[XX^\text T]-\mu_X\mu_X^\text T\\ \ \\ \Sigma_{XX}=\text E[XX^\text T] \iff \mu_X=0
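The two expressions for \Sigma_{XX} can be checked on a small example. The sketch below assumes a hypothetical toy distribution in which X takes three values in \mathbb{R}^2 with probability 1/3 each, so every expectation is a finite average; `Fraction` keeps the arithmetic exact. The names `outcomes`, `sigma` and `shortcut` are illustrative only.

```python
from fractions import Fraction

# Hypothetical toy distribution: X takes each value below with probability 1/3.
outcomes = [(1, 2), (3, 4), (5, 9)]
n, r = len(outcomes), len(outcomes[0])

# mu_X = E[X], computed componentwise.
mu = [Fraction(sum(x[i] for x in outcomes), n) for i in range(r)]

# Sigma_XX = E[(X - mu_X)(X - mu_X)^T]
sigma = [[sum((x[i] - mu[i]) * (x[j] - mu[j]) for x in outcomes) / n
          for j in range(r)] for i in range(r)]

# E[X X^T] - mu_X mu_X^T
second_moment = [[Fraction(sum(x[i] * x[j] for x in outcomes), n)
                  for j in range(r)] for i in range(r)]
shortcut = [[second_moment[i][j] - mu[i] * mu[j] for j in range(r)]
            for i in range(r)]

print(sigma == shortcut)
```

Here the diagonal of `sigma` holds the variances of X_1 and X_2 and the off-diagonal entries hold their covariance.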

2- For random vectors X\in \mathbb{R}^{r\times 1} and Y\in\mathbb{R}^{s\times 1}:

\Sigma_{XY}:=\text{Cov}(X,Y):=\text E\big [(X-\mu_X)(Y-\mu_Y)^\text T\big]=\Sigma_{YX}^\text T\ \in\mathbb{R}^{r\times s}\\ \ \\ \Sigma_{XY}=\text E\big [XY^\text T\big]-\mu_X\mu_Y^\text T\\ \ \\ \Sigma_{XY}=\text E\big [XY^\text T\big] \iff \mu_X=0 \ \text{ or } \ \mu_Y=0

Remark: \Sigma_{XX} is a symmetric matrix but \Sigma_{XY} is not necessarily symmetric.
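As a rough illustration of the remark, the sketch below computes a cross-covariance matrix for a hypothetical pair (X, Y) with X in \mathbb{R}^2 and Y in \mathbb{R}^3. It assumes three equally likely joint outcomes; the helpers `mean`, `cov` and `transpose` are illustrative, not from any library.

```python
from fractions import Fraction

def mean(vs):
    """E[V] for equally likely outcomes vs."""
    return [Fraction(sum(v[i] for v in vs), len(vs)) for i in range(len(vs[0]))]

def cov(xs, ys):
    """Cov(X, Y) when the pairs (xs[k], ys[k]) are equally likely."""
    mx, my = mean(xs), mean(ys)
    return [[sum((x[i] - mx[i]) * (y[j] - my[j]) for x, y in zip(xs, ys)) / len(xs)
             for j in range(len(my))] for i in range(len(mx))]

def transpose(m):
    return [list(row) for row in zip(*m)]

# Hypothetical joint outcomes: (xs[k], ys[k]) occurs with probability 1/3.
xs = [(1, 2), (3, 4), (5, 9)]             # X in R^2
ys = [(0, 1, 1), (1, 0, 2), (2, 2, 0)]    # Y in R^3

sigma_xy = cov(xs, ys)   # 2 x 3
sigma_yx = cov(ys, xs)   # 3 x 2
sigma_xx = cov(xs, xs)   # 2 x 2

print(sigma_xy == transpose(sigma_yx))   # Sigma_XY = Sigma_YX^T
print(sigma_xx == transpose(sigma_xx))   # Sigma_XX is symmetric
```

With r \ne s, as here, \Sigma_{XY} is not even square, so symmetry cannot hold.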

3- Y \in \mathbb {R}^{s \times 1} linearly related to X \in \mathbb {R}^{r \times 1} as Y=AX+b with A\in \mathbb {R}^{s \times r} and b\in \mathbb {R}^{s \times 1} a constant:

\mu_Y=A\mu_X + b \ \in \mathbb{R}^{s \times 1}\\ \ \\ \Sigma_{YY}=A\Sigma_{XX} A^\text T\ \in \mathbb{R}^{s \times s}
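The affine rules can be verified by transforming every outcome of a toy distribution and comparing against the formulas. This is a minimal sketch that assumes X is uniform on three hypothetical points; `A`, `b` and all helper names are chosen purely for illustration.

```python
from fractions import Fraction

def mean(vs):
    """E[V] for equally likely outcomes vs."""
    return [Fraction(sum(v[i] for v in vs), len(vs)) for i in range(len(vs[0]))]

def cov(xs, ys):
    """Cov(X, Y) when the pairs (xs[k], ys[k]) are equally likely."""
    mx, my = mean(xs), mean(ys)
    return [[sum((x[i] - mx[i]) * (y[j] - my[j]) for x, y in zip(xs, ys)) / len(xs)
             for j in range(len(my))] for i in range(len(mx))]

def matmul(p, q):
    return [[sum(a * c for a, c in zip(row, col)) for col in zip(*q)] for row in p]

def transpose(m):
    return [list(row) for row in zip(*m)]

xs = [(1, 2), (3, 4), (5, 9)]      # outcomes of X in R^2 (r = 2)
A = [[1, 0], [1, 1], [2, -1]]      # s x r with s = 3
b = [10, 0, -1]                    # constant vector in R^3

# Apply Y = A X + b to every outcome.
ys = [[sum(a * c for a, c in zip(row, x)) + bi for row, bi in zip(A, b)]
      for x in xs]

mu_x = mean(xs)
mu_y_formula = [sum(a * m for a, m in zip(row, mu_x)) + bi
                for row, bi in zip(A, b)]                        # A mu_X + b
sigma_yy_formula = matmul(matmul(A, cov(xs, xs)), transpose(A))  # A Sigma_XX A^T

print(mean(ys) == mu_y_formula)
print(cov(ys, ys) == sigma_yy_formula)
```

The shift b moves the mean but drops out of \Sigma_{YY}, because it cancels in Y-\mu_Y.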

4- X,\ Y\in \mathbb {R}^{r \times 1} and Z\in \mathbb {R}^{s \times 1}:

\text {Cov}(X+Y,Z)=\Sigma_{XZ}+\Sigma_{YZ}\\ \ \\ \text {Cov}(X+Y,X+Y)=\Sigma_{XX}+\Sigma_{XY}+\Sigma_{YX}+\Sigma_{YY}
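Additivity of the covariance in its first argument, Cov(X+Y,Z)=Cov(X,Z)+Cov(Y,Z), and the four-term expansion of Cov(X+Y,X+Y) can be checked numerically. The sketch assumes three equally likely joint outcomes of hypothetical vectors X, Y in \mathbb{R}^2 and Z in \mathbb{R}^3; helper names are illustrative.

```python
from fractions import Fraction

def mean(vs):
    """E[V] for equally likely outcomes vs."""
    return [Fraction(sum(v[i] for v in vs), len(vs)) for i in range(len(vs[0]))]

def cov(xs, ys):
    """Cov(X, Y) when the pairs (xs[k], ys[k]) are equally likely."""
    mx, my = mean(xs), mean(ys)
    return [[sum((x[i] - mx[i]) * (y[j] - my[j]) for x, y in zip(xs, ys)) / len(xs)
             for j in range(len(my))] for i in range(len(mx))]

def add(*ms):
    """Entrywise sum of equally sized matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*ms)]

# Hypothetical joint outcomes, each triple (x, y, z) with probability 1/3.
xs = [(1, 2), (3, 4), (5, 9)]
ys = [(2, 0), (1, 1), (0, 5)]
zs = [(0, 1, 1), (1, 0, 2), (2, 2, 0)]
x_plus_y = [tuple(a + c for a, c in zip(x, y)) for x, y in zip(xs, ys)]

lhs1 = cov(x_plus_y, zs)
rhs1 = add(cov(xs, zs), cov(ys, zs))            # Sigma_XZ + Sigma_YZ

lhs2 = cov(x_plus_y, x_plus_y)
rhs2 = add(cov(xs, xs), cov(xs, ys), cov(ys, xs), cov(ys, ys))

print(lhs1 == rhs1, lhs2 == rhs2)
```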

5- X \in \mathbb {R}^{r \times 1}, Y\in \mathbb {R}^{s \times 1}, and constant matrices A\in \mathbb {R}^{p \times r} and B\in \mathbb {R}^{q \times s}:

\text {Cov}(AX,BY)=A\Sigma_{XY}B^\text T\ \in \mathbb {R}^{p \times q}

For the proof, expand \text E[(AX-\text E [AX])(BY-\text E [BY])^\text T] and use \text E[AX]=A\mu_X, \text E[BY]=B\mu_Y.

In the special case Y=X and B=A:

\text {Var}(AX)=\text {Cov}(AX,AX)=A\Sigma_{XX}A^\text T\ \in \mathbb {R}^{p \times p}
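A numerical check of the identity for Cov(AX,BY) and of its special case Var(AX) might look as follows. It assumes three equally likely joint outcomes and hypothetical matrices `A` (1 x 2) and `B` (2 x 3), shaped so that AX and BY are defined; none of the names come from a library.

```python
from fractions import Fraction

def mean(vs):
    """E[V] for equally likely outcomes vs."""
    return [Fraction(sum(v[i] for v in vs), len(vs)) for i in range(len(vs[0]))]

def cov(xs, ys):
    """Cov(X, Y) when the pairs (xs[k], ys[k]) are equally likely."""
    mx, my = mean(xs), mean(ys)
    return [[sum((x[i] - mx[i]) * (y[j] - my[j]) for x, y in zip(xs, ys)) / len(xs)
             for j in range(len(my))] for i in range(len(mx))]

def matmul(p, q):
    return [[sum(a * c for a, c in zip(row, col)) for col in zip(*q)] for row in p]

def transpose(m):
    return [list(row) for row in zip(*m)]

def apply(m, vs):
    """Map each outcome v to the matrix-vector product m v."""
    return [[sum(a * c for a, c in zip(row, v)) for row in m] for v in vs]

xs = [(1, 2), (3, 4), (5, 9)]             # X in R^2
ys = [(0, 1, 1), (1, 0, 2), (2, 2, 0)]    # Y in R^3
A = [[1, -1]]                             # 1 x 2
B = [[1, 0, 2], [0, 3, 1]]                # 2 x 3

lhs = cov(apply(A, xs), apply(B, ys))                    # Cov(AX, BY), 1 x 2
rhs = matmul(matmul(A, cov(xs, ys)), transpose(B))       # A Sigma_XY B^T

var_lhs = cov(apply(A, xs), apply(A, xs))                # Var(AX), 1 x 1
var_rhs = matmul(matmul(A, cov(xs, xs)), transpose(A))   # A Sigma_XX A^T

print(lhs == rhs, var_lhs == var_rhs)
```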

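6- \Sigma_{XX} is positive semi-definite. Proof: show that \forall u\in \mathbb{R}^{r\times 1},\ u^\text T\Sigma_{XX}u \ge 0. Use 5 with A=u^\text T, which gives u^\text T\Sigma_{XX}u=\text{Var}(u^\text TX)=\text E\big[(u^\text T(X-\mu_X))^2\big]\ge 0.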
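The quadratic form u^\text T\Sigma_{XX}u is the variance of the scalar u^\text TX, which the sketch below checks for a few directions u. It assumes toy distributions that are uniform on three hypothetical points; the second one is degenerate (X_2=2X_1) and shows why the statement is "semi"-definite rather than definite. Helper names are illustrative.

```python
from fractions import Fraction

def mean(vs):
    """E[V] for equally likely outcomes vs."""
    return [Fraction(sum(v[i] for v in vs), len(vs)) for i in range(len(vs[0]))]

def cov(xs, ys):
    """Cov(X, Y) when the pairs (xs[k], ys[k]) are equally likely."""
    mx, my = mean(xs), mean(ys)
    return [[sum((x[i] - mx[i]) * (y[j] - my[j]) for x, y in zip(xs, ys)) / len(xs)
             for j in range(len(my))] for i in range(len(mx))]

def quad_form(u, m):
    """u^T m u for a square matrix m."""
    return sum(u[i] * m[i][j] * u[j] for i in range(len(u)) for j in range(len(u)))

def scalar_var(u, vs):
    """Var(u^T X) for equally likely outcomes vs."""
    proj = [sum(a * c for a, c in zip(u, v)) for v in vs]
    m = Fraction(sum(proj), len(proj))
    return sum((p - m) ** 2 for p in proj) / len(proj)

xs = [(1, 2), (3, 4), (5, 9)]
sigma = cov(xs, xs)
directions = [(1, 0), (0, 1), (1, -1), (-7, 4), (3, 2)]
checks = [(quad_form(u, sigma), scalar_var(u, xs)) for u in directions]
print(all(q == v and q >= 0 for q, v in checks))

# Degenerate case: X_2 = 2 X_1, so u = (2, -1) gives u^T X = 0 with zero variance.
degenerate = [(1, 2), (2, 4), (3, 6)]
zero_direction = quad_form((2, -1), cov(degenerate, degenerate))
print(zero_direction)
```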