Houjun Liu

Gaussian distribution

\begin{equation} \mathcal{N}(x|\mu, \Sigma) = \qty(2\pi)^{-\frac{n}{2}} |\Sigma|^{-\frac{1}{2}} \exp \qty(-\frac{1}{2} \qty(x-\mu)^{\top} \Sigma^{-1}(x-\mu)) \end{equation}

where \(\Sigma\) is positive definite (so that \(\Sigma^{-1}\) exists and \(|\Sigma| > 0\), which the density requires)

conditioning Gaussian distributions

For jointly Gaussian random variables \(a, b\), we obtain:

\begin{align} \mqty[a \\ b] \sim \mathcal{N} \qty(\mqty[\mu_{a}\\ \mu_{b}], \mqty(A & C \\ C^{\top} & B)) \end{align}

meaning each marginal is itself Gaussian:

\begin{align} a \sim \mathcal{N}(\mu_{a}, A) \\ b \sim \mathcal{N}(\mu_{b}, B) \\ \end{align}

Conditioning also stays Gaussian; for \(a \mid b\):

\begin{align} \mu_{a|b} &= \mu_a + CB^{-1}\qty(b - \mu_{b}) \\ \Sigma_{a|b} &= A - CB^{-1}C^{\top} \end{align}
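As a sanity check, the conditioning formulas can be evaluated numerically. A minimal sketch with made-up numbers, where \(a\) and \(b\) are scalars so the blocks \(A\), \(B\), \(C\) of the joint covariance are just numbers:

```python
# Made-up joint Gaussian: scalar a and b, so A, B, C are scalars.
mu_a, mu_b = 1.0, 2.0
A, B, C = 2.0, 3.0, 1.0      # Var(a), Var(b), Cov(a, b)

b_obs = 4.0                  # an observed value of b

# Conditional mean and covariance of a given b, per the formulas above
mu_a_given_b = mu_a + C * (1.0 / B) * (b_obs - mu_b)
Sigma_a_given_b = A - C * (1.0 / B) * C
```

Note that the conditional covariance \(A - CB^{-1}C^{\top}\) does not depend on the observed value of \(b\), only on the prior covariances.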

standard normal density function

This is the PDF of the standard normal, in terms of which any Gaussian distribution can be expressed after standardization:

\begin{equation} \phi(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^{2}}{2}} \end{equation}

The CDF of the standard normal is written \(\Phi\); \(\phi\) itself is the density, not the CDF.

The standard normal density is symmetric about zero, \(\phi(-a) = \phi(a)\), which gives the corresponding CDF identity:

\begin{equation} \Phi(-a) = 1- \Phi(a) \end{equation}

Gaussian distribution

constituents

  • \(\mu\) the mean
  • \(\sigma^{2}\) the variance (\(\sigma\) is the standard deviation)

requirements

\begin{equation} X \sim N(\mu, \sigma^{2}) \end{equation}

Its PDF is:

\begin{equation} \mathcal{N}(x \mid \mu, \sigma^{2}) = \frac{1}{\sigma\sqrt{2\pi}} e^{ \frac{-(x-\mu)^{2}}{2 \sigma^{2}}} \end{equation}

equivalently, \(\mathcal{N}(x \mid \mu, \sigma^{2}) = \frac{1}{\sigma} \phi\qty(\frac{x-\mu}{\sigma})\), where \(\phi\) is the standard normal density function

Its CDF:

\begin{equation} F(x) = \Phi \qty( \frac{x-\mu}{\sigma}) \end{equation}

\(\Phi\) has no closed form in terms of elementary functions, so we leave it as a special function and evaluate it numerically or from tables.
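One standard way to evaluate \(\Phi\) numerically is via the error function in Python's standard library, using the identity \(\Phi(z) = \frac{1}{2}\qty(1 + \operatorname{erf}\qty(z/\sqrt{2}))\):

```python
import math

def Phi(z):
    # Standard normal CDF via the error function:
    # Phi(z) = (1 + erf(z / sqrt(2))) / 2
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

For example, `Phi(0)` is exactly 0.5, and the symmetry identity `Phi(-a) == 1 - Phi(a)` holds.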

And its expectations:

\(E(X) = \mu\)

\(Var(X) = \sigma^{2}\)

additional information

linear transformations on Gaussian

For some:

\begin{equation} Y = aX + b \end{equation}

where \(X \sim \mathcal{N}\)

We will end up with another normal \(Y \sim \mathcal{N}\) such that:

  • mean: \(a\mu + b\)
  • variance: \(a^{2}\sigma^{2}\)
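A quick simulation sanity check of these two facts; the parameters here are arbitrary, chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, a, b = 2.0, 3.0, -1.5, 4.0   # arbitrary illustrative parameters

X = rng.normal(mu, sigma, size=200_000)
Y = a * X + b

# Sample mean and variance should be close to a*mu + b and a^2 * sigma^2
sample_mean, sample_var = Y.mean(), Y.var()
```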

standard normal

The standard normal is:

\begin{equation} Z \sim \mathcal{N}(0,1) \end{equation}

mean 0, variance 1. You can transform any normal random variable into a standard normal via the following linear transform:

transformation into standard normal

\begin{equation} X \sim \mathcal{N}(\mu, \sigma^{2}) \end{equation}

and we can shift it into a standard normal with:

\begin{equation} Z = \frac{X-\mu}{\sigma} \end{equation}

therefore, we can derive the CDF of any normal distribution by shifting it back onto the standard normal:

\begin{equation} P(X<x) = P\qty(\frac{X-\mu}{\sigma} < \frac{x-\mu}{\sigma}) = P\qty(Z< \frac{x-\mu}{\sigma}) = \Phi\qty(\frac{x-\mu}{\sigma}) \end{equation}
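A worked example under made-up parameters (\(\mu = 100\), \(\sigma = 15\), querying \(P(X < 130)\)), using the erf-based standard normal CDF:

```python
import math

def Phi(z):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical X ~ N(100, 15^2); P(X < 130) = Phi((130 - 100) / 15) = Phi(2)
mu, sigma = 100.0, 15.0
p = Phi((130.0 - mu) / sigma)
```

Here \(\Phi(2) \approx 0.9772\), so \(P(X < 130) \approx 97.7\%\).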

normal maximizes entropy

among all distributions with a given mean and variance, the normal has the maximum (differential) entropy; no other distribution conveys as much uncertainty with just those two parameters

approximation of binomial distribution with normal distribution

You can use a normal distribution to approximate a binomial distribution when \(n\) is large. However, because the binomial is discrete, be aware of the continuity correction: evaluate at \(k \pm 0.5\) rather than \(k\).
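A sketch of the approximation with the continuity correction; the values of \(n\), \(p\), \(k\) here are chosen only for illustration. To approximate \(P(X \le k)\) for \(X \sim \text{Bin}(n, p)\), evaluate \(\Phi\) at \(k + 0.5\):

```python
import math

def Phi(z):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, p, k = 100, 0.5, 55   # illustrative values

# Exact binomial CDF: P(X <= k)
exact = sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Normal approximation with continuity correction:
# X is roughly N(np, np(1-p)), so P(X <= k) ~= Phi((k + 0.5 - np) / sqrt(np(1-p)))
approx = Phi((k + 0.5 - n * p) / math.sqrt(n * p * (1 - p)))
```

For these values the exact and approximate probabilities agree to about two decimal places; dropping the \(+0.5\) noticeably worsens the fit.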

adding Gaussian distributions

for independent \(X \sim \mathcal{N}(\mu_{1}, \sigma_{1}^{2})\) and \(Y \sim \mathcal{N}(\mu_{2}, \sigma_{2}^{2})\):

\begin{equation} X+Y \sim \mathcal{N}(\mu_{1}+\mu_{2}, \sigma_{1}^{2}+\sigma_{2}^{2}) \end{equation}
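A simulation sanity check with arbitrary parameters: the sample mean and variance of the sum should match \(\mu_{1}+\mu_{2}\) and \(\sigma_{1}^{2}+\sigma_{2}^{2}\).

```python
import numpy as np

rng = np.random.default_rng(1)
mu1, sigma1, mu2, sigma2 = 1.0, 2.0, 3.0, 0.5   # arbitrary parameters

X = rng.normal(mu1, sigma1, size=200_000)
Y = rng.normal(mu2, sigma2, size=200_000)
S = X + Y

# S should look like N(mu1 + mu2, sigma1^2 + sigma2^2)
sum_mean, sum_var = S.mean(), S.var()
```

Note that the variances add but the standard deviations do not: \(\sigma_{X+Y} = \sqrt{\sigma_{1}^{2}+\sigma_{2}^{2}}\), not \(\sigma_{1}+\sigma_{2}\).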