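The Exponential Family is a class of distributions whose density can be written as a function of the data alone, multiplied by the exponential of a term that is linear in a statistic of the data.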
constituents
- \(y\) the data
- \(\eta\) the natural parameter — vector or scalar
- \(T\qty(y)\) the “sufficient statistic” (this is usually just \(y\)) — vector or scalar
- \(b\qty(y)\) the base measure — scalar
- \(a\qty(\eta)\) the log-partition function — scalar
requirements
A class of distributions is in the Exponential Family if it can be written as:
\begin{align} P\qty(y \mid \eta) &= b\qty(y) \exp \qty(\eta^{\top}T\qty(y)-a\qty(\eta)) \\ &= \frac{b\qty(y) \exp \qty(\eta^{\top} T\qty(y))}{e^{a\qty(\eta)}} \end{align}
To show that a particular family of distributions is an Exponential Family, we fix a choice of \(b, T, a\) and show that varying \(\eta\) recovers every member of that family.
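A minimal sketch of the definition above, assuming a scalar \(\eta\) and a finite support so that the normalizer can be computed by direct summation. The helper names `log_partition` and `density`, and the two-point support with \(b(y) = 1\), \(T(y) = y\), are illustrative assumptions rather than anything fixed by these notes. It shows that \(a(\eta)\) is exactly the constant that makes the density sum to one.

```python
import math

def log_partition(eta, support, b, T):
    """a(eta) = log sum_y b(y) * exp(eta * T(y)), for a finite support and scalar eta."""
    return math.log(sum(b(y) * math.exp(eta * T(y)) for y in support))

def density(y, eta, support, b, T):
    """P(y | eta) = b(y) * exp(eta * T(y) - a(eta))."""
    a = log_partition(eta, support, b, T)
    return b(y) * math.exp(eta * T(y) - a)

# Hypothetical illustration: two-point support, b(y) = 1, T(y) = y.
support = [0, 1]
b = lambda y: 1.0
T = lambda y: y

eta = 0.7
probs = [density(y, eta, support, b, T) for y in support]
total = sum(probs)                            # normalizes to 1 by construction
a_val = log_partition(eta, support, b, T)     # equals log(1 + e^eta) for this choice
```

For this particular choice of \(b\) and \(T\), the computed normalizer matches \(\log(1 + e^{\eta})\), which is the log-partition function of the Bernoulli example later in these notes.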
additional information
properties of exponential family
- the log-likelihood is concave in \(\eta\), so any local maximum is a global maximum (the MLE); equivalently, the negative log-likelihood is convex
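- \(\mathbb{E}[y \mid \eta] = \pdv{\eta} a\qty(\eta)\) (when \(T\qty(y) = y\); in general this is the mean of \(T\qty(y)\))
- \(\text{Var}[y \mid \eta] = \pdv[2]{\eta} a\qty(\eta)\) (again for \(T\qty(y) = y\))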
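The two derivative identities can be checked numerically. The sketch below assumes the Bernoulli log-partition function \(a(\eta) = \log(1 + e^{\eta})\) from the example later in these notes, and uses central finite differences (the helpers `d1` and `d2` and the step sizes are illustrative choices, not part of the notes). The results should be close to the Bernoulli mean \(\phi\) and variance \(\phi(1-\phi)\).

```python
import math

def a(eta):
    # Bernoulli log-partition function: a(eta) = log(1 + e^eta)
    return math.log1p(math.exp(eta))

def d1(f, x, h=1e-5):
    # Central finite-difference estimate of the first derivative
    return (f(x + h) - f(x - h)) / (2 * h)

def d2(f, x, h=1e-4):
    # Central finite-difference estimate of the second derivative
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

eta = 0.3
phi = 1.0 / (1.0 + math.exp(-eta))  # Bernoulli mean implied by eta

mean_from_a = d1(a, eta)  # approximately phi
var_from_a = d2(a, eta)   # approximately phi * (1 - phi)
```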
motivation
“family”
What is a family of distributions? We can write a set
\begin{equation} S = \qty {\text{Bern}\qty(j) \mid j \in [0.0, 1.0]} \end{equation}
which is a family of Bernoulli distributions. You can also define a family of Gaussians with some fixed variance \(\sigma^{2}\), such that:
\begin{equation} S = \qty {\mathcal{N}\qty(i, \sigma^{2}) \mid i \in \mathbb{R}} \end{equation}
example
Bernoulli distribution is in the exponential family
We want to show that the Bernoulli distribution, with probability mass function
\begin{equation} p\qty(y\mid \phi) = \phi^{y} \qty(1-\phi)^{1-y} \end{equation}
is in the exponential family. Rewrite the mass function as an exponential:
\begin{align} \phi^{y}\qty(1-\phi)^{1-y} &= \exp \log \qty(\phi^{y} \qty(1-\phi)^{1-y}) \\ &= \exp \qty(y \log \phi + \qty(1-y) \log\qty(1-\phi)) \\ &= \exp \qty(\qty(\log \frac{\phi}{1-\phi}) y + \log \qty(1-\phi)) \end{align}
So we can write:
\begin{equation} \eta = \log \frac{\phi}{1-\phi} \end{equation}
\begin{equation} \phi = \frac{1}{1+e^{-\eta}} \end{equation}
And we can write:
\begin{equation} \begin{cases} a\qty(\eta) = -\log \qty(1-\phi) = \log \qty(1+e^{\eta}) \\ T\qty(y) = y\\ b\qty(y) = 1 \end{cases} \end{equation}
Hence, we can conclude that \(\text{ExpFam}\qty(\eta) = \text{Bern}\qty(\phi)\), where \(\phi = \frac{1}{1+e^{-\eta}}\).
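As a sanity check on the derivation, the sketch below compares the usual Bernoulli mass function with the exponential-family form using \(b(y) = 1\), \(T(y) = y\), and \(a(\eta) = \log(1 + e^{\eta})\). The function names and the test values of \(\phi\) are illustrative assumptions; the two forms should agree up to floating-point error.

```python
import math

def sigmoid(eta):
    # phi = 1 / (1 + e^{-eta}): maps the natural parameter back to phi
    return 1.0 / (1.0 + math.exp(-eta))

def logit(phi):
    # eta = log(phi / (1 - phi)): the natural parameter of Bern(phi)
    return math.log(phi / (1.0 - phi))

def bernoulli_pmf(y, phi):
    # Standard form: phi^y * (1 - phi)^(1 - y)
    return phi**y * (1.0 - phi) ** (1 - y)

def exp_family_pmf(y, eta):
    # Exponential-family form with b(y) = 1, T(y) = y, a(eta) = log(1 + e^eta)
    a = math.log1p(math.exp(eta))
    return 1.0 * math.exp(eta * y - a)

# Largest disagreement between the two forms over a few arbitrary phi values
max_gap = max(
    abs(bernoulli_pmf(y, phi) - exp_family_pmf(y, logit(phi)))
    for phi in (0.1, 0.5, 0.9)
    for y in (0, 1)
)
```

The pair `logit` and `sigmoid` also makes the inverse relationship between \(\eta\) and \(\phi\) concrete: applying one after the other returns the starting value.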
Gaussian distribution
You can try this yourself for a Gaussian with fixed \(\sigma=1\). Expand the quadratic \(\qty(y-\mu)^{2} = y^{2} - 2\mu y + \mu^{2}\) in the exponent and pattern match:
\begin{equation} \begin{cases} b\qty(y) = \frac{1}{\sqrt{2\pi}} \exp \qty(-\frac{1}{2}y^{2}) \\ \eta = \mu \\ T\qty(y) = y \\ a\qty(\eta) = \frac{1}{2} \eta^{2} = \frac{1}{2} \mu^{2} \end{cases} \end{equation}
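The same kind of check works for the unit-variance Gaussian. This sketch, with illustrative function names and arbitrarily chosen test points, compares the standard density \(\frac{1}{\sqrt{2\pi}} \exp\qty(-\frac{1}{2}\qty(y-\mu)^{2})\) against \(b(y) \exp\qty(\eta y - a\qty(\eta))\) with \(\eta = \mu\) and \(a\qty(\eta) = \frac{1}{2}\eta^{2}\).

```python
import math

def normal_pdf(y, mu):
    # Standard unit-variance Gaussian density
    return math.exp(-0.5 * (y - mu) ** 2) / math.sqrt(2 * math.pi)

def exp_family_pdf(y, eta):
    # b(y) = exp(-y^2 / 2) / sqrt(2 pi), T(y) = y, a(eta) = eta^2 / 2
    b = math.exp(-0.5 * y * y) / math.sqrt(2 * math.pi)
    a = 0.5 * eta * eta
    return b * math.exp(eta * y - a)

# Largest disagreement over a small grid of (mu, y) pairs, using eta = mu
max_gap = max(
    abs(normal_pdf(y, mu) - exp_family_pdf(y, mu))
    for mu in (-1.5, 0.0, 2.0)
    for y in (-2.0, 0.0, 0.5, 3.0)
)
```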