motivation
Consider generic maximum likelihood estimation.
- parametric distribution estimation: suppose you have a family of densities \(p_{x}\qty(y)\), with parameter \(x\)
- we take \(p_{x}\qty(y) = 0\) for invalid values of \(x\)
maximum likelihood estimation: choose \(x\) to maximize \(p_{x}\qty(y)\) given the observed data \(y\).
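As a concrete instance (a minimal numpy sketch on synthetic data; all names here are hypothetical): for a Gaussian family with unknown mean \(x\) and known variance, the ML estimate is the sample mean.

```python
import numpy as np

# Sketch: ML estimation for a Gaussian family with unknown mean x.
# The negative log-likelihood of IID samples y under N(x, 1) is
#   sum_i (y_i - x)^2 / 2  (up to an additive constant),
# which is minimized at the sample mean.
rng = np.random.default_rng(0)
y = rng.normal(loc=3.0, scale=1.0, size=1000)  # synthetic dataset

def neg_log_likelihood(x, y):
    return 0.5 * np.sum((y - x) ** 2)  # up to an additive constant

x_ml = y.mean()  # closed-form maximizer of the likelihood

# The sample mean beats any perturbed candidate:
for dx in (-0.5, 0.1, 2.0):
    assert neg_log_likelihood(x_ml, y) < neg_log_likelihood(x_ml + dx, y)
```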
linear measurement with IID noise
Suppose you have some kind of linear noise model:
\begin{equation} y_{i} = a_{i}^{T}x + v_{i} \end{equation}
where \(v_{i}\) is IID noise and the \(a_{i}^{T}\) are the known measurement vectors. The density of \(y\) is then:
\begin{equation} p_{x}\qty(y) = \prod_{i=1}^{m} p\qty(y_{i} - a_{i}^{T}x) \end{equation}
for some model \(p\) of the noise \(v\). The maximum likelihood estimate is therefore the solution of:
\begin{align} \max_{x}\quad & \sum_{i=1}^{m} \log p\qty(y_{i} - a_{i}^{T}x) \end{align}
with observed \(y\) and known measurement vectors \(a_{i}\).
some noise models
- Gaussian noise: ML estimate becomes least-squares
- Laplacian noise: ML estimate is the l1-norm (least absolute deviations) solution
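A small numpy sketch of the Gaussian case (synthetic data, hypothetical names): maximizing the likelihood over \(x\) is exactly ordinary least squares.

```python
import numpy as np

# Sketch: linear measurements y_i = a_i^T x + v_i with IID Gaussian noise.
# The ML estimate maximizes sum_i log p(y_i - a_i^T x), which for Gaussian
# noise is equivalent to minimizing ||A x - y||_2^2: ordinary least squares.
rng = np.random.default_rng(1)
m, n = 100, 3
A = rng.normal(size=(m, n))                 # rows are the a_i^T
x_true = np.array([1.0, -2.0, 0.5])
y = A @ x_true + 0.1 * rng.normal(size=m)   # Gaussian measurement noise

x_ml, *_ = np.linalg.lstsq(A, y, rcond=None)  # least squares = ML estimate

def log_likelihood(x):
    # Gaussian log-density of the residuals, up to additive constants
    return -0.5 * np.sum((y - A @ x) ** 2)

# Perturbing the least-squares solution can only lower the likelihood:
for d in rng.normal(size=(5, n)):
    assert log_likelihood(x_ml) >= log_likelihood(x_ml + 0.1 * d)
```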
logistic regression
Random variable \(y \in \qty{0,1}\) with distribution:
\begin{equation} \mathrm{prob}\qty(y = 1) = \frac{\exp \qty(a^{T}u + b)}{1 + \exp \qty(a^{T}u + b)} \end{equation}
where \(u\) is the explanatory variable and \(a, b\) are the parameters to estimate.
The log-likelihood is concave in \(\qty(a, b)\), so maximum likelihood estimation is again a convex optimization problem.
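A minimal numpy sketch of the logistic ML problem (synthetic data; step size and iteration count are illustrative assumptions): because the log-likelihood is concave, plain gradient ascent on \(\qty(a, b)\) makes steady progress toward the maximizer.

```python
import numpy as np

# Sketch: logistic regression ML by gradient ascent on the concave
# log-likelihood  L(a,b) = sum_i [ y_i (a^T u_i + b) - log(1 + exp(a^T u_i + b)) ].
rng = np.random.default_rng(2)
m, n = 200, 2
U = rng.normal(size=(m, n))                   # explanatory variables u_i
a_true, b_true = np.array([1.5, -1.0]), 0.5
p = 1.0 / (1.0 + np.exp(-(U @ a_true + b_true)))
y = (rng.uniform(size=m) < p).astype(float)   # Bernoulli labels

def log_likelihood(a, b):
    z = U @ a + b
    return np.sum(y * z - np.logaddexp(0.0, z))  # log(1 + e^z), computed stably

a, b = np.zeros(n), 0.0
for _ in range(500):                          # gradient ascent; concavity ensures
    z = U @ a + b                             # no spurious local maxima
    r = y - 1.0 / (1.0 + np.exp(-z))          # residual y_i - prob(y_i = 1)
    a += 0.01 * (U.T @ r)
    b += 0.01 * np.sum(r)

# The objective strictly improves over the zero initialization:
assert log_likelihood(a, b) > log_likelihood(np.zeros(n), 0.0)
```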
