Words to Concepts
Last edited: January 1, 2026
Chen’s Talk.
constructive convexity verification
- start with a function \(f\) given as an expression
- build a parse tree for the expression (leaves are variables / constants, nodes are functions of child expressions)
- apply general composition rules that preserve convexity
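Greedy parses may fail, as in the case of logsumexp: parsed as \(\log\) of a sum of \(\exp\) terms, the outer \(\log\) is concave, so the composition rules cannot certify convexity even though the function is convex.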
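The three steps above can be sketched as a tiny curvature checker. This is only an illustration under stated assumptions: the `Node` class, the atom table, and the curvature labels are hypothetical names invented here, and only a handful of rules are implemented (sum, composition with an increasing convex or concave atom, and logsumexp as a built-in atom).

```python
from dataclasses import dataclass
from typing import Tuple

# Curvature labels used by this sketch (hypothetical names).
CONSTANT, AFFINE, CONVEX, CONCAVE, UNKNOWN = (
    "constant", "affine", "convex", "concave", "unknown")


@dataclass(frozen=True)
class Node:
    """Parse-tree node: leaves are "var"/"const", inner nodes are functions."""
    op: str
    children: Tuple["Node", ...] = ()


# Curvature of the scalar atoms; both are increasing on their domains.
ATOMS = {"exp": CONVEX, "log": CONCAVE}


def is_convex(c):
    return c in (CONSTANT, AFFINE, CONVEX)


def is_concave(c):
    return c in (CONSTANT, AFFINE, CONCAVE)


def curvature(node):
    """Certify curvature bottom-up; UNKNOWN means 'no rule applies'."""
    if node.op == "var":
        return AFFINE
    if node.op == "const":
        return CONSTANT
    kids = [curvature(c) for c in node.children]
    if node.op == "sum":
        if all(k == CONSTANT for k in kids):
            return CONSTANT
        if all(k in (CONSTANT, AFFINE) for k in kids):
            return AFFINE
        if all(is_convex(k) for k in kids):
            return CONVEX
        if all(is_concave(k) for k in kids):
            return CONCAVE
        return UNKNOWN
    if node.op == "logsumexp":
        # Treated as one atom: convex and nondecreasing in each argument.
        return CONVEX if all(is_convex(k) for k in kids) else UNKNOWN
    atom_curv = ATOMS[node.op]
    (k,) = kids
    if k == CONSTANT:
        return CONSTANT
    if k == AFFINE:
        return atom_curv  # composition with an affine map keeps curvature
    if atom_curv == CONVEX and is_convex(k):
        return CONVEX     # increasing convex of convex is convex
    if atom_curv == CONCAVE and is_concave(k):
        return CONCAVE    # increasing concave of concave is concave
    return UNKNOWN


x, y = Node("var"), Node("var")
# log(exp(x) + exp(y)) parsed operation by operation.
naive = Node("log", (Node("sum", (Node("exp", (x,)), Node("exp", (y,)))),))
# The same function supplied as a single atom.
atom = Node("logsumexp", (x, y))
```

Here `curvature(naive)` comes back as unknown while `curvature(atom)` is certified convex, which is the usual reason such functions get registered as atoms rather than parsed piece by piece.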
Euclidean Geometry Crash Course
line
All points of the form \(x = \theta x_{1} + \qty(1-\theta) x_{2}\), with \(\theta \in \mathbb{R}\), form the “line through \(x_1\), \(x_2\)”.
affine set
A set \(G\) is affine if, for any two points \(x_1, x_2 \in G\), every point on the line through \(x_1\) and \(x_2\) also lies in \(G\). For instance, the solution set of a system of linear equations, \(\qty {x \mid A x = b}\), is affine.
convex set
line segment
All points of the form \(x = \theta x_{1} + \qty(1-\theta)x_{2}\), with \(0 \leq \theta \leq 1\).
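A quick numeric sanity check of the definitions above, as a sketch only. The two example sets (a hyperplane \(x_1 + x_2 + x_3 = 1\) and the unit disk) and all function names are assumptions chosen for illustration: the hyperplane contains the whole line through two of its points, while the disk only contains the segment between them.

```python
import math


def combo(x1, x2, theta):
    """Return theta * x1 + (1 - theta) * x2, componentwise."""
    return tuple(theta * a + (1 - theta) * b for a, b in zip(x1, x2))


def in_hyperplane(x):
    # Hypothetical affine set {x | Ax = b} with A = [1 1 1], b = 1.
    return math.isclose(sum(x), 1.0, abs_tol=1e-12)


def in_unit_disk(x):
    # Hypothetical convex (but not affine) set {x | ||x||_2 <= 1} in 2-D.
    return math.hypot(x[0], x[1]) <= 1.0 + 1e-12


# Any real theta stays inside the affine set.
p1, p2 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
line_ok = all(in_hyperplane(combo(p1, p2, t)) for t in (-3.0, 0.25, 2.5))

# Only theta in [0, 1] is guaranteed to stay inside the disk.
q1, q2 = (1.0, 0.0), (0.0, 1.0)
segment_ok = all(in_unit_disk(combo(q1, q2, t / 10)) for t in range(11))
off_segment = in_unit_disk(combo(q1, q2, 2.0))  # the point (2, -1)
```

With these choices `line_ok` and `segment_ok` are true and `off_segment` is false, since \(\theta = 2\) leaves the segment and lands outside the disk.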
expectation maximization
Sorta like “distribution-based k-means clustering”. Guarantees convergence (i.e. the likelihood never decreases from one iteration to the next, so it converges, though typically only to a local maximum rather than the global one).
constituents
requirements
Two steps:
e-step
“guess the value of \(z^{(i)}\); soft guesses of cluster assignments”
\begin{align} w_{j}^{(i)} &= p\qty(z^{(i)} = j | x^{(i)} ; \phi, \mu, \Sigma) \\ &= \frac{p\qty(x^{(i)} | z^{(i)}=j) p\qty(z^{(i)}=j)}{\sum_{l=1}^{k}p\qty(x^{(i)} | z^{(i)}=l) p\qty(z^{(i)}=l)} \end{align}
Where we have:
- \(p\qty(x^{(i)} |z^{(i)}=j)\) from the Gaussian distribution, where we have \(\Sigma_{j}\) and \(\mu_{j}\) for the parameters of our Gaussian \(j\).
- \(p\qty(z^{(i)} =j)\) is just \(\phi_{j}\) which we are learning
These weights \(w_{j}^{(i)}\) are how strongly the model believes that \(x^{(i)}\) belongs to each cluster \(j\).
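A minimal sketch of this e-step, assuming a one-dimensional mixture so that each \(\Sigma_{j}\) reduces to a scalar variance. The function names, the data, and the parameter values are all made up for illustration.

```python
import math


def gaussian_pdf(x, mu, var):
    """Density of a 1-D Gaussian with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def e_step(xs, phi, mu, var):
    """Return w[i][j] = p(z_i = j | x_i; phi, mu, var) via Bayes' rule."""
    weights = []
    for x in xs:
        # Numerator terms: p(x | z = j) * p(z = j) for every cluster j.
        joint = [phi[j] * gaussian_pdf(x, mu[j], var[j])
                 for j in range(len(phi))]
        total = sum(joint)  # denominator: sum over all clusters l
        weights.append([p / total for p in joint])
    return weights


# Hypothetical data and current parameter guesses for k = 2 clusters.
xs = [-2.1, -1.9, 0.0, 2.0, 2.2]
w = e_step(xs, phi=[0.5, 0.5], mu=[-2.0, 2.0], var=[1.0, 1.0])
```

Each row of `w` sums to 1. Points near \(-2\) get almost all their weight on the first cluster, and the point at \(0\) is split evenly because the two components are symmetric around it.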
Jensen's Inequality
linear edition
If \(f\) is convex, then for \(x,y \in \text{dom }f\) and \(0 \leq \theta \leq 1\):
\begin{equation} f\qty(\theta x + \qty(1-\theta) y) \leq \theta f\qty(x) + \qty(1-\theta) f\qty(y) \end{equation}
probabilistic extension
Let \(f\) be a convex function (for twice-differentiable \(f\), that is \(f''\qty(x) \geq 0\)); let \(x\) be a random variable. Then, \(f\qty(\mathbb{E}[x]) \leq \mathbb{E}\qty [f\qty(x)]\).
Further, if \(f\) is strictly convex (which holds, for instance, when \(f''\qty(x) > 0\)), then \(\mathbb{E}\qty [f\qty(x)] = f\qty(\mathbb{E}[x])\) holds if and only if \(x = \mathbb{E}[x]\) with probability 1, that is, \(x\) is constant.
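A small numeric check of the probabilistic form, as a sketch. The choice \(f(x) = x^{2}\) and the discrete distributions are assumptions picked for illustration, and `expectation` is a hypothetical helper.

```python
def expectation(values, probs, f=lambda v: v):
    """E[f(x)] for a discrete random variable x."""
    return sum(p * f(v) for v, p in zip(values, probs))


def f(x):
    return x * x  # strictly convex, since f''(x) = 2 > 0


# Non-constant x: the inequality is strict.
values, probs = [-1.0, 0.0, 3.0], [0.25, 0.5, 0.25]
lhs = f(expectation(values, probs))   # f(E[x]) = 0.5 ** 2 = 0.25
rhs = expectation(values, probs, f)   # E[f(x)] = 0.25 * 1 + 0.25 * 9 = 2.5

# Constant x: both sides agree.
const_lhs = f(expectation([2.0, 2.0], [0.5, 0.5]))
const_rhs = expectation([2.0, 2.0], [0.5, 0.5], f)
```

The gap `rhs - lhs` here equals the variance of \(x\), which is one way to see why equality forces \(x\) to be constant for this particular \(f\).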
