SU-CS161 OCT302025
Last edited: October 10, 2025Key Sequence
Notation
New Concepts
Important Results / Claims
Questions
Interesting Factoids
SU-CS161 Things to Review
Last edited: October 10, 2025log laws, exponent laws and general non-discrete math stuff
distributions and infinite series, math53 content
combinations
- \(\mqty(n \\ k) = \mqty(n-1 \\ k-1) + \mqty(n-1 \\ k)\)
- \(\mqty(n \\k) = \mqty(n \\ n-k)\)
binomial theorem: \(\qty(a+b)^{n} = \sum_{k=0}^{n} \mqty(n \\k)a^{k} b^{n-k}\)
SU-CS229 Distribution Sheet
Last edited: October 10, 2025Here’s a bunch of exponential family distributions. Recall:
\begin{equation} p\qty(x;\eta) = b\qty(x) \exp \qty(\eta^{T}T\qty(x) - a\qty(\eta)) \end{equation}
normal, berunouli, posisson, binomial, negative binomial, geometric, chi-squared, exponential are all in
normal distribution
\(\mu\) the mean, \(\sigma\) the variance
\begin{equation} p\qty(x;\mu, \Sigma) = \frac{1}{\qty(2\pi)^{\frac{|x|}{2}} \text{det}\qty(\Sigma)^{\frac{1}{2}}} \exp \qty(-\frac{1}{2} \qty(x-\mu)^{T}\Sigma^{-1}\qty(x-\mu)) \end{equation}
\begin{equation} p\qty(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^{2}}} \exp \qty({ \frac{-(x-u)^{2}}{2 \sigma^{2}}}) \end{equation}
\begin{equation} \mathbb{E}[x] = \mu \end{equation}
\begin{equation} \text{Var}\qty [x] = \sigma^{2} \end{equation}
This is exponential family distribution. For \(\sigma^{2} = 1\):
SU-CS229 OCT292025
Last edited: October 10, 2025xavier initialization
Last edited: October 10, 2025An neural network initialization scheme that tries to avoid Vanishing Gradients.
Consider \(Wx\) step in a neural network:
\begin{equation} o_{i} = \sum_{j=1}^{n_{\text{in}}} w_{ij} x_{j} \end{equation}
The variance of this:
\begin{equation} \text{Var}\qty [o_{i}] = n_{\text{in}} \sigma^{2} v^{2} \end{equation}
