Machine Learning Index
Last edited: October 10, 2025
Course Project
- Deliverables: proposal (300-500 words), milestone (3 pages), final report (5 pages), poster
- Evaluation: technical quality, originality, community
Lectures
Basics + Linear Methods
Regularization
Kernel Methods
Decision Trees and Boosting
Deep Learning
neural network
Neural networks are a non-linear learning architecture that involves a combination of matrix multiplication and entry-wise non-linear operations.
two layers
constituents
Consider a two layer neural network with:
- \(m\) hidden units
- \(d\) dimensional input \(x \in \mathbb{R}^{d}\)
requirements
\begin{align} &\forall j \in \qty{1, \dots, m}\\ &z_{j} = {w_{j}^{(1)}}^{T} x + b_{j}^{(1)}\\ &a_{j} = \text{ReLU}\qty(z_{j}) \\ &a = \qty(a_1, \dots, a_{m})^{T} \in \mathbb{R}^{m} \\ &h_{\theta} \qty(x) = {w^{(2)}}^{T} a + b^{(2)} \end{align}
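The forward pass above can be sketched in a few lines of plain Python. This is only an illustration: the function names `relu` and `forward`, the list-of-rows layout for the first-layer weights, and the example numbers are all assumptions, not part of the notes.

```python
def relu(z):
    """Entry-wise ReLU on a scalar."""
    return max(0.0, z)


def forward(x, W1, b1, w2, b2):
    """Two-layer network: h(x) = w2^T ReLU(W1 x + b1) + b2.

    W1 is a list of m rows (each of length d), b1 and w2 have length m,
    and b2 is a scalar. These shapes are an assumed convention.
    """
    # z_j = w_j^(1)^T x + b_j^(1) for each hidden unit j
    z = [sum(w_ji * x_i for w_ji, x_i in zip(w_j, x)) + b_j
         for w_j, b_j in zip(W1, b1)]
    # a_j = ReLU(z_j)
    a = [relu(z_j) for z_j in z]
    # h(x) = w^(2)^T a + b^(2)
    return sum(w_j * a_j for w_j, a_j in zip(w2, a)) + b2


# Hypothetical example with d = 2 inputs and m = 3 hidden units.
x = [1.0, 2.0]
W1 = [[1.0, 0.0], [0.0, -1.0], [0.5, 0.5]]
b1 = [0.0, 1.0, -1.0]
w2 = [1.0, 2.0, 3.0]
b2 = 0.5
h = forward(x, W1, b1, w2, b2)  # z = (1, -1, 0.5), a = (1, 0, 0.5), h = 3.0
```

The second hidden unit has a negative pre-activation, so ReLU zeroes it out and it contributes nothing to the output.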
sponsorship
stochastic gradient descent
Gradient descent makes a pass over all points to take one gradient step. We can instead approximate the gradient on a minibatch of data. This is the idea behind stochastic gradient descent.
\begin{equation} \theta^{t+1} = \theta^{t} - \eta \nabla_{\theta} L(f_{\theta}(x), y) \end{equation}
This terminates when the change in \(\theta\) between steps becomes small, or when progress halts, for example when the loss begins going up instead.
In SGD, we update the weights by taking a single random sample and moving the weights along the negative gradient of the loss on that sample.
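As a rough sketch of the update rule, here is single-sample SGD fitting a one-dimensional linear model. The squared loss, the model \(f_{\theta}(x) = wx + b\), the step size, the fixed number of steps (used in place of a convergence check), and the name `sgd_linear` are all assumptions made for illustration.

```python
import random


def sgd_linear(data, eta=0.05, epochs=200, seed=0):
    """Fit y ~ w*x + b with single-sample SGD on the loss 0.5*(f(x) - y)^2."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for _ in range(len(data)):
            # Draw one random sample instead of passing over all points.
            x, y = rng.choice(data)
            err = (w * x + b) - y
            # Gradient of 0.5*err^2 is (err*x, err); step against it.
            w -= eta * err * x
            b -= eta * err
    return w, b


# Hypothetical noise-free data generated from y = 2x + 1.
data = [(x, 2.0 * x + 1.0) for x in [-2.0, -1.0, 0.0, 1.0, 2.0]]
w, b = sgd_linear(data)
```

Because the data here is noise-free, every per-sample gradient vanishes at the true parameters, so the iterates settle near \(w = 2\), \(b = 1\). With noisy data, a fixed step size would leave the iterates fluctuating around the minimizer.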
strongly connected components
Strongly connected components expose local communities in a graph.
constituents
directed graph \(G = (V, E)\)
requirements
strongly connected: for all \(v,w \in V\), there is a path from \(v\) to \(w\), and there is a path from \(w\) to \(v\).
We can decompose a graph into strongly connected components: maximal subgraphs that are each strongly connected (i.e., they are the equivalence classes under the relation “\(v\) and \(w\) can each reach the other”).
additional information
Kosaraju’s Algorithm
A way to find strongly connected components in linear time \(O\qty(n+m)\), where \(n = |V|\) and \(m = |E|\).
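A minimal sketch of the two-pass idea follows: one depth-first search on the graph to record finish order, then a second search on the reversed graph in decreasing finish order. The function name `kosaraju_scc`, the 0-indexed vertex labels, and the edge-list input are assumptions chosen for illustration; the searches are iterative to avoid Python's recursion limit.

```python
def kosaraju_scc(n, edges):
    """Return the strongly connected components of a directed graph.

    Vertices are 0..n-1; edges is a list of (u, v) pairs meaning u -> v.
    """
    adj = [[] for _ in range(n)]
    radj = [[] for _ in range(n)]  # reversed graph
    for u, v in edges:
        adj[u].append(v)
        radj[v].append(u)

    # Pass 1: DFS on the original graph, recording vertices by finish time.
    visited = [False] * n
    order = []
    for s in range(n):
        if visited[s]:
            continue
        visited[s] = True
        stack = [(s, 0)]  # (vertex, index of next neighbor to explore)
        while stack:
            u, i = stack.pop()
            if i < len(adj[u]):
                stack.append((u, i + 1))
                v = adj[u][i]
                if not visited[v]:
                    visited[v] = True
                    stack.append((v, 0))
            else:
                order.append(u)  # all neighbors done: u is finished

    # Pass 2: DFS on the reversed graph in decreasing finish order.
    comp = [-1] * n
    components = []
    for s in reversed(order):
        if comp[s] != -1:
            continue
        cid = len(components)
        comp[s] = cid
        members = []
        stack = [s]
        while stack:
            u = stack.pop()
            members.append(u)
            for v in radj[u]:
                if comp[v] == -1:
                    comp[v] = cid
                    stack.append(v)
        components.append(sorted(members))
    return components


# Hypothetical graph: a 3-cycle {0, 1, 2} feeding into a 2-cycle {3, 4}.
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 3)]
sccs = kosaraju_scc(5, edges)
```

Each vertex and edge is touched a constant number of times per pass, which is where the \(O(n+m)\) bound comes from.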
