_index.org

longest common subsequence

Last edited: November 11, 2025

BDFH is a subsequence of ABCDEFGH. Given two lists, what’s the longest common subsequence?

optimal substructure

We consider the subproblems of finding the LCS of prefixes: let \(C[i,j]\) be the length of the LCS of \(A[1..i]\) and \(B[1..j]\). Casework:

C[i,j] = …

  • if i = 0 or j = 0, then C[i,j] = 0
  • if A[i] = B[j], then C[i,j] = C[i-1, j-1] + 1
  • if A[i] != B[j], then C[i,j] = max(C[i,j-1], C[i-1,j])

Filling this table takes \(O\qty(nm)\) time, and backtracking an LCS from it takes \(O\qty(n+m)\).
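The recurrence above can be sketched directly; a minimal Python version (names `lcs_length` and `backtrack` are illustrative, not from the source):

```python
def lcs_length(a, b):
    """Fill the DP table C[i][j] = length of LCS of a[:i] and b[:j]."""
    n, m = len(a), len(b)
    C = [[0] * (m + 1) for _ in range(n + 1)]  # row/column 0 stay 0 (base case)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                C[i][j] = C[i - 1][j - 1] + 1      # matching characters extend the LCS
            else:
                C[i][j] = max(C[i][j - 1], C[i - 1][j])  # drop one character
    return C

def backtrack(C, a, b):
    """Recover one LCS in O(n + m) by walking from C[n][m] back to C[0][0]."""
    i, j = len(a), len(b)
    out = []
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i, j = i - 1, j - 1
        elif C[i - 1][j] >= C[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))
```

For the example from the top of the note, `backtrack(lcs_length("ABCDEFGH", "BDFH"), "ABCDEFGH", "BDFH")` recovers `"BDFH"`.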

Machine Learning Index

Last edited: November 11, 2025

https://cs229.stanford.edu/

Course Project

  • Deliverables: proposal (300-500 words), milestone (3 pages), final report (5 pages), poster
  • Evaluation: technical quality, originality, community

SU-CS229 Midterm Sheet

Lectures

Basics + Linear Methods

Regularization

Kernel Methods

Decision Trees and Boosting

Deep Learning

Reinforcement Learning

PCA

NP-Complete

Last edited: November 11, 2025

A language \(B\) is NP-Hard if every \(A \in NP\) has a polynomial-time mapping reduction \(A \leq_{P} B\); it is NP-Complete if, in addition, \(B \in NP\).

Suppose a language \(L\) is NP-Complete. Then every other language in \(NP\) is mapping reducible to \(L\), so if \(L \in P\) then \(P = NP\), and if \(L \not\in P\) then \(P \neq NP\).
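The \(B \in NP\) half of the definition means \(B\) has a polynomial-time certificate verifier. A minimal sketch for SAT (the clause encoding and the name `verify_sat` are illustrative assumptions, not from the source):

```python
def verify_sat(formula, assignment):
    """Polynomial-time verifier for SAT: `formula` is a list of clauses,
    each clause a list of signed variable indices (DIMACS-style, so -2
    means "not x2"); `assignment` maps variable index -> bool.
    Returns True iff the certificate satisfies every clause."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )
```

For example, \((x_1 \lor \lnot x_2) \land (x_2 \lor x_3)\) is `[[1, -2], [2, 3]]`, and the assignment `{1: True, 2: False, 3: True}` verifies. Checking a certificate is easy; the hardness lies in finding one.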

Principal Component Analysis

Last edited: November 11, 2025

assumptions of PCA

  • this makes sense only when the relationships in the data are linear
  • large variations have important structure (i.e. not noise)

motivation

Consider a dataset \(\qty {x^{(i)}}_{i=1}^{n}\) with \(x^{(i)} \in \mathbb{R}^{d}\), where \(d < n\). Our goal is to automatically identify correlations between the axes of the data. This boils down to two goals:

  1. automatically detect and remove redundancies
  2. find features that explain the variation

preprocessing before PCA

normalize each feature to zero mean and unit variance: \(x_{j}^{(i)} = \frac{x^{(i)}_{j} - \mu_{j}}{\sigma_{j}}\), where \(\mu_{j}\) and \(\sigma_{j}\) are the mean and standard deviation of feature \(j\).
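A minimal sketch of this preprocessing plus the projection step, using NumPy eigendecomposition of the sample covariance (the function name `pca` and the choice of \(1/n\) covariance normalization are assumptions for illustration):

```python
import numpy as np

def pca(X, k):
    """Standardize each feature, then project onto the top-k principal
    components (eigenvectors of the sample covariance).
    Rows of X are examples x^{(i)} in R^d; assumes no feature is constant."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    Z = (X - mu) / sigma                 # the normalization step above
    cov = Z.T @ Z / len(Z)               # sample covariance of standardized data
    vals, vecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    top = vecs[:, ::-1][:, :k]           # top-k directions of largest variance
    return Z @ top
```

On strongly correlated 2-D data the first component captures most of the total variance, which is exactly the "large variations have important structure" assumption.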

ragdoll scratchpad

Last edited: November 11, 2025

\begin{equation} \tau = s_1 \dots s_{n} \end{equation}

\begin{equation} a \sim \pi_{\text{lm}}\qty(\tau) \end{equation}

\begin{equation} a_{i}, \rho_{i} \sim \pi_{\text{lm}}\qty(\tau \mid \rho_{i-1} \dots \rho_{1}) \end{equation}

“action selection”


Tracking in \(\rho\)

  1. Things that can go into \(\rho\) (explanation “why did we do this”)
  2. Running summary of the entire \(\tau\) <- good chance this avoids the entire \(\tau\)

why don’t we


Threads

  • embeddings being bad seems weird