NP-Complete
Last edited: November 11, 2025
A language \(B\) is NP-Complete if \(B \in NP\) and every \(A \in NP\) satisfies \(A \leq_{P} B\) via a polynomial-time mapping reduction. We say \(B\) is NP-Hard if the reductions exist, and NP-Complete if additionally \(B \in NP\).
Suppose a language \(L\) is NP-Complete; then every other language in \(NP\) is mapping reducible to \(L\). Consequently, if \(L \in P\), then \(P = NP\); if \(L \not\in P\), then \(P \neq NP\).
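To see the first implication concretely, here is a sketch of why \(L \in P\) gives \(A \in P\) for any \(A \in NP\): decide \(x \in A\) by computing the reduction \(f(x)\) and running the decider for \(L\) on it. (The running times \(O(n^{c})\) and \(O(n^{k})\) below are placeholders, not from the notes.)

```latex
% Let f be computable in time O(n^c) and let M_L decide L in time O(n^k).
\begin{align}
x \in A &\iff f(x) \in L \\
|f(x)| &= O\qty(|x|^{c}) \quad \text{(a poly-time machine writes at most polynomially many symbols)} \\
\text{time to decide } x \in A &= \underbrace{O\qty(|x|^{c})}_{\text{compute } f(x)} + \underbrace{O\qty(\qty(|x|^{c})^{k})}_{\text{run } M_{L}} = O\qty(|x|^{ck})
\end{align}
```

Since the composition stays polynomial, every \(A \in NP\) lands in \(P\), i.e. \(P = NP\).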
Principal Component Analysis
Last edited: November 11, 2025
assumptions of PCA
- this makes sense only when the data is linear
- large variations have important structure (i.e. not noise)
motivation
Consider dataset \(\qty {x^{(i)}}_{i=1}^{n}\) such that \(x^{(i)} \in \mathbb{R}^{d}\), where \(d < n\). Our goal: we want to automatically identify such correlations between axes of data. This boils down to two goals:
- automatically detect and remove redundancies
- find features that explain the variation
preprocessing before PCA
normalize: \(x_{j}^{(i)} = \frac{x^{(i)}_{j} - \mu_{j}}{\sigma_{j}}\)
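The normalization step above can be sketched in NumPy; the data here is made up for illustration, and the shapes (\(n = 100\) samples, \(d = 3\) features) are arbitrary.

```python
import numpy as np

# Hypothetical dataset: n = 100 samples of d = 3 features,
# deliberately given a nonzero mean and non-unit scale.
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(100, 3))

# Per-feature normalization: x_j <- (x_j - mu_j) / sigma_j
mu = X.mean(axis=0)
sigma = X.std(axis=0)
X_norm = (X - mu) / sigma

# After normalization each feature has mean 0 and variance 1,
# so no single axis dominates the variance that PCA measures.
assert np.allclose(X_norm.mean(axis=0), 0.0)
assert np.allclose(X_norm.std(axis=0), 1.0)
```

Without this step, a feature measured in large units (say, millimeters vs. meters) would dominate the leading principal components purely because of its scale.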
ragdoll scratchpad
Last edited: November 11, 2025
\begin{align} \tau = s_1 \dots s_{n} \end{align}
\begin{equation} a \sim \pi_{\text{lm}}\qty(\tau) \end{equation}
\begin{equation} a_{i}, \rho_{i} \sim \pi_{\text{lm}}\qty(\tau \mid \rho_{i-1} \dots \rho_{1}) \end{equation}
“action selection”
Tracking in \(\rho\)
- Things that can go into \(\rho\) (explanation “why did we do this”)
- Running summary of the entire \(\tau\) <- good chance this avoids needing the entire \(\tau\)
why don’t we
Threads
- embeddings being bad seems weird
randomness
Last edited: November 11, 2025
Randomness is a resource that can be used in computation.
Our unit of computation is the randomized Turing machine.
some questions of randomness
save time/space by using randomness: we used to believe that \(P \subset \text{Randomized } P\), \(L \subset \text{Randomized } L\), etc. But we now think \(P = \text{Randomized } P\) (intuition: if solving SAT requires circuits of exponential size, then \(P\) is equal to Randomized \(P\)).
efficient derandomization: can we reduce the use of randomness, i.e. completely derandomize, without losing much efficiency?
singular value decomposition
Last edited: November 11, 2025
Singular value decomposition is a factorization of a matrix which generalizes the eigendecomposition of normal matrices (i.e. \(A = V^{-1} D V\) when \(A\) is diagonalizable, which by the spectral theorem is possible exactly when \(A\) is normal).
Definitions
Singular value decomposition Every \(m \times n\) matrix has a factorization of the form:
\begin{equation} M = U D^{\frac{1}{2}} V^{*} \end{equation}
where \(U\) is a unitary matrix and \(D^{\frac{1}{2}}\) is a rectangular diagonal matrix whose diagonal entries, called singular values, are the non-negative square roots of the eigenvalues of \(M^{*}M\) (so the diagonal of \(D^{\frac{1}{2}}\) is \(\geq 0\)). Finally, \(V\) is a unitary matrix whose columns form an orthonormal basis of eigenvectors of \(M^{*}M\).
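A minimal NumPy check of the definition above; the \(3 \times 2\) matrix is an arbitrary example, not from the notes. `np.linalg.svd` returns the factors \(U\), the singular values, and \(V^{*}\) directly.

```python
import numpy as np

# Illustrative 3x2 matrix (arbitrary choice).
M = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [1.0, 1.0]])

# U: (3, 2), s: singular values in descending order, Vh: V* with shape (2, 2)
U, s, Vh = np.linalg.svd(M, full_matrices=False)

# Reconstruction: M = U D^{1/2} V*
assert np.allclose(U @ np.diag(s) @ Vh, M)

# Singular values are the non-negative square roots of the
# eigenvalues of M* M (eigvalsh returns them in ascending order).
eigvals = np.linalg.eigvalsh(M.T @ M)
assert np.allclose(np.sort(s**2), eigvals)

# Columns of V (rows of Vh) are orthonormal eigenvectors of M* M.
assert np.allclose(Vh @ Vh.T, np.eye(2))
```

With `full_matrices=False` the factors have the reduced shapes \(m \times k\) and \(k \times n\) with \(k = \min(m, n)\), which is usually what you want in practice.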
