Posts

SU-CS161 OCT142025

Last edited: October 10, 2025

Kernel Trick


constituents

  • samples \(x,z \in \mathcal{D}\)
  • \(d\) input dimensions, \(p\) feature dimensions
  • feature map: \(\phi: \mathbb{R}^{d} \to \mathbb{R}^{p}\)
  • kernel: \(k\qty(x,z) = \langle\phi\qty(x), \phi\qty(z)\rangle\)
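As a concrete instance of the definition above, the degree-2 polynomial kernel \(k(x,z) = (x^{T}z)^{2}\) on \(\mathbb{R}^{2}\) corresponds to the feature map \(\phi(x) = (x_1^2, x_2^2, \sqrt{2}x_1x_2)\). A minimal numerical check (the kernel and feature map are illustrative choices, not from the notes):

```python
import numpy as np

# Feature map phi: R^2 -> R^3 for the degree-2 polynomial kernel.
def phi(x):
    return np.array([x[0]**2, x[1]**2, np.sqrt(2) * x[0] * x[1]])

# Kernel computed directly, without forming phi.
def k(x, z):
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])
# Both sides equal (x^T z)^2 = (3 - 2)^2 = 1
assert np.isclose(k(x, z), phi(x) @ phi(z))
```

Computing \(k\) directly costs \(O(d)\), while forming \(\phi\) explicitly costs \(O(p)\), which is the point of the trick.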

requirements

train

Compute \(k\qty(x^{(i)}, x^{(j)})\) for all \(i,j \in 1 … n\). Set \(\beta_{j} = 0\) for all \(j\). Iterate until convergence, updating each \(i\):

\begin{equation} \beta_{i} = \beta_{i} + \alpha \qty[y^{(i)} - \sum_{j=1}^{n} \beta_{j} k\qty(x^{(i)}, x^{(j)})] \end{equation}
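The training loop above can be sketched as follows, assuming a degree-2 polynomial kernel and a small toy dataset (both illustrative choices, not from the notes):

```python
import numpy as np

# Illustrative kernel: degree-2 polynomial.
def kernel(x, z):
    return (x @ z) ** 2

# Toy dataset: n = 3 samples in R^2 with real-valued targets.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 1.0, 4.0])
n = len(X)

# Precompute k(x^{(i)}, x^{(j)}) for all i, j.
K = np.array([[kernel(X[i], X[j]) for j in range(n)] for i in range(n)])

beta = np.zeros(n)
alpha = 0.1  # learning rate
for _ in range(2000):  # iterate the update rule until (roughly) converged
    for i in range(n):
        beta[i] += alpha * (y[i] - K[i] @ beta)
```

At the fixed point, each residual \(y^{(i)} - \sum_{j} \beta_{j} k(x^{(i)}, x^{(j)})\) is zero, so the \(\beta\) vector solves \(K\beta = y\).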

test

Given \(x\), compute \(\theta^{T} \phi\qty(x)\) such that:

\begin{align} \theta^{T}\phi\qty(x) &= \sum_{i=1}^{n} \beta_{i} \phi\qty(x^{(i)})^{T} \phi\qty(x) \\ &= \sum_{i=1}^{n} \beta_{i} k\qty(x^{(i)}, x) \end{align}
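Test-time prediction therefore never forms \(\phi\) or \(\theta\) explicitly. A sketch, again assuming the degree-2 polynomial kernel and made-up training data and coefficients:

```python
import numpy as np

# Illustrative kernel: degree-2 polynomial.
def kernel(x, z):
    return (x @ z) ** 2

X = np.array([[1.0, 0.0], [0.0, 1.0]])   # training samples x^{(i)}
beta = np.array([0.5, -0.25])            # coefficients from training (made up)

# theta^T phi(x) = sum_i beta_i * k(x^{(i)}, x), computed through the kernel only.
def predict(x):
    return sum(beta[i] * kernel(X[i], x) for i in range(len(X)))

# 0.5 * (1*2)^2 + (-0.25) * (1*2)^2 = 2.0 - 1.0 = 1.0
predict(np.array([2.0, 2.0]))
```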

positive semi-definite


a matrix \(K\) is PSD if all of its eigenvalues are \(\geq 0\); equivalently, \(z^{T} K z \geq 0\) for all \(z\).
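Both characterizations can be checked numerically on a small example (the matrix itself is an arbitrary illustration):

```python
import numpy as np

# An example symmetric matrix; its eigenvalues are 1 and 3, so it is PSD.
K = np.array([[2.0, -1.0],
              [-1.0, 2.0]])

eigvals = np.linalg.eigvalsh(K)   # eigenvalues of a symmetric matrix
assert np.all(eigvals >= 0)       # first characterization: eigenvalues >= 0

# Second characterization: z^T K z >= 0, spot-checked on random z.
rng = np.random.default_rng(0)
for _ in range(100):
    z = rng.standard_normal(2)
    assert z @ K @ z >= 0
```

Here the quadratic form factors as \(z^{T}Kz = z_1^2 + z_2^2 + (z_1 - z_2)^2\), which makes the nonnegativity visible directly.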

SISL Talks


Francois

  • backpropagation gets worse as time goes on (i.e., it gets stuck in local minima); it is a first-order method

Streaming Algorithm


Streaming Algorithms are computations whose memory grows with the size of the input at a “small” rate. The memory of these systems is not large: it grows roughly logarithmically or poly-logarithmically in the size of the input.

Every so often, as we process the stream, we output a symbol that summarizes the stream seen so far.
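A classic example of this pattern (chosen here for illustration) is the Boyer–Moore majority-vote algorithm: it keeps \(O(1)\) state, far below the stream length, and can emit its current guess of the majority element at any point.

```python
# Streaming majority vote: O(1) memory (one candidate, one counter),
# emitting a symbol about the stream seen so far after every item.
def majority_stream(stream):
    candidate, count = None, 0
    for symbol in stream:
        if count == 0:
            candidate, count = symbol, 1
        elif symbol == candidate:
            count += 1
        else:
            count -= 1
        yield candidate   # current guess of the majority element

list(majority_stream("abbbab"))  # -> ['a', 'a', 'b', 'b', 'b', 'b']
```

If some element occupies a strict majority of the stream, the final candidate is guaranteed to be that element; the memory used is constant regardless of stream length.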

streaming vs FA

Streaming Algorithms are a superset of DFAs: anything a Streaming Algorithm can’t do also can’t be done with a DFA. Like DFAs, their memory doesn’t grow too large with the length of the input string.