Google Nerd Snipe
Last edited: August 8, 2025
group
Last edited: August 8, 2025
gradient descent
Last edited: August 8, 2025
It's hard to find a globally optimal solution, so instead we make local progress.
constituents
- parameters \(\theta\)
- step size \(\alpha\)
- cost function \(J\) (and its derivative \(J'\))
requirements
let \(\theta^{(0)} = 0\) (or a random point), and then:
\begin{equation} \theta^{(t+1)} = \theta^{(t)} - \alpha J'\qty(\theta^{(t)}) \end{equation}
“Update the weights by taking a step in the opposite direction of the gradient with respect to the weights.” We stop, btw, when it's “good enough”: the training data is noisy enough that a slightly non-converged optimum is fine.
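As a toy illustration (not part of the original note), here is a minimal numpy sketch of this update loop; the quadratic cost, the target vector, the step size, and the stopping tolerance are all made-up choices for the example.

```python
import numpy as np

# Made-up example cost J(theta) = ||theta - target||^2 and its gradient.
target = np.array([3.0, -1.0])

def J(theta):
    return float(np.sum((theta - target) ** 2))

def J_prime(theta):
    return 2.0 * (theta - target)

alpha = 0.1          # step size
theta = np.zeros(2)  # theta^(0) = 0, as in the note

# theta^(t+1) = theta^(t) - alpha * J'(theta^(t))
for t in range(1000):
    grad = J_prime(theta)
    theta = theta - alpha * grad
    if np.linalg.norm(grad) < 1e-6:  # "good enough" stopping rule
        break

print(theta, J(theta))  # theta ends up near target, J near 0
```

On a convex toy cost like this the iterates converge to the minimizer; on noisy real losses you stop early, as described above.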
Gram-Schmidt
Last edited: August 8, 2025
OMG it's Gram-Schmidtting!!! Ok so like orthonormal bases are so nice, don't you want to make them out of a boring-ass normal basis? Of course you do.
Suppose \(v_1, \dots, v_{m}\) is a linearly independent list in \(V\). Now let us define some \(e_{1}, \dots, e_{m}\) using the procedure below such that the \(e_{j}\) are orthonormal and, importantly:
\begin{equation} \operatorname{span}(v_1, \dots, v_{m}) = \operatorname{span}(e_{1}, \dots, e_{m}) \end{equation}
The Procedure
We do this process inductively. For the base case, let:
\begin{equation} e_1 = \frac{v_1}{\|v_1\|} \end{equation}
Then, for each \(j = 2, \dots, m\), subtract from \(v_{j}\) its components along the earlier \(e_{k}\) and normalize what is left:
\begin{equation} e_{j} = \frac{v_{j} - \sum_{k=1}^{j-1} \langle v_{j}, e_{k} \rangle e_{k}}{\left\| v_{j} - \sum_{k=1}^{j-1} \langle v_{j}, e_{k} \rangle e_{k} \right\|} \end{equation}
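A minimal numpy sketch of the procedure above (base case plus inductive step), assuming the standard dot product on \(\mathbb{R}^{n}\); the function name and the example vectors are invented for illustration.

```python
import numpy as np

def gram_schmidt(vs):
    """Turn a linearly independent list of vectors into an orthonormal
    list with the same span (standard dot product assumed)."""
    es = []
    for v in vs:
        # subtract the components of v along the e_k found so far ...
        w = v - sum(np.dot(v, e) * e for e in es)
        # ... and normalize what is left
        es.append(w / np.linalg.norm(w))
    return es

# usage: a linearly independent list in R^3
vs = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]
es = gram_schmidt(vs)
print(np.round([[np.dot(a, b) for b in es] for a in es], 6))  # ~ identity
```

The final print checks that the pairwise inner products form (numerically) the identity matrix, i.e. the output list really is orthonormal.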
grammar
Last edited: August 8, 2025
A grammar is a set of logical rules that form a language. (more precisely defined in goals of a grammar)
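As a toy sketch (not from the note), one common way to write such a rule set down is a context-free grammar; everything below (the rules, the symbols, and the generate helper) is invented purely for illustration.

```python
import random

# A toy context-free grammar: each nonterminal maps to a list of
# possible expansions (the "logical rules" that form the language).
RULES = {
    "S":  [["NP", "VP"]],
    "NP": [["the", "N"]],
    "VP": [["V", "NP"]],
    "N":  [["dog"], ["grammar"]],
    "V":  [["chases"], ["describes"]],
}

def generate(symbol="S"):
    if symbol not in RULES:           # terminal: emit the word itself
        return [symbol]
    expansion = random.choice(RULES[symbol])
    return [word for part in expansion for word in generate(part)]

print(" ".join(generate()))  # e.g. "the dog describes the grammar"
```

Here a finite rule set generates (a tiny fragment of) a language, which is the sense of "rules that form a language" above.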
goals of a grammar
- explain natural languages in syntax + semantics
- have described algebras which can be used to evolve the syntax…
- …that is, algebras that describe the grammatical operations
The formalism here is that a rigorous grammar should have: