NLP Semantics Timeline
- 1990 static word embeddings
- 2003 neural language models
- 2008 multi-task learning
- 2015 attention
- 2017 transformer
- 2018 trainable contextual word embeddings + large-scale pretraining
- 2019 prompt engineering
Motivating Attention
Given a sequence of embeddings: \(x_1, x_2, \dots, x_{n}\)
For each \(x_{i}\), the goal of attention is to produce a new embedding \(a_{i}\) based on its dot-product similarity with all of the words before it.
Let’s define:

\begin{equation}
\text{score}\qty(x_{i}, x_{j}) = x_{i} \cdot x_{j}
\end{equation}

\begin{equation}
\alpha_{ij} = \frac{\exp\qty(\text{score}\qty(x_{i}, x_{j}))}{\sum_{k \leq i} \exp\qty(\text{score}\qty(x_{i}, x_{k}))}
\end{equation}

\begin{equation}
a_{i} = \sum_{j \leq i} \alpha_{ij}\, x_{j}
\end{equation}
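A minimal sketch of this computation in NumPy, assuming raw dot-product scores with no learned projections (all names below are illustrative):

```python
import numpy as np

def causal_attention(X):
    """Each a_i is a weighted average of x_1..x_i, weighted by the
    softmax of dot-product similarity with x_i."""
    n, d = X.shape
    scores = X @ X.T                                  # scores[i, j] = x_i . x_j
    mask = np.triu(np.ones((n, n), dtype=bool), k=1)  # positions j > i
    scores = np.where(mask, -np.inf, scores)          # hide words after x_i
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights = weights / weights.sum(axis=1, keepdims=True)  # softmax over j <= i
    return weights @ X                                # a_i = sum_j alpha_ij x_j

A = causal_attention(np.random.randn(5, 8))  # 5 tokens, 8-dim embeddings
```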
Noam Chomsky
Non-Deterministic Computation
…building blocks of a Non-deterministic Turing Machine. Two transition functions:
\begin{equation} \delta_{0}, \delta_{1} : Q \times \Gamma^{k} \to Q \times \Gamma^{k-1} \times \qty {L, R, S}^{k} \end{equation}
At every step, branch: apply both transition functions separately. Some computation paths lead to \(q_{\text{accept}}\), and others lead to \(q_{\text{reject}}\).
We accept iff there exists any accepting path; equivalently, we reject iff all paths reject.
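A minimal sketch of this acceptance rule, abstracting the machine to a hypothetical `step` function that returns the two successor configurations under \(\delta_{0}\) and \(\delta_{1}\):

```python
def nd_accepts(config, step, is_accept, is_reject, depth=100):
    """Accept iff *some* branch of the computation tree accepts."""
    if is_accept(config):
        return True
    if is_reject(config) or depth == 0:
        return False
    c0, c1 = step(config)  # apply delta_0 and delta_1, branching on both
    return (nd_accepts(c0, step, is_accept, is_reject, depth - 1)
            or nd_accepts(c1, step, is_accept, is_reject, depth - 1))

# toy demo: nondeterministically "guess" a 3-bit string, accept iff it is 101
target = (1, 0, 1)
step = lambda c: (c + (0,), c + (1,))
is_accept = lambda c: c == target
is_reject = lambda c: len(c) >= 3 and c != target
print(nd_accepts((), step, is_accept, is_reject))  # True: some path accepts
```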
why NP is awesome
“what a ridiculous model of computation!”
Non-deterministic Finite Automata
NFA is a relaxation of DFA which is allowed to make non-deterministic “verified guesses”.
this is basically a DFA, but our new machine accepts a string if there exists some path from the start state that reaches some accept state.
at each state, we can have any number of out-arrows for each letter \(\sigma \in \Sigma\), including arrows for the empty string \(\varepsilon\), meaning we can move between states without consuming any input. (see the sketch below.)
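as a sketch, acceptance can be checked by tracking the set of all states reachable so far; the dictionary encoding of \(\delta\) below is a hypothetical representation, with `""` standing for \(\varepsilon\):

```python
def epsilon_closure(states, delta):
    """All states reachable from `states` via epsilon-arrows alone."""
    stack, closure = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), ()):  # "" encodes the empty string
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def nfa_accepts(s, delta, start, accepts):
    """Accept iff some path over `s` from `start` reaches an accept state."""
    current = epsilon_closure({start}, delta)
    for ch in s:
        nxt = set()
        for q in current:
            nxt |= set(delta.get((q, ch), ()))
        current = epsilon_closure(nxt, delta)
    return bool(current & accepts)

# toy NFA accepting strings over {0,1} that end in "01"
delta = {("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0"}, ("q1", "1"): {"q2"}}
print(nfa_accepts("1101", delta, "q0", {"q2"}))  # True
```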
Non-Deterministic Space
Non-Deterministic Space is like Non-Deterministic Computation, but now with a bound on the space used.
requirements
We say \(A \in \text{NSPACE}\qty(s\qty(n))\) if \(\exists\) a Non-deterministic Turing Machine \(M\) which decides \(A\) such that \(M\) always uses \(O\qty(s\qty(n))\) space.
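As a standard example (not from this note, but the canonical one behind NL): directed st-connectivity is in \(\text{NSPACE}\qty(\log n)\), since the machine can guess the path one vertex at a time, storing only the current vertex and a step counter, each of which fits in \(O\qty(\log n)\) space:

\begin{equation}
\text{PATH} = \qty{ \langle G, s, t \rangle : G \text{ has a directed path from } s \text{ to } t } \in \text{NSPACE}\qty(\log n)
\end{equation}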
additional information
See NL
