Lambek Calculus

Last edited: August 8, 2025

language

Last edited: August 8, 2025

effability

see also language

Language Agents with Karthik

Last edited: August 8, 2025

Transitions

  1. First, a transition from rule-based learning to statistical learning
  2. Rise of semantic parsing: statistical models of parsing
  3. Then, a move from semantic parsing to large models, putting decision making and language modeling into the same bubble

Importance of LLMs

  • They are simply better at understanding language inputs
  • They can generate structured output (e.g. JSON), not just human language
  • They can perform natural language “reasoning”, not just generation

(and natural language generation, as noted above)

Language Information Index

Last edited: August 8, 2025

What makes language modeling hard: resolving ambiguity.

“the chef made her duck” (did the chef cook a duck for her, or cause her to duck?)

Contents

Basic Text Processing

Edit Distance

DP costs \(O(nm)\), backtrace costs \(O(n+m)\).
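As a minimal sketch (the function name and example strings are illustrative, not from these notes): the DP table below is filled in \(O(nm)\), and the backtrace walks back from the corner in \(O(n+m)\) steps.

#+begin_src python
# Minimal Levenshtein edit distance with backtrace; unit costs assumed.
def edit_distance(source, target):
    n, m = len(source), len(target)
    # dp[i][j] = cost of editing source[:i] into target[:j]; filling it is O(nm)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i
    for j in range(1, m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if source[i - 1] == target[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + sub)  # substitution / copy

    # Backtrace: walk from (n, m) back to (0, 0); each step decrements i and/or j,
    # so the path has at most n + m steps.
    ops, i, j = [], n, m
    while i > 0 or j > 0:
        sub = 0 if i > 0 and j > 0 and source[i - 1] == target[j - 1] else 1
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + sub:
            ops.append("copy" if sub == 0 else "substitute")
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            ops.append("delete")
            i -= 1
        else:
            ops.append("insert")
            j -= 1
    return dp[n][m], list(reversed(ops))

print(edit_distance("intention", "execution"))  # distance 5, plus the edit sequence
#+end_src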

Ngrams

Text Classification

Logistic Regression

Information Retrieval

Ranked Information Retrieval

Vector Semantics

POS and NER

Dialogue Systems

Recommender Systems

Dora

Neural Nets

The Web

Language Model

Last edited: August 8, 2025

A machine learning model: the input is the last n words, and the output is a probability distribution over the next word. An LM predicts this distribution (“what is the distribution of the next word, given the previous words?”):

\begin{equation} w^{(t)} \sim P(\cdot | w^{(t-1)}, w^{(t-2)}, \dots, w^{(1)}) \end{equation}
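A minimal count-based sketch of this, assuming a toy corpus, whitespace tokenization, and a history of just one previous word (a bigram model); =next_word_distribution= is an illustrative name, not an established API:

#+begin_src python
# Toy bigram LM: counts of (previous word -> next word), normalized into P(. | prev).
from collections import Counter, defaultdict

corpus = ["the chef made her duck", "the chef made dinner"]  # assumed toy data
counts = defaultdict(Counter)
for sentence in corpus:
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1

def next_word_distribution(prev):
    """P(. | prev): a probability distribution over the next word."""
    total = sum(counts[prev].values())
    return {word: count / total for word, count in counts[prev].items()}

print(next_word_distribution("made"))  # {'her': 0.5, 'dinner': 0.5}
#+end_src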

By applying the chain rule, we can also think of the language model as assigning a probability to a sequence of words:

\begin{align} P(S) &= P(w^{(t)} | w^{(t-1)}, w^{(t-2)}, \dots, w^{(1)}) \cdot P(w^{(t-1)} | w^{(t-2)}, \dots, w^{(1)}) \cdots P(w^{(1)}) \\ &= P(w^{(t)}, w^{(t-1)}, w^{(t-2)}, \dots, w^{(1)}) \end{align}
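A sketch of this chain-rule view, reusing the toy bigram model above: the probability of the whole sequence is the product of each word’s conditional probability given its (here truncated to one word) history.

#+begin_src python
import math

def sequence_log_prob(sentence):
    """log P(S) as the sum of log conditional probabilities (bigram history)."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    log_p = 0.0
    for prev, nxt in zip(tokens, tokens[1:]):
        dist = next_word_distribution(prev)
        log_p += math.log(dist[nxt])  # assumes nxt was observed after prev in the toy corpus
    return log_p

# P("the chef made dinner") = 1 * 1 * 1 * 0.5 * 1 = 0.5 under the toy corpus
print(math.exp(sequence_log_prob("the chef made dinner")))
#+end_src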