Language Model

Last edited: August 8, 2025

A machine learning model whose input is the last n words and whose output is a probability distribution over the next word. An LM predicts this distribution (“what’s the distribution of the next word given the previous words?”):

\begin{equation} w^{(t)} \sim P(\cdot | w^{(t-1)}, w^{(t-2)}, \dots, w^{(1)}) \end{equation}

By applying the chain rule, we can also think of the language model as assigning a probability to a sequence of words:

\begin{align} P(S) &= P(w^{(t)} | w^{(t-1)}, w^{(t-2)}, \dots, w^{(1)}) \cdot P(w^{(t-1)} | w^{(t-2)}, \dots, w^{(1)}) \cdots P(w^{(1)}) \\ &= P(w^{(t)}, w^{(t-1)}, w^{(t-2)}, \dots, w^{(1)}) \end{align}
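
As a rough illustration of both views (next-word distribution and sequence probability), here is a minimal sketch using a count-based bigram model. Note the simplifying assumption: a bigram model conditions only on the single previous word rather than on the full history, so it is a stand-in for a real LM, not an implementation of one. The corpus, the `<s>` start marker, and all function names are hypothetical.

```python
from collections import Counter, defaultdict

# Toy corpus (hypothetical); "<s>" marks the start of each sentence.
CORPUS = [
    ["<s>", "the", "cat", "sat"],
    ["<s>", "the", "dog", "sat"],
    ["<s>", "the", "cat", "ran"],
]


def train_bigram(corpus):
    """Count how often each word follows each one-word context."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        for prev, nxt in zip(sentence, sentence[1:]):
            counts[prev][nxt] += 1
    return counts


def next_word_distribution(counts, prev):
    """Return P(. | prev) as a dict; empty if the context was never seen."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}
    return {word: c / total for word, c in followers.items()}


def sequence_probability(counts, words):
    """Chain rule: multiply the conditional probability of each word in turn."""
    prob = 1.0
    prev = "<s>"
    for word in words:
        prob *= next_word_distribution(counts, prev).get(word, 0.0)
        prev = word
    return prob


counts = train_bigram(CORPUS)
dist = next_word_distribution(counts, "the")   # distribution over the next word
p = sequence_probability(counts, ["the", "cat", "sat"])  # probability of a sequence
```

Here `dist` plays the role of the next-word distribution and `p` the role of P(S). An unseen continuation gets probability zero in this sketch, which is why practical count-based models add smoothing.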

Language Model Agents

Last edited: August 8, 2025

Agents that use language to act on behalf of another person or group.

Challenges

See Challenges of Language Model Agents

Methods

ReAct

See ReAct

Aguvis

Take the AgentNet dataset and fine-tune a vision LM, built on top of a Qwen base model, to roll out the rest of the action sequence given screenshots as input.

Chain of Thought can also be added on top to elicit more reasoning.

Formulations

OSWorld

A unified task setup and evaluation.

laplae

Last edited: August 8, 2025

Latency Numbers

Last edited: August 8, 2025

law of cosines

Last edited: August 8, 2025

the