## Language Model

A Language Model is a large neural network trained to predict the **next token** given some context.

“Language models can discriminate behavior that they can’t reliably generate.”

## Coherence

**Generative REVOLUTION**

### Why probability maximization sucks

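It's expensive! Exactly maximizing sequence probability means searching a space that grows exponentially with sequence length.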

### Beam Search

- Start with \(k\) candidates
- Generate \(k\) expansions for each of the \(k\) candidates
- Keep the \(k\) highest-probability candidates and repeat
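The steps above can be sketched as follows. This is a minimal illustration, not a production decoder: `toy_next_token_probs` is a made-up stand-in for a real language model, and the token names and probabilities are assumptions chosen only to make the behavior visible.

```python
import math

def toy_next_token_probs(prefix):
    """Hypothetical toy model: next-token probabilities given a prefix tuple."""
    if not prefix:
        return {"the": 0.6, "a": 0.4}
    if prefix[-1] == "the":
        return {"cat": 0.5, "dog": 0.3, "<eos>": 0.2}
    if prefix[-1] == "a":
        return {"cat": 0.1, "dog": 0.9}
    return {"<eos>": 1.0}

def beam_search(next_token_probs, k, max_len):
    """Return up to k (tokens, log_prob) pairs, best first."""
    beams = [((), 0.0)]  # start from the empty prefix
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            if tokens and tokens[-1] == "<eos>":
                candidates.append((tokens, score))  # finished: carry forward
                continue
            probs = next_token_probs(tokens)
            # Expand each candidate with its k most likely next tokens.
            top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
            for tok, p in top:
                candidates.append((tokens + (tok,), score + math.log(p)))
        # Keep only the k highest-probability candidates overall.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:k]
    return beams

best_tokens, best_log_prob = beam_search(toy_next_token_probs, k=2, max_len=3)[0]
```

On this toy model, \(k = 1\) (greedy decoding) commits to "the" first and ends at "the cat" with probability 0.3, while \(k = 2\) keeps "a" alive long enough to find "a dog" with probability 0.36.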

\(k\) should be small: trying to maximize probability more exactly gets expensive quickly.

### Branch and Bound

See Branch and Bound

### Challenges of Direct Sampling

Direct sampling sucks: just sampling from the full distribution gives poor text. This has to do with the fact that assigning a slightly lower score (“being less confident”) is exponentially worse.

The model therefore has to be VERY conservative about giving low confidences, so it ends up overconfident about the worst tokens.
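A rough way to see why this hurts: even if each bad token gets only a tiny probability, there are a lot of bad tokens. The numbers below are assumptions picked purely for illustration (vocabulary size, number of reasonable tokens, per-token tail probability), not measurements from any model.

```python
# Illustrative assumptions, not measured values:
vocab_size = 50_000      # total tokens in the vocabulary
good_tokens = 100        # tokens that are reasonable at this step
tail_prob_each = 2e-6    # probability the model leaves on EACH bad token

# Total probability mass sitting on bad tokens at a single step.
tail_mass = (vocab_size - good_tokens) * tail_prob_each

def prob_at_least_one_tail_token(tail_mass, length):
    """Chance that direct sampling picks a bad token at least once in `length` steps."""
    return 1.0 - (1.0 - tail_mass) ** length

risk_20_steps = prob_at_least_one_tail_token(tail_mass, 20)
```

Under these assumptions the tail holds about 10% of the mass at every step, so a 20-token sample includes at least one bad token roughly 88% of the time.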

### Top-K

Top-k is too broad, and top

### Nucleus Sampling

Find the smallest set of top tokens whose cumulative probability reaches \(p\), renormalize, and sample only from that set.
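A minimal sketch of both truncation schemes, assuming the next-token distribution is given as a plain dict from token to probability (the function names and toy distributions are made up for illustration). Top-k always keeps a fixed number of tokens, while the nucleus grows or shrinks with how peaked the distribution is.

```python
import random

def top_k_filter(probs, k):
    """Keep the k most likely tokens and renormalize."""
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    return {tok: p / total for tok, p in top}

def nucleus_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative mass reaches p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {tok: prob / total for tok, prob in kept}

def sample(probs, rng=random):
    """Draw one token from a (renormalized) distribution."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

# Toy distributions (assumed values): one peaked, one flat.
peaked = {"a": 0.9, "b": 0.05, "c": 0.03, "d": 0.02}
flat = {"a": 0.3, "b": 0.3, "c": 0.2, "d": 0.2}
```

With \(p = 0.7\), the nucleus of `peaked` is a single token, while the nucleus of `flat` has three; `top_k_filter` with \(k = 2\) keeps two tokens in both cases.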

## Correctness

- The highest probability answer isn’t always right
- Generative models consider every answer, so we want another model to compute the correct answer

### Surface Form Competition

The Surface Form Competition problem arises when the top-probability token “steals” probability from the other tokens.

The predicted frequency of a possible string is a main confounder, so we can use models to decompose their own predictions:

Turns out:

\(P(answer|question) \approx P(answer\ is\ valid)P(answer|domain)\)

So…

\begin{equation} P(answer\ is\ valid) \approx \frac{P(answer|question)}{P(answer|domain)} \end{equation}

This ratio is a better score than the raw probability. Further reading: (Holtzman et al. 2021)
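A small sketch of scoring by this ratio, done in log space. The answer labels and all probabilities are invented for illustration; in practice both \(P(answer|question)\) and \(P(answer|domain)\) would come from the language model itself.

```python
import math

def validity_scores(p_given_question, p_given_domain):
    """Log of P(answer|question) / P(answer|domain) for each answer."""
    return {
        answer: math.log(p_given_question[answer]) - math.log(p_given_domain[answer])
        for answer in p_given_question
    }

# Made-up numbers: one answer is a very common string in the domain,
# the other is rare in general but strongly boosted by the question.
p_given_question = {"common phrase": 0.5, "rare but right": 0.2}
p_given_domain = {"common phrase": 0.4, "rare but right": 0.05}

raw_best = max(p_given_question, key=p_given_question.get)
scores = validity_scores(p_given_question, p_given_domain)
rescored_best = max(scores, key=scores.get)
```

Here the raw probability picks "common phrase", but dividing out how likely each string is in the domain anyway (ratios of 1.25 and 4) picks "rare but right".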

#### Domain

The domain is the context in which the text may occur.

## Coverage

Why aren’t models controllable?

### Hallucination

- Language models predict what’s most likely
- We hope to control them with natural-language semantics

### In-Context Learning

If we show the model some context containing example input-output pairs, it can produce the output for a new input. (Language models are few-shot learners.)
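As a sketch of what such a context might look like, the helper below lays out example pairs followed by a new input with the output left blank for the model to complete. The function name, the `Input:`/`Output:` labels, and the example pairs are all assumptions for illustration; real prompts vary a lot in format.

```python
def build_few_shot_prompt(examples, new_input, input_label="Input", output_label="Output"):
    """Format (input, output) pairs plus a new input into one prompt string."""
    lines = []
    for example_input, example_output in examples:
        lines.append(f"{input_label}: {example_input}")
        lines.append(f"{output_label}: {example_output}")
        lines.append("")  # blank line between examples
    lines.append(f"{input_label}: {new_input}")
    lines.append(f"{output_label}:")  # left open for the model to fill in
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    examples=[("cheese", "fromage"), ("dog", "chien")],
    new_input="cat",
)
```

The resulting string ends with `Input: cat` and an empty `Output:` line, so the most likely continuation is the output that follows the pattern of the examples.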

#### Correct Scoring

We can reverse the task, using the output to predict the input, to check that the model isn’t losing information, and use that score to rerank the candidate outputs. Of course, if the model can’t regenerate the desired input, the output is probably missing information.

Smaller models can be made better through this kind of reranking.
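A toy sketch of the reranking idea. A real system would score each candidate with the model's \(P(input|output)\); here `toy_reverse_score` is a crude word-overlap stand-in, and both it and the example strings are assumptions made up for illustration.

```python
def toy_reverse_score(output, original_input):
    """Crude stand-in for P(input | output): share of input words found in the output."""
    input_words = set(original_input.lower().split())
    output_words = set(output.lower().split())
    if not input_words:
        return 0.0
    return len(input_words & output_words) / len(input_words)

def rerank(candidates, original_input, reverse_score=toy_reverse_score):
    """Order candidate outputs by how well the input can be recovered from them."""
    return sorted(
        candidates,
        key=lambda out: reverse_score(out, original_input),
        reverse=True,
    )

original_input = "book a table for two at seven"
candidates = ["book a table", "book a table for two at seven pm"]
reranked = rerank(candidates, original_input)
```

The candidate that drops "for two at seven" scores lower because the input can't be recovered from it, so the more complete candidate moves to the front.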

The Generative-Discriminative Gap.

## Future Work

A single comma in the input can shift the model’s behavior. What we need is a language to control language model behavior.

**The Ability to Control a Model Is the Goal of Understanding the Model**

We should only claim to understand a model when we can make a theory map about it: “when X is fed into the model, we get Y”.

### So:

- we should look at what the model is biased about (Surface Form Competition, for instance)
- we should get closer to priming behaviors so that completions mimic human behavior (in pieces, not just “complete these tokens”)
- we should treat success as the actual evaluation metric; we can compare machines against other machines to get the results

## Questions

Marcel Just

Anthropic AI papers

**Percy Liang**