ICLR2025 MoE
Talks
ICLR2025 Neitemeier: Hierarchical Autoregressive Transformers
“A Byte Level transformer, with some compression”
Key insight: put a [CLS] token in front of every word to train a small “tokenizer”, run a normal transformer on the [CLS] tokens, and then autoregressively decode out the single bytes.
Method
Hierarchical Autoregressive Transformers
We put a [CLS] in front of every word. So the input looks like:
[CLS] M y _ [CLS] n a m e _ [CLS] i s
We then run a small encoder over each word's byte sequence, take the encoded [CLS] embeddings, run the backbone transformer over that [CLS] sequence, and autoregressively decode the bytes of the next word with a small decoder.
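A minimal sketch of this pipeline in PyTorch, as I understood it; module sizes and names are my own guesses, not the paper's, and causal masks are omitted for brevity:

```python
import torch
import torch.nn as nn

# Minimal sketch of the hierarchy: a small byte-level encoder per word, a
# backbone transformer over the encoded [CLS] embeddings, and a small
# byte-level decoder per word. Sizes/names are illustrative only.

class WordEncoder(nn.Module):
    def __init__(self, d_model=256, n_layers=2):
        super().__init__()
        self.byte_emb = nn.Embedding(257, d_model)  # 256 byte values + [CLS] = id 256
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, word_bytes):
        # word_bytes: (batch, n_words, word_len) byte ids with [CLS] prepended to each word
        b, w, l = word_bytes.shape
        h = self.encoder(self.byte_emb(word_bytes.view(b * w, l)))
        return h[:, 0].view(b, w, -1)  # keep only the encoded [CLS] per word

class HAT(nn.Module):
    def __init__(self, d_model=256):
        super().__init__()
        self.word_encoder = WordEncoder(d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=6)
        dec_layer = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
        self.byte_decoder = nn.TransformerDecoder(dec_layer, num_layers=2)
        self.byte_emb = nn.Embedding(256, d_model)
        self.byte_head = nn.Linear(d_model, 256)

    def forward(self, word_bytes, next_word_bytes):
        cls_seq = self.word_encoder(word_bytes)   # (batch, n_words, d)
        ctx = self.backbone(cls_seq)              # contextualized [CLS] per word
        b, w, d = ctx.shape
        tgt = self.byte_emb(next_word_bytes.view(b * w, -1))
        mem = ctx.reshape(b * w, 1, d)            # each word's context conditions its own decoder
        out = self.byte_decoder(tgt, mem)         # decode the following word's bytes
        return self.byte_head(out)                # per-byte logits
```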
ICLR2025 Saturday Posters
ICLR2025 Cassidy: AssistanceZero
- Train a reward predictor so that rewards are also available at test time
- MCTS for planning
- Learn a policy by matching the MCTS root-node visit distribution with a KL loss (sketch below)
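A minimal sketch of the root-node matching step, AlphaZero-style, as I understood it; the names are placeholders, not the paper's code:

```python
import torch
import torch.nn.functional as F

# Sketch: the policy is trained to match the MCTS root node's visit
# distribution (a KL / cross-entropy objective), AlphaZero-style.
# `policy_logits` and `root_visit_counts` are placeholders.

def root_matching_loss(policy_logits, root_visit_counts):
    """Cross-entropy between normalized root visit counts and the policy
    (equal to the KL up to a constant)."""
    target = root_visit_counts / root_visit_counts.sum(dim=-1, keepdim=True)
    log_probs = F.log_softmax(policy_logits, dim=-1)
    return -(target * log_probs).sum(dim=-1).mean()

# Toy usage: batch of 4 states, 10 actions each
logits = torch.randn(4, 10)
visits = torch.randint(1, 50, (4, 10)).float()
loss = root_matching_loss(logits, visits)
```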
ICLR2025 Liu: synthesizing programmatic reinforcement learning policies with LLM-guided search
Hill climbing with partial mutations of LLM-generated programs (sketch below)
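Roughly the search loop I took away; `llm_mutate` and `evaluate_policy` are hypothetical placeholders, not the paper's API:

```python
import random

# Hill climbing over programmatic policies: keep the best program found so far
# and ask an LLM for a partial mutation of it each iteration.

def llm_mutate(program: str) -> str:
    """Placeholder: would prompt an LLM to rewrite one part of the program."""
    return program

def evaluate_policy(program: str) -> float:
    """Placeholder: would run the program as a policy and return its score."""
    return random.random()

def hill_climb(seed_program: str, iterations: int = 100) -> str:
    best, best_score = seed_program, evaluate_policy(seed_program)
    for _ in range(iterations):
        candidate = llm_mutate(best)       # partial mutation of the current best program
        score = evaluate_policy(candidate)
        if score > best_score:             # greedy accept
            best, best_score = candidate, score
    return best
```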
ICLR2025 Weller: Promptriever
??
ICLR2025 Yu: robust LLM safeguard via refusal feature adversarial training
With mechanistic interpretability, we can find a subspace correlated with refusal, and pull it up
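A sketch of how such a refusal direction is commonly extracted and manipulated (difference of means over residual-stream activations); the layer choice and normalization here are my assumptions, not necessarily the paper's exact recipe:

```python
import torch

# Sketch: estimate a refusal direction as the difference of mean activations
# between harmful and harmless prompts, then project it out (to simulate an
# attack) or add it back in ("pull it up").

def refusal_direction(harmful_acts: torch.Tensor, harmless_acts: torch.Tensor) -> torch.Tensor:
    """harmful_acts, harmless_acts: (n_prompts, d_model) residual-stream activations at some layer."""
    direction = harmful_acts.mean(dim=0) - harmless_acts.mean(dim=0)
    return direction / direction.norm()

def ablate(acts: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    """Remove the refusal component from activations (simulates a refusal-suppressing attack)."""
    return acts - (acts @ direction).unsqueeze(-1) * direction

def amplify(acts: torch.Tensor, direction: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    """Push activations along the refusal direction."""
    return acts + alpha * direction
```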
ICLR2025 Snell: Optimality of Scaling LLM Test-Time Compute
Compute-Optimal Scaling
Compute-optimal scaling is the notion of selecting the optimal test-time configuration (beam width, search budget, etc.) dynamically, per question-difficulty bin.
Approaches to “Scaling Test-Time Compute”
Three primary approaches:
- Best-of-n: roll out a bunch of full answers and keep the best under a verifier (sketch after this list)
- Beam search: check intermediate steps against a verifier and keep the top beams
- Lookahead search: MCTS-ish (do lookahead rollouts to score each step)
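A minimal best-of-n sketch, the simplest of the three; `generate` and `verifier_score` are hypothetical placeholders:

```python
# Best-of-n: sample n complete answers and keep the one the verifier scores highest.

def generate(question: str) -> str:
    """Placeholder: would sample one full answer from the LLM."""
    return "an answer"

def verifier_score(question: str, answer: str) -> float:
    """Placeholder: would score the answer with a reward model / verifier."""
    return 0.0

def best_of_n(question: str, n: int = 16) -> str:
    candidates = [generate(question) for _ in range(n)]
    return max(candidates, key=lambda a: verifier_score(question, a))
```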
Key insight
- On easy questions, beam search shows over-optimization and best-of-n is good
- On medium/hard questions, beam search is better
Lookahead seems bad?
ICLR2025 Thursday Morning Posters
ICLR2025 Hu: belief state transformer
Key insight: the residual stream at the last token can be thought of as a belief state encoding future tokens; that is, uncertainty in the last residual directly correlates with the diversity of the output.
Method: train a transformer and a reverse transformer (like what Robert wanted), then correlate the two.
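A toy sketch of the forward/backward pairing; using GRUs and a simple concatenation is my simplification, not the paper's architecture:

```python
import torch
import torch.nn as nn

# Sketch: a forward model encodes the prefix, a reverse model encodes the
# suffix read right-to-left, and a small head combines the two "belief states"
# to predict the token that follows the prefix.

class BeliefStateSketch(nn.Module):
    def __init__(self, vocab_size=256, d_model=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, d_model)
        self.forward_rnn = nn.GRU(d_model, d_model, batch_first=True)
        self.backward_rnn = nn.GRU(d_model, d_model, batch_first=True)
        self.head = nn.Linear(2 * d_model, vocab_size)

    def forward(self, prefix, suffix):
        # prefix: (batch, p_len) token ids; suffix: (batch, s_len) token ids
        _, fwd_state = self.forward_rnn(self.emb(prefix))           # belief about the future
        _, bwd_state = self.backward_rnn(self.emb(suffix.flip(1)))  # belief about the past, read backwards
        joint = torch.cat([fwd_state[-1], bwd_state[-1]], dim=-1)
        return self.head(joint)  # logits for the token right after the prefix
```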
ICLR2025 Lingam: diversity of thoughts
Key insight: Use iterative sampling to achieve higher diversity in self-reflection, in order to get better outputs.
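A sketch of what the iterative sampling could look like; `llm` is a hypothetical completion call, not the paper's API:

```python
# Iterative diverse sampling: each new "thought" is sampled with the previous
# ones in the prompt and an explicit instruction to propose something different.

def llm(prompt: str) -> str:
    """Placeholder for an LLM completion call."""
    return "a new approach"

def diverse_thoughts(question: str, k: int = 4) -> list[str]:
    thoughts: list[str] = []
    for _ in range(k):
        prior = "\n".join(f"- {t}" for t in thoughts)
        prompt = (f"{question}\nApproaches tried so far:\n{prior}\n"
                  "Propose a different approach:")
        thoughts.append(llm(prompt))
    return thoughts
```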