Certificates-Based Interpretation of NL

Last edited: August 8, 2025

A language $A$ is in $NL$ if there exists a deterministic Turing machine $V$ running in logspace such that $x \in A \Leftrightarrow \exists w \in \qty {0,1}^{\text{poly}\qty(|x|)}$ with $V \qty(x,w) = 1$ (if and only if!! same as $NP$). Here the real input $x$ is on input tape one, which is read-only, and the witness $w$ is on input tape two, which is read-once: if $w$ could be reread, the same definition would be equivalent to $NP$.
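To illustrate why read-once access to the witness is enough, consider directed reachability (PATH), a standard problem in $NL$. The following is a minimal sketch, not a formal Turing machine: it assumes the witness is a sequence of vertex names rather than raw bits, and `verify_path` and its parameters are hypothetical names chosen for this example. The verifier consumes the witness as a stream and remembers only the current vertex, which takes $O(\log n)$ bits.

```python
from typing import Iterable


def verify_path(edges: set, s: int, t: int, witness: Iterable[int]) -> bool:
    """Read-once verifier sketch for directed s-t reachability.

    `edges` plays the role of the read-only input tape; the membership
    test below stands in for a scan of that tape.
    """
    current = s  # the only working state: one vertex name, O(log n) bits
    for nxt in iter(witness):  # each witness symbol is read exactly once
        if (current, nxt) not in edges:
            return False  # the claimed step is not an edge of the graph
        current = nxt
    return current == t  # accept only if the walk ends at the target


edges = {(0, 1), (1, 2), (2, 3)}
print(verify_path(edges, 0, 3, [1, 2, 3]))
print(verify_path(edges, 0, 3, [2, 3]))
```

The verifier never rewinds the witness, so it cannot cross-check one part of $w$ against another; that restriction is what keeps the class at $NL$ rather than $NP$.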

Chain of Thought

Last edited: August 8, 2025

Challenges of Language Model Agents

Last edited: August 8, 2025

Challenge of Making Agents

Agents are not very new (Riedl and Amant 2002), but newer agents can be powered by LLMs/VLMs, meaning we are using language for reasoning and communication.

Sequentiality is hard

what is the context/motivation?
how do you transfer across contexts?
how do you plan?

Evaluation

Different from previous NLP benchmarks: we are not worried about language modeling
No longer boundaries between various fields

Common goals:

realistic agents—stop playing Atari games.
reproducible systems
measurability goals
scalable models
which are easy to use

Web as an Interactive Environment

agents on the web are both practical and scalable
https://webshop-pnlp.github.io/
agents trained on WebShop can actually transfer to Amazon with no additional work
Mind2Web

InterCode

Formulates agent decisions as a POMDP in order to fully benchmark Markovian decisions:
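As a rough illustration of the POMDP framing (not InterCode's actual interface), the sketch below uses a hypothetical toy environment, `ToyEnv`, whose hidden state is a secret integer. The agent never reads that state directly; it acts only on the observations returned by `step`. All names here are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEnv:
    """Hypothetical environment: the hidden state is a secret integer."""
    secret: int
    max_steps: int = 10
    steps: int = field(default=0, init=False)

    def step(self, action: int):
        """Apply an action and return (observation, reward, done)."""
        self.steps += 1
        if action == self.secret:
            return "correct", 1.0, True
        obs = "higher" if action < self.secret else "lower"
        return obs, 0.0, self.steps >= self.max_steps


def binary_search_agent(env: ToyEnv, lo: int, hi: int):
    """Agent that decides from observations only, never from env.secret."""
    history = []
    reward, done = 0.0, False
    while not done and lo <= hi:
        guess = (lo + hi) // 2
        obs, reward, done = env.step(guess)
        history.append((guess, obs))  # the trajectory a benchmark would score
        if obs == "higher":
            lo = guess + 1
        elif obs == "lower":
            hi = guess - 1
    return reward, history


reward, history = binary_search_agent(ToyEnv(secret=5), 0, 15)
print(reward, history)
```

The design point is that the agent only ever sees an observation, not the state, so a benchmark can score the whole action and observation trajectory rather than a single output.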

changes to central dogma

Last edited: August 8, 2025

80% of the human genome is actually transcribed
very little “junk DNA”
40% of lncRNAs are gene specific

char

Last edited: August 8, 2025

char is a character that represents a glyph: