Chain of Thought
Last edited: August 8, 2025
Challenges of Language Model Agents
Last edited: August 8, 2025
Challenge of Making Agents
Agents are not new (Riedl and Amant 2002), but newer agents can be powered by LLMs/VLMs, meaning we are using language for reasoning and communication.
Sequentiality is hard
- what is the context/motivation?
- how do you transfer across contexts?
- how do you plan?
Evaluation
- Different from previous NLP benchmarks: we are not worried about language modeling
- There are no longer boundaries between various fields
Common goals:
- realistic agents—stop playing Atari games.
- reproducible systems
- measurable goals
- scalable models
- which are easy to use
Web as an Interactive Environment
- agents on the web are both practical and scalable
- https://webshop-pnlp.github.io/
- agents trained on WebShop can transfer to Amazon with no extra work
- Mind2Web
InterCode
Formulation of agent decisions as a POMDP in order to fully benchmark Markovian decisions:
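To make the POMDP framing concrete, here is a minimal sketch of an agent-environment loop in which the agent never sees the hidden state and must act from its observation history. Everything here (`ToyEnv`, `binary_search_policy`, `run_episode`) is a hypothetical toy for illustration; it is not InterCode's actual interface.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ToyEnv:
    """Hypothetical environment: the hidden state is a secret integer."""
    secret: int          # hidden state, never shown to the agent
    max_steps: int = 10  # episode ends after this many actions
    steps: int = 0

    def step(self, action: int) -> Tuple[str, float, bool]:
        """Apply an action; return (observation, reward, done)."""
        self.steps += 1
        if action == self.secret:
            return "correct", 1.0, True
        obs = "higher" if action < self.secret else "lower"
        return obs, 0.0, self.steps >= self.max_steps


def binary_search_policy(history: List[Tuple[int, str]],
                         lo: int = 0, hi: int = 100) -> int:
    """Choose the next action from the (action, observation) history only."""
    for action, obs in history:
        if obs == "higher":
            lo = max(lo, action + 1)
        elif obs == "lower":
            hi = min(hi, action - 1)
    return (lo + hi) // 2


def run_episode(env: ToyEnv, policy) -> Tuple[float, int]:
    """Roll out one episode; return (total reward, number of steps)."""
    history: List[Tuple[int, str]] = []
    total, done = 0.0, False
    while not done:
        action = policy(history)
        obs, reward, done = env.step(action)
        history.append((action, obs))
        total += reward
    return total, len(history)
```

Calling `run_episode(ToyEnv(secret=37), binary_search_policy)` rolls out one episode. The point of the design is that the policy's only input is the history of partial observations, which is what distinguishes a POMDP from a fully observed MDP.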
changes to central dogma
Last edited: August 8, 2025
- 80% of the human genome is actually transcribed
- very little “junk DNA”
- 40% of lncRNAs are gene specific
characteristic polynomial
Last edited: August 8, 2025
The polynomial given by the determinant:
\begin{equation} \det(A-\lambda I) \end{equation}
for some linear map \(A\). The roots \(\lambda\) of this polynomial are the eigenvalues of \(A\). This is because \(\lambda\) is an eigenvalue IFF \((A-\lambda I)v = 0\) for some nonzero \(v\), so we need \((A-\lambda I)\) to be singular, i.e. \(\det(A-\lambda I) = 0\).
The characteristic polynomial of a 2x2 matrix \(A\) is given by \(\lambda^{2}-\operatorname{tr}(A)\lambda + \det(A)\).
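As a quick check of the 2x2 formula, the sketch below builds the coefficients from the trace and determinant, solves the quadratic, and confirms that each root makes \(A-\lambda I\) singular. The function names are illustrative, and the matrix is assumed to be a plain nested list `[[a, b], [c, d]]`.

```python
import cmath


def char_poly_2x2(A):
    """Coefficients (1, -tr(A), det(A)) of lambda^2 - tr(A) lambda + det(A)."""
    (a, b), (c, d) = A
    trace = a + d
    det = a * d - b * c
    return (1.0, -trace, det)


def eigenvalues_2x2(A):
    """Roots of the characteristic polynomial via the quadratic formula."""
    _, p, q = char_poly_2x2(A)
    disc = cmath.sqrt(p * p - 4 * q)  # cmath handles complex eigenvalues
    return ((-p + disc) / 2, (-p - disc) / 2)


def det_shifted(A, lam):
    """det(A - lam * I), computed directly from the definition."""
    (a, b), (c, d) = A
    return (a - lam) * (d - lam) - b * c


A = [[2, 1], [1, 2]]
roots = eigenvalues_2x2(A)
# Each root should make A - lambda I singular, so the determinant vanishes.
residuals = [abs(det_shifted(A, lam)) for lam in roots]
```

For this `A` the trace is 4 and the determinant is 3, so the polynomial is \(\lambda^{2} - 4\lambda + 3\), with roots 3 and 1.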