ragdoll scratchpad

\begin{align} \tau = s_1 \dots s_{n} \end{align}

\begin{equation} a \sim \pi_{\text{lm}}\qty(\tau) \end{equation}

\begin{equation} a_{i}, \rho_{i} \sim \pi_{\text{lm}}\qty(\tau \mid \rho_{i-1} \dots \rho_{1}) \end{equation}
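
A minimal sketch of the recurrence above, assuming \(\pi_{\text{lm}}\) can be treated as a function that takes \(\tau\) plus all earlier \(\rho\) and returns an action together with a new \(\rho\). The names `Policy`, `rollout`, and `toy_policy` are hypothetical; `toy_policy` is a deterministic stand-in for a real language model call, used only to make the loop inspectable.

```python
from typing import Callable, List, Tuple

# Hypothetical signature for pi_lm: (tau, previous rhos) -> (action, new rho).
Policy = Callable[[List[str], List[str]], Tuple[str, str]]

def rollout(policy: Policy, tau: List[str], steps: int) -> Tuple[List[str], List[str]]:
    """Draw a_i, rho_i from policy(tau | rho_{i-1} ... rho_1) for i = 1..steps."""
    actions: List[str] = []
    rhos: List[str] = []
    for _ in range(steps):
        a, rho = policy(tau, rhos)  # condition on every rho produced so far
        actions.append(a)
        rhos.append(rho)
    return actions, rhos

def toy_policy(tau: List[str], rhos: List[str]) -> Tuple[str, str]:
    """Stand-in for a language model call (assumption, not a real model)."""
    i = len(rhos) + 1
    action = f"a{i}"
    rho = f"chose a{i} after seeing {len(tau)} states and {len(rhos)} earlier notes"
    return action, rho

actions, rhos = rollout(toy_policy, ["s1", "s2", "s3"], steps=2)
```

Here each call still receives the full \(\tau\); only the \(\rho\) history grows between steps.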

“action selection”


Tracking in \(\rho\)

  1. Things that can go into \(\rho\): an explanation of “why did we do this” for each action
  2. Running summary of the entire \(\tau\) <- good chance this avoids having to condition on the entire \(\tau\)
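
One way to read item 2, sketched under the assumption that \(\rho\) is a single running summary updated per state, so each step sees only the newest state and the previous summary rather than all of \(\tau\). `Step`, `rollout_with_summary`, and `toy_step` are hypothetical names, and the toy "summary" is just a comma-joined list of seen states, not a real summarizer.

```python
from typing import Callable, List, Tuple

# Hypothetical step function: (newest state, previous summary) -> (action, updated summary).
Step = Callable[[str, str], Tuple[str, str]]

def rollout_with_summary(step: Step, states: List[str]) -> Tuple[List[str], str]:
    """Process states one at a time, carrying only a running summary rho."""
    rho = ""  # empty summary before any state is seen
    actions: List[str] = []
    for s in states:
        a, rho = step(s, rho)  # the full history is never passed in
        actions.append(a)
    return actions, rho

def toy_step(state: str, rho: str) -> Tuple[str, str]:
    """Stand-in for a language model call (assumption, not a real model)."""
    new_rho = f"{rho},{state}" if rho else state
    return f"act_on_{state}", new_rho

actions, rho = rollout_with_summary(toy_step, ["s1", "s2", "s3"])
```

Whether this works in practice depends on how much of \(\tau\) the summary actually preserves, which the toy version does not test.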

why don’t se


Threads

  • embeddings being bad seems weird