G-DICE

Last edited: August 8, 2025

Motivation

It’s the same. It hasn’t changed: the curses of dimensionality and history.

Goal: to solve decentralized multi-agent MDPs.

Key Insights

macro-actions (MAs) to reduce computational complexity (like hierarchical planning)
uses cross entropy to make the infinite-horizon problem tractable

Prior Approaches

masked Monte Carlo search (MMCS): heuristic-based, no optimality guarantees
MCTS: poor performance

Direct Cross Entropy

G-DICE

create a graph with an exogenously chosen number of nodes \(N\) and \(O\) outgoing edges (designed beforehand)
use Direct Cross Entropy to solve for the best policy
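
A minimal sketch of the generic cross-entropy loop, to make the "sample, keep the elites, refit" idea concrete. This is not G-DICE itself: the "slots" stand in loosely for per-node choices in the graph, and `cross_entropy_search`, `score`, the toy target, and every hyperparameter here are hypothetical names and values chosen for illustration.

```python
import random


def cross_entropy_search(score, n_slots, n_choices,
                         iters=30, samples=50, n_elite=10, lr=0.7, seed=0):
    """Generic cross-entropy search over n_slots discrete choices.

    score: hypothetical function mapping a list of choices to a number
    (higher is better). In a planning setting it would be a policy
    evaluation; here it is whatever the caller passes in.
    """
    rng = random.Random(seed)
    # One categorical distribution per slot, initially uniform.
    probs = [[1.0 / n_choices] * n_choices for _ in range(n_slots)]
    best, best_score = None, float("-inf")
    for _ in range(iters):
        # Sample candidate assignments from the current distributions.
        candidates = [
            [rng.choices(range(n_choices), weights=p)[0] for p in probs]
            for _ in range(samples)
        ]
        # Rank candidates and keep the top n_elite as the elite set.
        ranked = sorted(candidates, key=score, reverse=True)
        elite = ranked[:n_elite]
        top_score = score(ranked[0])
        if top_score > best_score:
            best, best_score = ranked[0], top_score
        # Move each distribution toward the elite frequencies (smoothed by lr).
        for i in range(n_slots):
            for c in range(n_choices):
                freq = sum(1 for cand in elite if cand[i] == c) / n_elite
                probs[i][c] = (1 - lr) * probs[i][c] + lr * freq
    return best, best_score, probs


if __name__ == "__main__":
    # Toy objective (assumption): count slots that match a hidden target.
    target = [2, 0, 1, 3, 1]
    best, value, _ = cross_entropy_search(
        lambda cand: sum(a == b for a, b in zip(cand, target)),
        n_slots=len(target), n_choices=4)
    print(best, value)
```

The smoothing factor `lr` keeps every choice at nonzero probability, so the search does not collapse onto an early elite set after one round.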

Results

demonstrates improved performance over MMCS and MCTS
does not need robot communication
guarantees convergence for both finite and infinite horizons
the number of nodes can be chosen exogenously to gain computational savings

Galactica

Last edited: August 8, 2025

Galactica is a large language model for generating research papers, made by Meta AI.

Galton Board

Last edited: August 8, 2025

One of these things. It is actually a binomial distribution.
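
A small simulation sketch of the binomial claim, assuming every peg sends the ball right with probability 1/2 independently of the others (an idealization, not something the note states). The names `galton_board` and `binomial_pmf` are made up for this example.

```python
import random
from collections import Counter
from math import comb


def galton_board(rows, balls, seed=0):
    """Drop `balls` balls through `rows` rows of pegs; return the bin counts."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(balls):
        # The final bin index is the number of rightward bounces.
        bin_index = sum(rng.random() < 0.5 for _ in range(rows))
        counts[bin_index] += 1
    return [counts[k] for k in range(rows + 1)]


def binomial_pmf(rows, k, p=0.5):
    """Probability of exactly k rightward bounces out of `rows`."""
    return comb(rows, k) * p ** k * (1 - p) ** (rows - k)


if __name__ == "__main__":
    rows, balls = 10, 20000
    counts = galton_board(rows, balls)
    for k, count in enumerate(counts):
        # Empirical frequency next to the binomial prediction.
        print(k, round(count / balls, 4), round(binomial_pmf(rows, k), 4))
```

With enough balls, the empirical frequency in each bin should sit close to the binomial probability for that bin.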

You can phrase the probability at

GAMMA

Last edited: August 8, 2025

Past Work

self-play: this is a \(\text{coNP}\) vs \(\text{NP}\) problem: whereas competitive self-play attempts to defend against all strategies, collaborative self-play only needs to find one useful strategy; this doesn’t generalize well to humans, because the agent never had a human as its partner
behavior cloning:
Population Based Training: computational super e

Novelty

instead, learn a generative model from simulated agents, human data, or both
then, sample from this generative model

Notable Methods

Key Figs

New Concepts

Notes

GARCH

Last edited: August 8, 2025

The GARCH model is a model for heteroskedastic variation in which the changes in variance are assumed to be autocorrelated: that is, though the variance changes, it changes in a predictable manner.

It is especially useful to

GARCH(1,1)

Conditional mean:

\begin{equation} y_{t} = x'_{t} \theta + \epsilon_{t} \end{equation}

Then, the error term:

\begin{equation} \epsilon_{t} = \sigma_{t}z_{t} \end{equation}

where:

\begin{equation} z_{t} \sim \mathcal{N}(0,1) \end{equation}

and the conditional variance:

\begin{equation} {\sigma_{t}}^{2} = \omega + \lambda {\epsilon_{t-1}}^{2} + \beta {\sigma_{t-1}}^{2} \end{equation}
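
A minimal simulation sketch of the error term, assuming the standard GARCH(1,1) recursion in which the \(\lambda\) term multiplies the squared previous error \({\epsilon_{t-1}}^{2}\) and the \(\beta\) term multiplies the previous variance. The function name `simulate_garch11`, the parameter values, and the choice to start at the unconditional variance \(\omega / (1 - \lambda - \beta)\) are assumptions for illustration; the conditional mean \(x'_{t}\theta\) is left out.

```python
import math
import random


def simulate_garch11(n, omega, lam, beta, seed=0):
    """Simulate n error terms eps_t = sigma_t * z_t with z_t ~ N(0, 1)."""
    if lam + beta >= 1:
        raise ValueError("need lam + beta < 1 for a finite unconditional variance")
    rng = random.Random(seed)
    # Start the recursion at the unconditional variance (an assumption).
    sigma2 = omega / (1.0 - lam - beta)
    eps, variances = [], []
    for t in range(n):
        if t > 0:
            # sigma_t^2 = omega + lam * eps_{t-1}^2 + beta * sigma_{t-1}^2
            sigma2 = omega + lam * eps[-1] ** 2 + beta * sigma2
        z = rng.gauss(0.0, 1.0)
        eps.append(math.sqrt(sigma2) * z)
        variances.append(sigma2)
    return eps, variances


if __name__ == "__main__":
    eps, variances = simulate_garch11(n=1000, omega=0.1, lam=0.1, beta=0.8)
    print(min(variances), max(variances))
```

Large shocks raise the next period's variance through the \(\lambda\) term, and the \(\beta\) term makes that elevated variance decay gradually, which is the "predictable" change in variance described above.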