Main problem: the number of joint actions and joint observations grows exponentially in the number of agents.
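
To make the blow-up concrete, here is a minimal sketch that counts and enumerates joint actions. The helper names `joint_space_size` and `joint_actions` are hypothetical, and the sketch assumes every agent has the same number of individual actions.

```python
from itertools import product


def joint_space_size(per_agent_count: int, n_agents: int) -> int:
    """Number of joint actions (or joint observations) when each of
    n_agents has per_agent_count individual choices."""
    return per_agent_count ** n_agents


def joint_actions(per_agent_actions, n_agents):
    """Enumerate every joint action as a tuple with one entry per agent."""
    return list(product(per_agent_actions, repeat=n_agents))


if __name__ == "__main__":
    # With 5 individual actions per agent, the joint space grows as 5**n.
    for n in (1, 2, 4, 8):
        print(n, joint_space_size(5, n))
```

The same count applies to joint observations, so a search tree that branches on both grows exponentially along two dimensions at once.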

Solution: **Sample-based online planning** for multiagent systems, done with factored-value POMCP.

- **Factored statistics**: reduces the number of joint actions (through action selection statistics)
- **Factored trees**: reduces the number of histories

## Multiagent Definition

- \(I\) set of agents
- \(S\) set of states
- \(A_{i}\) set of actions for each agent \(i\)
- \(T\) state transitions
- \(R\) reward function
- \(Z_{i}\) set of observations for each agent \(i\)
- \(O\) observation probabilities

## Coordination Graphs

You can use sum-product (variable) elimination to simplify the Bayesian network induced by the agents' coordination graph, which encodes how agents influence each other.
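
As an illustration of elimination on a coordination graph, here is a minimal sketch. It swaps in the max-sum form of variable elimination (maximizing where sum-product would sum), because the goal in action selection is the best joint value. The function `eliminate_max`, the factor tables, and the elimination order are all assumptions made for the example, not taken from the source.

```python
from itertools import product


def eliminate_max(factors, order, domains):
    """Maximize a sum of local payoff factors by eliminating agents one by one.

    factors: list of (scope, table) pairs; scope is a tuple of agent ids and
             table maps an action tuple (in scope order) to a payoff.
    order:   elimination order covering every agent.
    domains: dict mapping agent id -> list of individual actions.
    """
    factors = list(factors)
    for var in order:
        involved = [f for f in factors if var in f[0]]
        factors = [f for f in factors if var not in f[0]]
        # The new factor is defined over the neighbours of the eliminated agent.
        scope = tuple(sorted({v for s, _ in involved for v in s if v != var}))
        table = {}
        for assignment in product(*(domains[v] for v in scope)):
            ctx = dict(zip(scope, assignment))
            # Best response of `var` to this assignment of its neighbours.
            table[assignment] = max(
                sum(t[tuple({**ctx, var: a}[v] for v in s)] for s, t in involved)
                for a in domains[var]
            )
        factors.append((scope, table))
    # Every remaining factor has an empty scope; their sum is the optimum.
    return sum(t[()] for _, t in factors)


# Hypothetical chain graph 0 - 1 - 2 with binary actions.
domains = {0: [0, 1], 1: [0, 1], 2: [0, 1]}
q01 = ((0, 1), {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 2})
q12 = ((1, 2), {(0, 0): 3, (0, 1): 0, (1, 0): 0, (1, 1): 1})

if __name__ == "__main__":
    print(eliminate_max([q01, q12], order=[0, 2, 1], domains=domains))
```

Each elimination step only enumerates the actions of one agent and its neighbours, so the cost depends on the graph's connectivity rather than on the full joint action space.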

## Mixture of Experts

Directly search for the best joint action, computed from the maximum-likelihood estimate (MLE) of the total value.
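
A minimal sketch of this idea, assuming each expert is a local value table over a subset of agents and the total value is a weighted sum of the experts' estimates. The names `mixture_value` and `best_joint_action`, the tables, and the uniform weights are hypothetical. The exhaustive search is for illustration only and is still exponential in the number of agents.

```python
from itertools import product


def mixture_value(joint_action, experts, weights):
    """Weighted sum of the experts' estimates for one joint action.

    experts: list of (scope, table) pairs; table maps the actions of the
             agents in scope (in scope order) to an estimated value.
    """
    return sum(
        w * table[tuple(joint_action[i] for i in scope)]
        for (scope, table), w in zip(experts, weights)
    )


def best_joint_action(experts, weights, domains):
    """Search all joint actions for the one with the highest mixture value.

    domains: list where domains[i] holds the individual actions of agent i.
    """
    return max(
        product(*domains),
        key=lambda a: mixture_value(a, experts, weights),
    )


# Hypothetical experts over agent pairs (0, 1) and (1, 2), binary actions.
experts = [
    ((0, 1), {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 2}),
    ((1, 2), {(0, 0): 3, (0, 1): 0, (1, 0): 0, (1, 1): 1}),
]
weights = [0.5, 0.5]
domains = [[0, 1], [0, 1], [0, 1]]

if __name__ == "__main__":
    print(best_joint_action(experts, weights, domains))
```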