Houjun Liu

IS-DESPOT

Motivation

In large-crowd navigation with sudden changes, rare but critical events are unlikely to appear among scenarios sampled by likelihood. So, we want to sample from another distribution based on importance rather than likelihood.

Goals

DESPOT with Importance Sampling

  1. take our initial belief
  2. sample trajectories according to the importance sampling distribution
  3. calculate the values of those sampled trajectories
  4. obtain a value estimate as the importance-weighted average of those values
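
The steps above can be sketched on a toy problem. This is a minimal illustration of importance-sampled value estimation, not the IS-DESPOT tree search itself: the two outcomes, their probabilities `P`, the importance distribution `Q`, and the values in `VALUE` are all hypothetical numbers chosen for the example.

```python
import random

# Hypothetical toy model: two trajectory outcomes under a fixed policy.
P = {"normal": 0.99, "collision": 0.01}       # true probabilities p (assumed)
Q = {"normal": 0.50, "collision": 0.50}       # importance distribution q (assumed)
VALUE = {"normal": 1.0, "collision": -100.0}  # value of each outcome (assumed)


def sample_from(dist, rng):
    """Draw one outcome from a discrete distribution given as a dict."""
    outcomes = list(dist)
    weights = [dist[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


def is_estimate(n, rng):
    """Estimate E_p[value] from n samples drawn from q, reweighted by p/q."""
    total = 0.0
    for _ in range(n):
        xi = sample_from(Q, rng)   # step 2: sample from q, not p
        w = P[xi] / Q[xi]          # importance weight corrects for the mismatch
        total += w * VALUE[xi]     # steps 3 and 4: weight the sampled value
    return total / n


def exact_value():
    """Exact expectation under p, for comparison with the estimate."""
    return sum(P[o] * VALUE[o] for o in P)


if __name__ == "__main__":
    rng = random.Random(0)
    print("exact:", exact_value())
    print("importance-sampled:", is_estimate(10_000, rng))
```

Sampling from `Q` makes the rare collision show up in about half the samples, while the weight `P[xi] / Q[xi]` keeps the estimate unbiased for the true expectation.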

Importance Sampling of trajectories

We define an importance distribution of some trajectory \(\xi\):

\begin{equation} q(\xi | b,\pi) = q(s_0) \prod_{t=0}^{D} q(s_{t+1}, o_{t+1} | s_{t}, a_{t+1}) \end{equation}
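
For reference, the standard importance sampling identity applied to trajectories gives the weight and value estimate below. The symbols \(p(\xi | b,\pi)\) (the true trajectory distribution), \(V_{\xi}\) (the total discounted reward of \(\xi\)), and \(N\) (the number of sampled trajectories) are assumptions introduced here; they are not defined in the notes above.

\begin{equation} w(\xi) = \frac{p(\xi | b,\pi)}{q(\xi | b,\pi)}, \quad \hat{V}_{\pi}(b) = \frac{1}{N} \sum_{i=1}^{N} w(\xi_{i}) V_{\xi_{i}}, \quad \xi_{i} \sim q(\xi | b,\pi) \end{equation}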

Background

Importance Sampling