Houjun Liu

Bouton 2018

(Bouton et al. 2018)


Extends the single-user avoidance POMDP formulation presented in (Bouton, Cosgun, and Kochenderfer 2017) to multiple road users


Uses the Single-User Model of Road Navigation to extend the general POMDP formulation to multi-pedestrian/multi-road-user cases

Previous Work

Always assume the worst-case scenario: set an upper bound and always plan against it. This can cause gridlock if the situation never resolves.

Notable Methods

Uses QMDP and SARSOP to perform optimization
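As a rough illustration of the QMDP part only: QMDP scores an action under a belief by averaging the fully observable Q-values, \(U(b,a) = \sum_{s} b(s) Q(s,a)\). The sketch below assumes a small discrete state space and an already-computed Q table; the names `qmdp_utilities`, `qmdp_action`, and `q_values` are hypothetical and not taken from the paper.

```python
def qmdp_utilities(belief, q_values):
    """Return U(b, a) = sum_s b(s) * Q(s, a) for every action a.

    belief[s]      -- probability of state s (categorical belief)
    q_values[s][a] -- Q-value of action a in state s, assumed precomputed
    """
    n_actions = len(q_values[0])
    return [
        sum(b * q_row[a] for b, q_row in zip(belief, q_values))
        for a in range(n_actions)
    ]


def qmdp_action(belief, q_values):
    """Pick the index of the action with the highest QMDP utility."""
    utilities = qmdp_utilities(belief, q_values)
    return max(range(len(utilities)), key=utilities.__getitem__)
```

For example, `qmdp_action([0.5, 0.5], [[1.0, 0.0], [0.0, 3.0]])` compares the two belief-weighted utilities and returns the index of the larger one.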

Single-User Model of Road Navigation

See Single-User Model of Road Navigation

Scaling to multiple road users

  • make an aggregate utility that is a function of all the single-user avoidance utilities (i.e. the aggregate utility for multiple road users is built from the utility of avoiding each individual user): \(U^{*}(b,a) = f(U^{*}(b_1, a), \dots, U^{*}(b_{n}, a))\). This is called utility fusion
  • two possible approaches: take either the minimum of all the utilities or their sum; the former is more risk averse (we want to hit no one), and the latter treats each user as independent.
  • further, the number of users on the road is modeled by a belief
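The two fusion choices above can be sketched as follows. This is a minimal illustration, assuming each road user already has a list of per-action utilities \(U^{*}(b_i, a)\); the function names `fuse_utilities` and `best_action` are hypothetical.

```python
def fuse_utilities(per_user_utilities, mode="min"):
    """Fuse single-user utilities into one utility per action.

    per_user_utilities[i][a] -- utility of action a with respect to user i
    mode                     -- "min" (risk averse) or "sum" (independent users)
    """
    if not per_user_utilities:
        raise ValueError("need at least one road user")
    if mode == "min":
        combine = min
    elif mode == "sum":
        combine = sum
    else:
        raise ValueError(f"unknown fusion mode: {mode}")
    # zip(*...) groups the utilities of the same action across all users
    return [combine(column) for column in zip(*per_user_utilities)]


def best_action(per_user_utilities, mode="min"):
    """Return the index of the action with the highest fused utility."""
    fused = fuse_utilities(per_user_utilities, mode)
    return max(range(len(fused)), key=fused.__getitem__)
```

With `[[2.0, 0.5, 0.0], [-1.0, 0.4, 0.0]]`, the two modes can disagree: the minimum penalizes the first action because one user makes it costly, while the sum lets the other user's high utility compensate.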


“the evaluation models are different to find the optimal policy, and are also higher fidelity”

We want to evaluate our POMDP policy on a higher-fidelity model to check whether the system generalizes to harder environments.

Baselines: random actions, or a hand-crafted rule-based policy.

Key Figs

New Concepts

Single-User Model of Road Navigation

POMDP formulation; we only care about one road user

  1. action: a finite set of changes in acceleration: -4 m/s², -2 m/s², 0 m/s², 2 m/s², 4 m/s²
  2. states and transitions: poses (position + velocity) of the car and the road user; positions and velocities are discretized
  3. observation: measured position and velocity of the one other road user, with measurement noise of \(\pm 1\) meter for crosswalks and \(\pm 2\) meters for intersections
    • users in a non-occluded area are always detected
    • users in an occluded area are not detected
    • position and velocity of road users are uncertain to within \(\pm 1\) meter and \(\pm 1\) meter/second
  4. belief: categorical distribution over states
  5. dynamics: physics + kinematics for car; pedestrians have stochastic velocity
  6. reward: unit reward for final position, tuned penalty for collision
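The observation rules in item 3 can be sketched as a small sampling function. This is only an illustration under stated assumptions: the notes give the \(\pm 1\) meter and \(\pm 1\) meter/second bounds but not the noise distribution, so uniform noise is assumed here, and the name `observe` and its parameters are hypothetical.

```python
import random


def observe(position, velocity, occluded, rng, pos_noise=1.0, vel_noise=1.0):
    """Return a noisy (position, velocity) measurement, or None if occluded.

    Assumption: noise is uniform within +/- pos_noise and +/- vel_noise;
    the source only states the bounds, not the distribution.
    """
    if occluded:
        # Users in an occluded area are never detected
        return None
    # Users in a non-occluded area are always detected, with bounded noise
    return (
        position + rng.uniform(-pos_noise, pos_noise),
        velocity + rng.uniform(-vel_noise, vel_noise),
    )
```

Passing a seeded generator such as `random.Random(0)` keeps the sampled measurements reproducible; setting `pos_noise=2.0` would correspond to the intersection case.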