
How Did Economists Get It So Wrong?

Last edited: August 8, 2025

A reading: (Krugman 2009)

Reflection

The discussion here of the conflict between “saltwater” and “freshwater” (Keynesian and neoclassical) economists is especially interesting when viewed in light of the recession that recently seemed imminent.

One statement in the essay that particularly resonated with me was that a crisis simply “pushed the freshwater economists into further absurdity.” It is interesting to see that, once a theory has become well established and insulated within a community, it becomes much harder to single it out as something that could be wrong.

hsbi

Last edited: August 8, 2025

HSVI

Last edited: August 8, 2025

Improving PBVI without sacrificing quality.

Initialization

We first initialize HSVI with a set of alpha vectors \(\Gamma\), representing the lower bound, and a set of belief-value pairs \((b, U(b))\) named \(\Upsilon\), representing the upper bound. We call the value functions they induce \(\underline{V}\) and \(\bar{V}\), respectively.

Lower Bound

The lower bound is initialized as a set of alpha vectors: the best-action worst-state bound (HSVI1) or the blind-policy lower bound (HSVI2).

Calculating \(\underline{V}(b)\)

\begin{equation} \underline{V}_{\Gamma}(b) = \max_{\alpha \in \Gamma} \alpha^{\top}b \end{equation}
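A minimal sketch of this computation, assuming beliefs and alpha vectors are stored as NumPy arrays (variable names here are illustrative, not from the notes):

#+begin_src python
import numpy as np

def lower_bound(b, Gamma):
    """Lower bound at belief b: the best dot product over the alpha vectors in Gamma.

    b     : belief, shape (|S|,)
    Gamma : alpha vectors, shape (num_vectors, |S|)
    """
    return float(np.max(Gamma @ b))

# Example with made-up numbers: two alpha vectors over a 3-state POMDP.
Gamma = np.array([[1.0, 0.0, 2.0],
                  [0.5, 1.5, 0.5]])
b = np.array([0.2, 0.5, 0.3])
print(lower_bound(b, Gamma))   # 1.0, from the second vector: 0.1 + 0.75 + 0.15
#+end_src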

Upper Bound

Fast Informed Bound

  • Initialize by solving the fully observable MDP (or with the Fast Informed Bound; see the update after this list)
  • Project \(b\) onto the point set
  • Project the upper bound onto the convex hull of the points (HSVI2: via an approximate convex-hull projection)
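For reference, the Fast Informed Bound update itself is not spelled out in these notes; the standard per-action backup (with \(R\), \(P\), \(O\), and \(\gamma\) denoting the reward, transition, observation, and discount terms of the POMDP) is:

\begin{equation} \alpha^{a}_{k+1}(s) = R(s,a) + \gamma \sum_{o} \max_{a'} \sum_{s'} P(s'|s,a)\, O(o|s',a)\, \alpha^{a'}_{k}(s') \end{equation}

The resulting bound \(\max_{a} \alpha^{a} \cdot b\) is what seeds the upper-bound values at the corner beliefs.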

Calculating \(\bar{V}(b)\)

Recall that while the lower bound is given by alpha vectors, the upper bound is given as a set of belief-value pairs \((b, U(b)) \in \Upsilon\). Evaluating \(\bar{V}\) at an arbitrary belief therefore requires interpolating between these points.
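A minimal sketch of the approximate convex-hull (“sawtooth”) interpolation, assuming \(\Upsilon\) is split into the corner beliefs \(e_s\) (with values corner_vals) and the remaining interior points; all names are hypothetical, and this is not HSVI2's actual implementation:

#+begin_src python
import numpy as np

def sawtooth_upper_bound(b, corner_vals, points):
    """Approximate (sawtooth) projection of belief b onto the upper-bound point set.

    b           : query belief, shape (|S|,)
    corner_vals : upper-bound values at the corner beliefs e_s, shape (|S|,)
    points      : list of (b_i, v_i) pairs for the non-corner beliefs in Upsilon
    """
    v0 = float(b @ corner_vals)                  # interpolation using corners only
    best = v0
    for b_i, v_i in points:
        mask = b_i > 0
        ratio = np.min(b[mask] / b_i[mask])      # largest c with c * b_i <= b componentwise
        f_i = v0 + ratio * (v_i - float(b_i @ corner_vals))
        best = min(best, f_i)
    return best
#+end_src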

Hungarian Method

Last edited: August 8, 2025

HybPlan

Last edited: August 8, 2025

“Can we come up with a policy that, even if it is not fast, at least reaches the goal?”

Background

Stochastic Shortest-Path

We start at an initial state, we have a set of goal states, and we want to reach one of the goal states.

We can solve this as follows:

Problem

MDP + Goal States

  • \(S\): set of states
  • \(A\): set of actions
  • \(P(s'|s,a)\): transition function
  • \(C(s,a)\): cost function
  • \(G \subseteq S\): set of absorbing goal states
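For reference (the standard SSP optimality equation in the notation above; it is not stated in the original note), the optimal cost-to-go satisfies

\begin{equation} V^{*}(s) = \min_{a \in A}\left[ C(s,a) + \sum_{s'} P(s'|s,a)\, V^{*}(s') \right], \qquad V^{*}(g) = 0 \ \ \forall g \in G \end{equation}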

Approach

Combining LRTDP with anytime dynamics
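For context, LRTDP is built on repeated RTDP trials plus a labeling step that marks converged states. Below is a minimal sketch of a single RTDP-style trial only (not HybPlan's actual algorithm; every helper name and signature here is hypothetical):

#+begin_src python
import random

def rtdp_trial(s0, goals, actions, transition, cost, V, max_steps=1000):
    """One RTDP trial: follow the greedy policy, doing Bellman backups along the way.

    s0         : initial state
    goals      : set of absorbing goal states
    actions    : actions(s) -> iterable of actions applicable in s
    transition : transition(s, a) -> list of (next_state, probability) pairs
    cost       : cost(s, a) -> nonnegative float
    V          : dict state -> cost-to-go estimate (unseen states default to 0,
                 a trivially admissible heuristic for nonnegative costs)
    """
    def q(s, a):
        return cost(s, a) + sum(p * V.get(s2, 0.0) for s2, p in transition(s, a))

    s = s0
    for _ in range(max_steps):
        if s in goals:
            break
        best_a = min(actions(s), key=lambda a: q(s, a))
        V[s] = q(s, best_a)                              # Bellman backup at s
        succs = transition(s, best_a)
        s = random.choices([s2 for s2, _ in succs],
                           weights=[p for _, p in succs])[0]  # sample the next state
    return V
#+end_src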