HybPlan
Last edited: August 8, 2025“Can we come up a policy that, if not fast, at least reach the goal!”
Background
Stochastic Shortest-Path
we are at an initial state, and we have a series of goal states, and we want to reach to the goal states.
We can solve this just by:
- value iteration
- simulate a trajectory and only updating reachable state: RTDP, LRTDP
- MBP
Problem
MDP + Goal States
- \(S\): set of states
- \(A\): actions
- \(P(s’|s,a)\): transition
- \(C\): reward
- \(G\): absorbing goal states
Approach
Combining LRTDP with anytime dynamics
hypothesis testing
Last edited: August 8, 2025hypothesis testing is the mechanism by which a hypothesis is tested statistically.
The core logic of hypothesis testing: have a metric, do tests, calculate probability that the outcome could have happened given the metric is true.
Examples include
- t-test (for sample means)
- z-test (for sample proportions)
- chi-square test (for sample categories)
Common to all hypothesis tests are the following terms.
null hypothesis
A null hypothesis is a “no difference” hypothesis created as a part of hypothesis testing. It is usually stated as an equality.
IBM704
Last edited: August 8, 2025The IBM704 is the first mass-produced floating point computation computer (“pretty much the only computer that could handle complex math”); 12,000 floating point additions per second. IBM built 123 such machines between 1955-1960.
