explainability
Last edited: August 8, 2025
Explainability is the study of understanding why a system breaks when it does.
Here is a set of explainability techniques!
policy visualization
Roll your system out and look at what it does.
Some common strategies that people use to do this:
- plot the policy: look at what the agent says to do at each state (if you have too many dimensions, just plot slices!)
- slicing: one way to deal with history-dependent trajectories is to count how often your system takes each action at each step, and plot the argmax of those counts (the most frequent action per step)
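The counting idea in the last bullet can be sketched as below. This is a minimal illustration, not a fixed recipe: the function name `most_common_action_per_step`, the toy `rollouts` data, and the assumption that each rollout is a non-empty list of hashable action labels are all made up for the example.

```python
from collections import Counter


def most_common_action_per_step(trajectories):
    """For each time step, return the action taken most often across rollouts.

    `trajectories` is a non-empty list of rollouts; each rollout is a list of
    actions, one per step. Rollouts may have different lengths.
    """
    horizon = max(len(t) for t in trajectories)
    summary = []
    for step in range(horizon):
        # Only count rollouts that are still running at this step.
        counts = Counter(t[step] for t in trajectories if len(t) > step)
        # most_common(1) returns a one-element list [(action, count)].
        action, _ = counts.most_common(1)[0]
        summary.append(action)
    return summary


# Hypothetical rollouts of the same system from different starting histories.
rollouts = [
    ["left", "left", "right"],
    ["left", "right", "right"],
    ["right", "right"],
]
print(most_common_action_per_step(rollouts))
```

The resulting list has one entry per step, so it can be handed to any plotting tool. Ties are broken by whichever action was counted first, which is worth keeping in mind when two actions are equally common.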
feature importance
Our goal is still to understand the contribution of various features to the overall behavior of a system.
explicit programming
Last edited: August 8, 2025
- anticipate all states that the agent may find itself in
- hard-code responses to each one
This is bad because you need a big brain to think about and anticipate every possible state (to provide a “complete strategy”), which is often impractical if not impossible.
Disadvantages
You have to know the full, finite set of possible states, and solve each of them “correctly”.
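To make the disadvantage concrete, here is a minimal sketch of explicit programming as a lookup table. The thermostat-style states and actions, and the names `HARD_CODED_POLICY` and `act`, are hypothetical and chosen only for illustration.

```python
# Hypothetical agent whose "complete strategy" is written out by hand.
HARD_CODED_POLICY = {
    "too_cold": "heat",
    "comfortable": "idle",
    "too_hot": "cool",
}


def act(state):
    """Look up the hard-coded response for `state`."""
    if state not in HARD_CODED_POLICY:
        # The weakness of explicit programming: an unanticipated state
        # leaves the agent with nothing to fall back on.
        raise KeyError(f"no hard-coded response for state {state!r}")
    return HARD_CODED_POLICY[state]


print(act("too_cold"))
```

Any state missing from the table (say, a hypothetical `"window_open"`) raises an error, which is exactly the situation the list above asks you to anticipate in advance.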
Exploration and Exploitation
Last edited: August 8, 2025
You are the president, and you are trying to choose the secretary of state. You can only interview people in sequence, and you have to hire on the spot. There is a known number of candidates. We want to maximize the probability of selecting the best candidate. You are given no priors.
How do we know which candidates we explore, and which candidates we exploit?
Sometimes, you don’t have a way of getting data.
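The hiring puzzle above is usually called the secretary problem, and its well-known answer is a cutoff rule: observe roughly the first n/e candidates without hiring (explore), then hire the first candidate better than everyone seen so far (exploit). The note itself does not state this rule, so treat the following as a hedged sketch of that standard result; `secretary_rule`, `success_rate`, and the simulation parameters are names and values assumed for the example.

```python
import math
import random


def secretary_rule(scores):
    """Return the index of the candidate hired by the 1/e cutoff rule.

    `scores` is a non-empty list of distinct candidate qualities in interview
    order. The rule only ever compares a candidate with those seen before.
    """
    n = len(scores)
    cutoff = int(n / math.e)  # explore: interview these without hiring
    best_seen = max(scores[:cutoff], default=float("-inf"))
    for i in range(cutoff, n):
        if scores[i] > best_seen:  # exploit: first candidate beating the sample
            return i
    return n - 1  # nobody beat the sample, so we are stuck with the last one


def success_rate(n=50, trials=20000, seed=0):
    """Estimate how often the rule hires the single best of n candidates."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        scores = list(range(n))
        rng.shuffle(scores)
        hired = secretary_rule(scores)
        wins += scores[hired] == n - 1
    return wins / trials


print(success_rate())
```

For a moderately large number of candidates, the estimated success rate should land near 1/e, about 0.37.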
exponential distribution
Last edited: August 8, 2025
Analogous to the Poisson distribution, but for a continuous random variable. Consider the duration of time that passes until a success; the distribution gives the probability that the success occurs within some range of times:
“What’s the probability that there is an earthquake within \(k\) years if there are on average \(2\) earthquakes per year?”
constituents
- $\lambda$, the “rate”: event rate (mean occurrences per unit time)
requirements
\begin{equation} f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \geq 0 \\ 0, & x < 0 \end{cases} \end{equation}
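As a small worked example of the earthquake question, the density above integrates to the cumulative probability P(X ≤ x) = 1 - e^{-λx} for x ≥ 0. That closed form is not written in the note; it follows from integrating f. The function names below are assumptions made for this sketch.

```python
import math


def exponential_pdf(x, rate):
    """Density f(x) = rate * exp(-rate * x) for x >= 0, and 0 otherwise."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


def exponential_cdf(x, rate):
    """P(X <= x), obtained by integrating the density from 0 to x."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0


def prob_between(a, b, rate):
    """P(a <= X <= b): probability that the first success lands in [a, b]."""
    return exponential_cdf(b, rate) - exponential_cdf(a, rate)


rate = 2.0  # on average 2 earthquakes per year
k = 0.5     # look half a year ahead
print(exponential_cdf(k, rate))       # probability of an earthquake within k years
print(prob_between(0.5, 1.0, rate))   # first earthquake between 6 and 12 months
```

With λ = 2 and k = 0.5 the first probability is 1 - e^{-1}, a bit under two thirds.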
Extended Church-Turing Thesis
Last edited: August 8, 2025
A Turing Machine can simulate every “reasonable” model of computation with only a polynomial increase in time complexity (possibly the “worst possible”).
This is only a thesis! There’s a chance that, for instance, randomized quantum algorithms may change this.
see also: Church-Turing thesis as local steps