
direct estimation

Last edited: August 8, 2025

direct estimation of the probability of failure:

  1. perform a rollout of the system
  2. label the outcome as \(1\) if the trajectory is a failure, and \(0\) otherwise

this is just Direct Sampling.

From there, we can estimate this using standard parameter estimation (i.e. maximum-likelihood estimation or Bayesian estimation).

maximum-likelihood estimation of failure distribution

\begin{equation} \hat{p}_{\text{fail}} = \frac{1}{m} \sum_{i=1}^{m} 1\qty {\tau_{i} \not \in \psi} = \frac{n}{m} \end{equation}

for \(n\) failures and \(m\) rollouts, where \(\tau \sim p\qty(\cdot)\).
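A minimal sketch of the estimator above, assuming a hypothetical `rollout` function that samples one trajectory \(\tau\) and a hypothetical `is_failure` predicate standing in for \(\tau \not\in \psi\). The toy random-walk system and its failure threshold are made up for illustration only.

```python
import random


def estimate_failure_probability(rollout, is_failure, m, seed=0):
    """Maximum-likelihood estimate n / m of the failure probability.

    rollout(rng) samples one trajectory; is_failure(tau) returns True
    when the trajectory violates the specification. Both are
    hypothetical stand-ins supplied by the caller.
    """
    rng = random.Random(seed)
    n = 0  # number of failures observed so far
    for _ in range(m):
        tau = rollout(rng)      # 1. perform a rollout of the system
        if is_failure(tau):     # 2. label the outcome 1 if failure, else 0
            n += 1
    return n / m


def toy_rollout(rng, steps=10):
    """Toy system (an assumption): a 1-D Gaussian random walk from 0."""
    x = 0.0
    trajectory = [x]
    for _ in range(steps):
        x += rng.gauss(0.0, 1.0)
        trajectory.append(x)
    return trajectory


def toy_is_failure(trajectory, limit=5.0):
    """Toy specification (an assumption): fail if the walk leaves [-limit, limit]."""
    return any(abs(x) > limit for x in trajectory)


p_hat = estimate_failure_probability(toy_rollout, toy_is_failure, m=10_000)
print(p_hat)
```

The estimate is only as good as the number of rollouts: when failures are rare, most rollouts are labeled \(0\) and a large \(m\) is needed before \(n\) is meaningfully above zero.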

Direct Sampling

Last edited: August 8, 2025

Direct Sampling is the act of drawing samples directly from a distribution to estimate the quantity you want. It is often used when exact inference is impossible: you sample from the distribution and use the samples to compute the conditional probability of interest.

It basically involves invoking the Frequentist Definition of Probability without letting \(n \to \infty\): instead, we draw some finite number of samples \(n < \infty\) and divide the number of samples in which the event occurs by the total number of samples.

So, for instance, to compute inference on \(b^{1}\) given observations \(d^{1}c^{1}\), we can write:
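One way to write such an estimate, as a sketch: assume we have drawn \(n\) samples from the joint distribution, and let \(b_{i}, d_{i}, c_{i}\) denote the values of the three variables in sample \(i\) (this indexing is an assumption, not notation from the notes). The conditional probability is then approximated by counting, among the samples consistent with the observations, the fraction that also have \(b^{1}\):

\begin{equation} P(b^{1} \mid d^{1}, c^{1}) = \frac{P(b^{1}, d^{1}, c^{1})}{P(d^{1}, c^{1})} \approx \frac{\sum_{i=1}^{n} 1\qty{b_{i} = b^{1}, d_{i} = d^{1}, c_{i} = c^{1}}}{\sum_{i=1}^{n} 1\qty{d_{i} = d^{1}, c_{i} = c^{1}}} \end{equation}

Samples that do not match the observations contribute to neither sum, so if the observations are unlikely, most samples are wasted.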

direct sum

Last edited: August 8, 2025

A direct sum is a sum of subspaces (not just subsets!!) where there’s only one way to represent each element.

constituents

subspaces of \(V\) named \(U_1, \dots, U_{m}\)

requirements

The sum of subspaces \(U_1+\dots+U_{m}\) is called a direct sum if and only if:

each element in \(U_1+\dots +U_{m}\) can be written in only one way as a sum \(u_1 +\dots +u_{m}\) with each \(u_{j} \in U_{j}\) (this plays the role that linear independence plays for vectors)

We use \(\oplus\) to represent direct sum.

additional information

why is it called a direct sum?

Something is not a direct sum if any of its components can be described using the others. It's kind of like linear independence, but on entire spaces.
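As a concrete illustration (a standard textbook-style example, not taken from the notes above), consider three subspaces of \(\mathbb{F}^{3}\):

\begin{equation} U_1 = \qty{(x, y, 0)}, \quad U_2 = \qty{(0, 0, z)}, \quad U_3 = \qty{(0, y, y)} \end{equation}

Their sum is all of \(\mathbb{F}^{3}\), but it is not a direct sum, because the zero vector can be written in more than one way:

\begin{equation} (0,0,0) = (0,0,0) + (0,0,0) + (0,0,0) = (0,1,0) + (0,0,1) + (0,-1,-1) \end{equation}

Here \(U_3\) can be described using pieces of \(U_1\) and \(U_2\). Dropping it fixes the problem: every vector \((x,y,z)\) splits uniquely as \((x,y,0) + (0,0,z)\), so \(\mathbb{F}^{3} = U_1 \oplus U_2\).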

Directed Evolution

Last edited: August 8, 2025

Directed Evolution is a process of recreating Darwinian processes in a lab setting:

  1. mutation: introduce select mutations
  2. selection: select for specific changes
  3. replication: make more of it

Examples: PACE

Directed Exploration

Last edited: August 8, 2025

Softmax Method

Pull arm \(a\) with probability \(\propto \exp (\lambda \rho_{a})\), where \(\lambda \geq 0\) is the “precision parameter”.

When \(\lambda \to 0\), every action is assigned the same probability, so you are essentially sampling uniformly at random; when \(\lambda \to \infty\), the system uses only the greedy action, because only the arm with the largest \(\rho_{a}\) gets selected.

For a multi-state case:

\begin{equation} P(a \mid s) \propto \exp (\lambda Q(s,a)) \end{equation}
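A minimal sketch of softmax action selection, assuming the per-arm estimates \(\rho_{a}\) are given as a plain list. The function names `softmax_probabilities` and `pull_arm` are hypothetical, chosen for illustration. Subtracting the maximum before exponentiating leaves the normalized probabilities unchanged and avoids overflow for large \(\lambda\).

```python
import math
import random


def softmax_probabilities(rho, lam):
    """Probabilities proportional to exp(lam * rho[a]) for each arm a."""
    shift = max(rho)  # shifting by the max does not change the ratios
    weights = [math.exp(lam * (r - shift)) for r in rho]
    total = sum(weights)
    return [w / total for w in weights]


def pull_arm(rho, lam, rng=random):
    """Sample an arm index according to the softmax probabilities."""
    probs = softmax_probabilities(rho, lam)
    return rng.choices(range(len(rho)), weights=probs, k=1)[0]


rho = [0.2, 0.5, 0.3]                       # made-up arm estimates
print(softmax_probabilities(rho, 0.0))      # lam = 0: uniform over arms
print(softmax_probabilities(rho, 1000.0))   # large lam: nearly greedy
print(pull_arm(rho, 5.0, random.Random(0)))
```

For the multi-state case, the same functions can be reused by passing the values \(Q(s, a)\) for the current state \(s\) in place of `rho`.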