Posts

Planning for Learning

Last edited: August 8, 2025

point selection

Last edited: August 8, 2025
  1. we start at an initial belief point
  2. we do a random Rollout to get to the next belief
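The two steps above can be sketched in C. This is only an illustration under stated assumptions: the tiny tabular model `Pomdp`, its sizes, and the function names are hypothetical, and "random rollout" is read here as "pick a uniformly random action, sample an observation, and apply the standard Bayesian belief update".

```c
#include <assert.h>
#include <stdlib.h>

enum { N_STATES = 2, N_ACTIONS = 2, N_OBS = 2 }; /* hypothetical sizes */

/* Hypothetical tabular POMDP model, for illustration only. */
typedef struct {
    double T[N_ACTIONS][N_STATES][N_STATES]; /* T[a][s][sp] = P(sp | s, a) */
    double O[N_ACTIONS][N_STATES][N_OBS];    /* O[a][sp][o] = P(o | a, sp) */
} Pomdp;

/* Bayesian belief update: next(sp) is proportional to
 * O(o | a, sp) * sum_s T(sp | s, a) * b(s).
 * Writes the normalized belief into next (which must not alias b) and
 * returns P(o | b, a). If that probability is 0, next is left all zeros. */
double belief_update(const Pomdp *m, const double *b, int a, int o, double *next) {
    assert(a >= 0 && a < N_ACTIONS && o >= 0 && o < N_OBS);
    double norm = 0.0;
    for (int sp = 0; sp < N_STATES; sp++) {
        double predicted = 0.0;
        for (int s = 0; s < N_STATES; s++) {
            predicted += m->T[a][s][sp] * b[s];
        }
        next[sp] = m->O[a][sp][o] * predicted;
        norm += next[sp];
    }
    if (norm > 0.0) {
        for (int sp = 0; sp < N_STATES; sp++) next[sp] /= norm;
    }
    return norm;
}

/* One random rollout step (an assumed reading of the note): choose a random
 * action, sample an observation from P(o | b, a), and write the resulting
 * belief into next. Assumes b and every row of T and O sum to 1. */
void random_rollout_step(const Pomdp *m, const double *b, double *next) {
    int a = rand() % N_ACTIONS;
    double u = (double)rand() / ((double)RAND_MAX + 1.0); /* u in [0, 1) */
    double cumulative = 0.0;
    double candidate[N_STATES];
    for (int o = 0; o < N_OBS; o++) {
        double p = belief_update(m, b, a, o, candidate);
        if (p <= 0.0) continue; /* this observation cannot occur */
        for (int sp = 0; sp < N_STATES; sp++) next[sp] = candidate[sp];
        cumulative += p;
        if (u < cumulative) break; /* observation o was sampled */
    }
}
```

Calling `random_rollout_step` repeatedly, feeding each `next` back in as the new `b`, yields a chain of belief points starting from the initial belief.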

then collect

Point-Based Value Iteration

Last edited: August 8, 2025

we keep track of a bunch of alpha vectors and belief samples (which we get from point selection):

\begin{equation} \Gamma = \{\alpha_{1}, \dots, \alpha_{m}\} \end{equation}

and

\begin{equation} B = \{b_1, \dots, b_{m}\} \end{equation}

To keep these alpha vectors a valid lower bound on the value function, one should seed them with something like the blind lower bound.

We can estimate our utility function at any belief by searching the set for the alpha vector with the largest dot product against that belief:

\begin{equation} U^{\Gamma}(b) = \max_{\alpha \in \Gamma} \alpha^{\top}b \end{equation}
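As a minimal sketch of this maximization in C: the flat row-major layout of the alpha vectors and the names `dot` and `utility` are assumptions made for illustration, not part of the original notes.

```c
#include <assert.h>
#include <stddef.h>

/* Dot product of one alpha vector with a belief over n_states states. */
static double dot(const double *alpha, const double *belief, size_t n_states) {
    double sum = 0.0;
    for (size_t s = 0; s < n_states; s++) {
        sum += alpha[s] * belief[s];
    }
    return sum;
}

/* U^Gamma(b): the largest alpha^T b over all alpha vectors in the set.
 * alphas is a flat, row-major array of n_vectors * n_states entries
 * (a layout assumed here for illustration). */
double utility(const double *alphas, size_t n_vectors,
               const double *belief, size_t n_states) {
    assert(n_vectors > 0);
    double best = dot(alphas, belief, n_states);
    for (size_t i = 1; i < n_vectors; i++) {
        double value = dot(alphas + i * n_states, belief, n_states);
        if (value > best) {
            best = value;
        }
    }
    return best;
}
```

With alpha vectors (1, 0), (0, 2), and (0.5, 0.5) and belief (0.5, 0.5), the three dot products are 0.5, 1.0, and 0.5, so `utility` returns 1.0.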

pointer

Last edited: August 8, 2025

A pointer is a variable which stores a memory address. Because C has no pass-by-reference, we use pointers to emulate it: by sharing addresses with other functions.

A pointer can identify a single byte or a large data structure. We can dynamically allocate memory and refer to it through pointers, and we can also refer to memory generically, without a type.

C is always pass-by-copy. Therefore, to pass by reference, you basically have to take the address of the object, hand that address around, and dereference it to reach the object:

int x = 2; // declare object
int *xptr = &x; // get location of object (&: address of)

printf("%d\n", *xptr); // dereference the pointer
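To see why sharing an address emulates pass-by-reference, here is a small sketch contrasting a function that receives a copy with one that receives a pointer. The function names are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Receives a copy of the caller's value: the change is lost on return. */
void increment_copy(int x) {
    x += 1;
}

/* Receives the address of the caller's variable and writes through it,
 * so the caller sees the change. */
void increment_in_place(int *xptr) {
    assert(xptr != NULL);
    *xptr += 1;
}
```

Starting from `int x = 2;`, calling `increment_copy(x)` leaves `x` at 2, while calling `increment_in_place(&x)` changes `x` to 3.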

address operator

You will note, in the line above:

poisson distribution

Last edited: August 8, 2025

Let’s say we want to know the chance of an event occurring \(k\) times in a unit of time, given that, on average, the event happens at a rate of \(\lambda\) per unit time.

“What’s the probability that there are \(k\) earthquakes in 1 year if there are on average \(2\) earthquakes per year?”

where:

  1. events have to be independent
  2. probability of success in each trial doesn’t vary

constituents

  • \(\lambda\): count of events per unit time
  • \(X \sim Poi(\lambda)\)

requirements

the probability mass function: