Posts

Polio

Last edited: August 8, 2025

polynomial

Last edited: August 8, 2025

A polynomial is a function given by a finite sum of coefficient-weighted, non-negative integer powers of its input

constituents

  • a function \(p: \mathbb{F} \to \mathbb{F}\)
  • coefficients \(a_0, \dots, a_{m} \in \mathbb{F}\)

requirements

A polynomial is defined by:

\begin{equation} p(z)=a_0+a_1z+a_2z^{2}+\dots +a_{m}z^{m} \end{equation}

for all \(z \in \mathbb{F}\)
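
As a quick sketch of evaluating such a \(p\) at a point via Horner's rule (the coefficients below are made up for illustration):

```python
def poly_eval(coeffs, z):
    """Evaluate p(z) = a_0 + a_1 z + ... + a_m z^m via Horner's rule."""
    acc = 0.0
    for a in reversed(coeffs):  # start from a_m and work down
        acc = acc * z + a
    return acc

print(poly_eval([1.0, -2.0, 3.0], 2.0))  # 1 - 2*2 + 3*2^2 = 9.0
```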

additional information

degree of a polynomial \(\deg p\)

A polynomial’s degree is the highest exponent whose coefficient is non-zero. That is, for a polynomial:

\begin{equation} p(z) = a_0+a_1z+\dots +a_{m}z^{m} \end{equation}

with \(a_{m} \neq 0\), its degree is \(m\). We write \(\deg p = m\).
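
For example, \(p(z) = 1 - 2z + 3z^{2}\) has highest non-zero coefficient \(a_{2} = 3\), so \(\deg p = 2\).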

polynomial hierarchy

Last edited: August 8, 2025

\begin{equation} \text{PH} = \bigcup_{c \in \mathbb{N}} \Sigma_{c} = \bigcup_{c \in \mathbb{N}} \Pi_{c} \end{equation}

“a language is in the polynomial hierarchy if it can be described with a constant number of quantifiers”
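
Concretely, in the standard quantifier characterization, a language \(L\) is in \(\Sigma_{2}\) if there is a polynomial-time verifier \(M\) and a polynomial \(q\) such that:

\begin{equation} x \in L \iff \exists u \in \{0,1\}^{q(|x|)}\, \forall v \in \{0,1\}^{q(|x|)}\ M(x,u,v)=1 \end{equation}

\(\Pi_{2}\) swaps the quantifiers, and PH is the union over every constant number of such alternations.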

results and conjectures

polynomial hierarchy conjecture

“the polynomial hierarchy is infinite”; that is, each inclusion between successive levels is strict.

\(P \neq NP\), \(NP \neq \Pi_{2}\)

PSPACE bounds the entire hierarchy

\(\text{PH} \subseteq \text{PSPACE}\)

because when evaluating each \(\exists\) (or \(\forall\)) you only have to keep one candidate witness around at a time, so the whole alternation fits in polynomial space (see the sketch below).
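
A minimal sketch of that space argument (all names here are illustrative): each quantifier level enumerates its candidate witnesses one at a time, so memory holds one partial witness per level rather than one per candidate.

```python
from itertools import product

def eval_quantified(x, predicate, levels, width):
    """Evaluate  exists u1 forall u2 exists u3 ...  predicate(x, u1..uk).

    Each level enumerates all 2^width candidate witnesses but only ever
    stores the current one, so space is O(levels * width).
    """
    def level(k, witnesses):
        if k == levels:
            return predicate(x, witnesses)
        combine = any if k % 2 == 0 else all  # alternate exists / forall
        return combine(level(k + 1, witnesses + [u])
                       for u in product((0, 1), repeat=width))
    return level(0, [])

# toy usage: exists u forall v: (u[0] or v[0]) -- true, witnessed by u = (1,)
print(eval_quantified(None, lambda x, ws: ws[0][0] | ws[1][0], 2, 1))  # True
```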

polynomial interpolation

Last edited: August 8, 2025

constituents

\(m\) data points \((x_i, y_{i})\)

requirements

we desire \(c_{j}\) such that:

\begin{equation} y = c_1 + c_{2} x + c_3 x^{2} + \dots \end{equation}

Given our set of basis functions \(\phi_{j}(x)\) for input \(x\), our goal is:

\begin{equation} y = c_1 \phi_{1} + c_2 \phi_{2} + \dots + c_{n}\phi_{n} \end{equation}

the \(\phi_{j}\) are the model functions; choosing them determines the model family (much as an architecture determines a neural network).

additional information

Monomial basis and Vandermonde matrix

to do this, we stack the interpolation conditions into matrix form over the monomial basis; the resulting coefficient matrix is called the Vandermonde matrix:
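
With the monomial basis \(\phi_{j}(x) = x^{j-1}\), the conditions take the standard Vandermonde form:

\begin{equation} \begin{pmatrix} 1 & x_1 & x_1^{2} & \cdots & x_1^{n-1} \\ 1 & x_2 & x_2^{2} & \cdots & x_2^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_m & x_m^{2} & \cdots & x_m^{n-1} \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix} \end{equation}

and solving this linear system recovers the coefficients \(c_j\). A minimal sketch for the square case \(m = n\) (the sample points below are made up for illustration):

```python
import numpy as np

# made-up data points; any distinct x_i work
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 5.0, 10.0])  # sampled from y = 1 + x^2

# Vandermonde matrix with columns 1, x, x^2, ... (increasing powers)
V = np.vander(x, increasing=True)

# solve V c = y for the monomial coefficients c_j
c = np.linalg.solve(V, y)
print(c)  # ~[1, 0, 1, 0]: coefficients of 1 + x^2
```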

POMCP

Last edited: August 8, 2025

Previous Monte-Carlo tree search methods were not competitive with offline solvers such as PBVI and SARSOP, but those solvers suffer from the curse of history.

key point: Monte-Carlo rollouts + best-first tree search + an unweighted particle filter (instead of categorical beliefs)

Background

  • History: a trajectory of actions and observations \(h = \{a_1, o_1, \dots\}\)
  • generative model: from the current state, we sample a possible next state (conditioned on the action taken, i.e. an instantiation of \(s' \sim T(\cdot \mid s,a)\)), an observation, and the reward \(R(s,a)\); see the sketch after this list
  • Rollout: keep sampling actions at each step, rolling forward and accumulating the discounted future reward
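
A toy sketch of such a generative model (the states, dynamics, and rewards below are all made up for illustration):

```python
import random

# made-up two-state problem
T = {("s0", "go"): ["s0", "s1"], ("s1", "go"): ["s1"]}  # possible next states
O = {"s0": "dim", "s1": "bright"}                        # observation per state
R = {("s0", "go"): 0.0, ("s1", "go"): 1.0}               # reward R(s, a)

def step(s, a):
    """Sample (s', o, r): an instantiation of s' ~ T(.|s, a), plus R(s, a)."""
    s2 = random.choice(T[(s, a)])
    return s2, O[s2], R[(s, a)]
```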

monte-carlo tree search

  1. loop:
    1. sample a state \(s\) from the belief distribution \(B(h)\) at the root and call that the node state
    2. loop until we reach a leaf:
      1. select an action to explore using UCB1
      2. get observation, reward, next state
    3. add the leaf node, plus a child node for each available action
    4. Rollout
    5. backpropagate the obtained value backwards with discounting via the POMDP Bellman backup (see the sketch after this list)
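
A minimal sketch of this loop, assuming a generative step(s, a) -> (s', o, r) like the toy one above (the constants and structure are illustrative, not the exact published pseudocode):

```python
import math, random

class Node:
    """One history node: visit count, value estimate, children, particles."""
    def __init__(self):
        self.N = 0            # visit count
        self.V = 0.0          # running-average value estimate
        self.children = {}    # action -> action Node, whose children map obs -> Node
        self.particles = []   # unweighted particles approximating B(h)

def rollout(step, s, actions, gamma, depth):
    """Random-policy rollout: sample forward, accumulate discounted reward."""
    if gamma ** depth < 1e-3:
        return 0.0
    s2, _, r = step(s, random.choice(actions))
    return r + gamma * rollout(step, s2, actions, gamma, depth + 1)

def simulate(step, node, s, actions, gamma, depth, c=1.0):
    """One best-first simulation from sampled state s; returns discounted return."""
    if gamma ** depth < 1e-3:
        return 0.0
    if not node.children:                        # leaf: expand, then roll out
        node.children = {a: Node() for a in actions}
        return rollout(step, s, actions, gamma, depth)
    node.particles.append(s)                     # grow the particle belief B(h)
    def ucb1(a):                                 # try unvisited actions first
        ch = node.children[a]
        return float("inf") if ch.N == 0 else \
            ch.V + c * math.sqrt(math.log(node.N + 1) / ch.N)
    a = max(actions, key=ucb1)
    s2, o, r = step(s, a)
    child = node.children[a].children.setdefault(o, Node())
    q = r + gamma * simulate(step, child, s2, actions, gamma, depth + 1, c)
    # backpropagate: running-average (Monte-Carlo) value update with discounting
    node.N += 1
    node.children[a].N += 1
    node.children[a].V += (q - node.children[a].V) / node.children[a].N
    return q

def plan(root, step, actions, gamma=0.95, n_sims=1000):
    """Run simulations with root states drawn from B(h), return best action."""
    for _ in range(n_sims):
        s = random.choice(root.particles)        # s ~ B(h)
        simulate(step, root, s, actions, gamma, depth=0)
    return max(actions,
               key=lambda a: root.children[a].V if root.children else 0.0)
```

After acting and observing, the child node matching that action and observation becomes the new root, which is the pruning step described below.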

During runtime, we choose the action with the best estimated value, prune the tree given the action taken and the observation received, and repeat the search from the new root.