probability density function
A PDF is a function that maps values of a continuous random variable to the corresponding probability density.
\begin{equation} P(a < X < b) = \int_{x=a}^{b} f(X=x)\dd{x} \end{equation}
note: \(f\) is no longer in units of probability!!! it is in units of probability scaled by units of \(X\). That is, it is a DERIVATIVE of probability, so the units of \(f\) are \(\frac{\text{prob}}{\text{unit } X}\). Hence, it can be greater than \(1\).
We have two important properties:
- \(f(x) \geq 0\) for all \(x\)
- \(\int_{-\infty}^{\infty} f(x) \dd{x} = 1\)
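A minimal numerical sketch of these facts, assuming scipy and numpy are available; the narrow normal (\(\sigma = 0.1\)) is an arbitrary illustrative choice:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

dist = norm(loc=0.0, scale=0.1)

# the density is NOT a probability: here f(0) is about 3.99 > 1
print(dist.pdf(0.0))

# P(a < X < b) = integral of f from a to b (the equation above)
p_ab, _ = quad(dist.pdf, -0.2, 0.2)
print(p_ab, dist.cdf(0.2) - dist.cdf(-0.2))  # the two agree

# the two properties: f(x) >= 0 everywhere, and f integrates to 1
total, _ = quad(dist.pdf, -np.inf, np.inf)
print(total)  # ~1.0
```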
probability distribution
A probability distribution “assigns probability to outcomes”.
\(X\) follows distribution \(D\). \(X\) is a “\(D\) random variable”, where \(D\) is some distribution (normal, binomial, etc.)
syntax: \(X \sim D\).
Each distribution has three properties:
- variables (what is being modeled)
- values (what values can they take on)
- parameters (how many degrees of freedom do we have)
Types of Distribution
discrete distribution
- described by PMF
continuous distribution
- described by PDF
parametrized distribution
We often represent a probability distribution using a set of parameters \(\theta_{j}\). For instance, a normal distribution is given by \(\mu\) and \(\sigma\), and a PMF is given by the probability mass for each outcome.
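As a quick sketch (assuming scipy), fixing the parameters \(\theta\) pins down the distribution completely; the particular numbers here are just illustrative:

```python
from scipy.stats import norm, rv_discrete

# normal distribution: parametrized by theta = (mu, sigma)
theta = {"mu": 2.0, "sigma": 0.5}
gaussian = norm(loc=theta["mu"], scale=theta["sigma"])

# categorical PMF: the parameters are the masses themselves
masses = rv_discrete(values=([0, 1, 2], [0.2, 0.5, 0.3]))

print(gaussian.pdf(2.0), masses.pmf(1))
```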
probability mass function
A PMF is a function that maps possible outcomes of a discrete random variable to the corresponding actual probabilities.
For random variable \(Y\), we have:
\begin{equation} f(k) = P(Y=k) \end{equation}
and \(f\) is the PMF: it maps each value the random variable can take on to the probability that the random variable takes on that value.
Shorthand
\begin{equation} P(Y=k) = p(y), \text{ where } y=k \end{equation}
The lowercase \(y\) represents a realization of \(Y\), i.e., a case where \(Y=y\).
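A minimal sketch (assuming scipy; the binomial is just an example of a discrete distribution with a known PMF):

```python
from scipy.stats import binom

Y = binom(n=10, p=0.3)  # Y ~ Binomial(10, 0.3)

f = Y.pmf               # f(k) = P(Y = k)
print(f(3))             # probability that Y takes on the value 3

# a PMF returns actual probabilities, so they sum to 1 over all outcomes
print(sum(f(k) for k in range(11)))  # ~1.0
```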
Probability of Failure
\begin{align} p_{\text{fail}} &= \mathbb{E}_{\tau \sim p\qty(\cdot)} \qty[\mathbb{1} \qty{\tau \not\in \psi}] \\ &= \int \mathbb{1} \qty{\tau \not\in \psi} p\qty(\tau) \dd{\tau} \end{align}
that is, the Probability of Failure is just the normalizing constant of the Failure Distribution. Like the Failure Distribution itself, computing this quantity exactly is quite intractable. We have a few methods to estimate it (the first two are sketched in code after the list below), namely:
- direct estimation: directly approximate your failure probability from nominal distribution \(p\) — \(\tau_{i} \sim p\qty(\cdot)\), \(\hat{p}_{\text{fail}} = \frac{1}{m} \sum_{i=1}^{m} \mathbb{1}\qty{\tau_{i} \not\in \psi}\)
- Importance Sampling: design a distribution to probe failure, namely proposal distribution \(q\), and then reweight by how different it is from \(p\) — \(\tau_{i} \sim q\qty(\cdot)\), \(\hat{p}_{\text{fail}} = \frac{1}{m}\sum_{i=1}^{m} w_{i} \mathbb{1} \qty{\tau_{i}\not\in \psi}\), where \(w_{i} = \frac{p\qty(\tau_{i})}{q\qty(\tau_{i})}\) (the “importance weight”)
- adaptive importance sampling
- multiple importance sampling
- sequential monte-carlo
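A toy sketch comparing direct estimation and importance sampling (assuming numpy/scipy; the nominal \(p = \mathcal{N}(0,1)\), the failure event \(\tau \not\in \psi\) taken as \(\tau > 3\), and the proposal \(q = \mathcal{N}(3,1)\) are all illustrative assumptions, not anything fixed by the notes):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
m = 10_000
p, q = norm(0, 1), norm(3, 1)
is_failure = lambda tau: tau > 3.0          # stand-in for "tau not in psi"

# direct estimation: sample tau_i ~ p, average the failure indicator
tau_p = p.rvs(size=m, random_state=rng)
p_fail_direct = np.mean(is_failure(tau_p))

# importance sampling: sample tau_i ~ q, reweight by w_i = p(tau_i)/q(tau_i)
tau_q = q.rvs(size=m, random_state=rng)
w = p.pdf(tau_q) / q.pdf(tau_q)
p_fail_is = np.mean(w * is_failure(tau_q))

print(p_fail_direct, p_fail_is, 1 - p.cdf(3.0))  # truth ~1.35e-3
```

Because failures are rare under \(p\), the direct estimate is noisy at this sample size, while the reweighted proposal samples land in the failure region far more often.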
How do you pick a proposal distribution? See proposal distribution.
probabilistic models
multinomial distribution
A probability distribution for modeling counts of specific outcomes, like the binomial distribution but with multiple outcome categories.
As with the binomial distribution, we have to assume independent trials and the same outcome probabilities per trial.
“what’s the probability that you get some set of assignments \(X_{j}=c_{j}\)”:
\begin{equation} P(X_1=c_1, X_2=c_2, \dots, X_{m}=c_{m}) = {n \choose c_1, c_2, \dots, c_{m} } p_{1}^{c_1} \cdot \dots \cdot p_{m}^{c_{m}} \end{equation}
where the big choose is a multinomial coefficient, \(n = \sum_{j} c_{j}\) is the total number of trials, \(m\) is the number of different outcomes, and \(p_{j}\) is the probability of the \(j\)th outcome.
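A minimal sketch of this formula (assuming scipy; the counts and probabilities are made-up numbers), once via scipy and once by hand with the multinomial coefficient:

```python
from math import factorial, prod
from scipy.stats import multinomial

c = [3, 5, 2]         # counts c_j per outcome, summing to n = 10 trials
p = [0.2, 0.5, 0.3]   # one probability p_j per outcome

# scipy evaluates the multinomial PMF directly
print(multinomial.pmf(c, n=sum(c), p=p))

# same thing by hand: multinomial coefficient times product of p_j^{c_j}
coeff = factorial(sum(c)) // prod(factorial(k) for k in c)
print(coeff * prod(pj**cj for pj, cj in zip(p, c)))
```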
