SU-CS361 MAY072024

Generalization Error

\begin{equation} \epsilon_{gen} = \mathbb{E}_{x \sim \mathcal{X}} \qty[\qty(f(x) - \hat{f}(x))^{2}] \end{equation}

we usually cannot compute this expectation exactly; instead, we estimate it by averaging the squared error over the specific points we measured.
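
As a minimal sketch of this estimate (the objective \(f\), surrogate \(\hat{f}\), and sample points below are hypothetical, purely for illustration):

#+begin_src python
import numpy as np

def empirical_gen_error(f, f_hat, xs):
    """Estimate the generalization error as the mean of the
    squared errors (f(x) - f_hat(x))^2 over measured points xs."""
    return np.mean([(f(x) - f_hat(x)) ** 2 for x in xs])

# hypothetical example: sine objective vs. a linear surrogate
f = lambda x: np.sin(x)
f_hat = lambda x: x  # first-order Taylor surrogate about x = 0
xs = np.linspace(-1.0, 1.0, 100)
print(empirical_gen_error(f, f_hat, xs))
#+end_src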

Probabilistic Surrogate Models

Gaussian Process

A Gaussian Process is a Gaussian distribution over functions!

Consider a mean function \(m(x)\) and a covariance (kernel) function \(k(x, x')\), with which we are trying to infer a set of objective values \(y_{j} \in \mathbb{R}\) at design points \(x_1, \dots, x_{m}\):

\begin{equation} \mqty[y_1 \\ \vdots \\ y_{m}] \sim \mathcal{N} \qty(\mqty[m(x_1) \\ \vdots \\ m(x_{m})], \mqty[k(x_1, x_1) & \dots & k(x_1, x_{m}) \\ \vdots & \ddots & \vdots \\ k(x_{m}, x_{1}) & \dots & k(x_{m}, x_{m})]) \end{equation}
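
To make this concrete, here is a sketch that draws samples of \(y_1, \dots, y_{m}\) from the prior above; the zero mean and squared-exponential kernel are assumptions for illustration, not part of the definition:

#+begin_src python
import numpy as np

def squared_exp_kernel(x, xp, ell=1.0):
    """k(x, x') = exp(-(x - x')^2 / (2 ell^2))."""
    return np.exp(-((x - xp) ** 2) / (2 * ell ** 2))

def gp_prior_sample(xs, m, k, n_samples=3):
    """Draw samples of (y_1, ..., y_m) from the multivariate
    normal with mean [m(x_i)] and covariance [k(x_i, x_j)]."""
    mean = np.array([m(x) for x in xs])
    cov = np.array([[k(xi, xj) for xj in xs] for xi in xs])
    cov += 1e-8 * np.eye(len(xs))  # jitter for numerical stability
    return np.random.multivariate_normal(mean, cov, size=n_samples)

xs = np.linspace(0.0, 5.0, 50)
samples = gp_prior_sample(xs, m=lambda x: 0.0, k=squared_exp_kernel)
#+end_src

Each row of samples is one draw from the distribution over functions, evaluated at the points xs.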

SU-CS361 MAY092024

optimization uncertainty

  • irreducible uncertainty: uncertainty inherent to a system
  • epistemic uncertainty: subjective lack of knowledge about a system from our standpoint

uncertainty can be represented as a vector of random variables, \(z\), over which the designer has no control. Feasibility of a design point then depends on \((x, z) \in \mathcal{F}\), where \(\mathcal{F}\) is the feasible set of design points.

set-based uncertainty

set-based uncertainty treats uncertainty \(z\) as belonging to some set \(\bold{Z}\), which means that we typically solve the minimax problem

\begin{equation} \min_{x} \max_{z \in \bold{Z}} f(x, z) \end{equation}

optimizing \(x\) against the worst-case realization of the uncertainty \(z\).
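
A minimal sketch of this minimax solve, assuming a discrete uncertainty set and a grid of candidate designs (all names here are hypothetical):

#+begin_src python
import numpy as np

def worst_case(f, x, Z):
    """max_{z in Z} f(x, z): the worst-case objective at design x."""
    return max(f(x, z) for z in Z)

def minimax(f, X, Z):
    """Pick the design x in X minimizing the worst-case objective."""
    return min(X, key=lambda x: worst_case(f, x, Z))

# hypothetical example: z shifts the optimum of a quadratic
f = lambda x, z: (x - z) ** 2
X = np.linspace(-2.0, 2.0, 401)  # candidate designs
Z = [-0.5, 0.0, 0.5]             # uncertainty set
x_star = minimax(f, X, Z)        # robust choice: x = 0
#+end_src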

SU-CS361 Midterm 1 Review

  1. error complexity, in big \(O\), of the finite-difference approximations: midpoint (central), forward, and backward differences (and what each of those is)
  2. Fibonacci Search and Golden Section Search (see the sketch after this list)
  3. Bisection Method
  4. Shubert-Piyavskii Method
  5. Trust Region Methods equations
  6. Secant Method
  7. the Nelder-Mead Simplex Method chart
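
For item 2, a minimal Golden Section Search sketch for a unimodal \(f\) on \([a, b]\) (the standard formulation, written from scratch rather than taken from the course notes):

#+begin_src python
import math

def golden_section_search(f, a, b, n=20):
    """Shrink the bracket [a, b] around the minimizer of a
    unimodal f using the golden ratio; n function evaluations."""
    rho = (math.sqrt(5) - 1) / 2  # 1/phi, about 0.618
    d = rho * b + (1 - rho) * a
    yd = f(d)
    for _ in range(n - 1):
        c = rho * a + (1 - rho) * b
        yc = f(c)
        if yc < yd:
            b, d, yd = d, c, yc
        else:
            a, b = b, c  # interval flips; bracket still holds the min
    return (a, b) if a < b else (b, a)

print(golden_section_search(lambda x: (x - 2) ** 2, 0.0, 5.0))
#+end_src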

SU-CS361 Project Proposal

Introduction

Reinforcement Learning with Human Feedback (RLHF) has demonstrated a superb ability to align the performance of a language model (LM) with human preference judgements of LM output trajectories (Ziegler et al. 2020). However, the original RLHF formulation showed little direct improvement in the model’s toxicity without further prompting, yet conferred some advantage when the model was prompted specifically to be respectful (Ouyang et al. 2022).

To specifically target the reduction of harmfulness and toxicity in LM output, varying approaches have been explored: contrastive learning (Lu et al., n.d.), or, through a combination of instruction-following RLHF and in-context learning (i.e. prompting), sampling and self-correcting LM output trajectories (Ganguli et al. 2023).