SU-CS361 MAY022024

Sampling Plans

Many methods require a set of samples of the objective value, e.g. to fit a local model or to seed population methods, so…

Full Factorial

Grid it up.

easy to implement
good results
bad: sample count grows exponentially with dimension
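A minimal sketch of a full factorial plan, assuming box bounds per dimension and at least two samples along each axis. The function name `full_factorial` and its parameters are hypothetical, chosen for illustration.

```python
import itertools

def full_factorial(lower, upper, counts):
    """Return every point of an evenly spaced grid.

    lower, upper: per-dimension bounds (assumed box-shaped design space)
    counts: number of samples along each dimension (each assumed >= 2)
    """
    axes = []
    for a, b, m in zip(lower, upper, counts):
        step = (b - a) / (m - 1)
        axes.append([a + k * step for k in range(m)])
    # The Cartesian product of the per-dimension grids is the full grid.
    return list(itertools.product(*axes))

points = full_factorial([0.0, 0.0], [1.0, 1.0], [3, 3])
print(len(points))  # 3 samples per axis in 2 dimensions
```

With `m` samples per axis in `n` dimensions the plan has `m ** n` points, which is the exponential growth noted above.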

Random Sampling

Use a pseudorandom generator to pick points in your space.

allows for any number of evaluations you specify
statistically, the points clump when you do this!
also need lots of samples to get good coverage
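A small sketch of random sampling over a box, assuming independent uniform draws per dimension. The name `random_sample` and the `seed` parameter are hypothetical and only there to make runs repeatable.

```python
import random

def random_sample(lower, upper, n, seed=None):
    """Draw n points uniformly at random from the box [lower, upper]."""
    rng = random.Random(seed)  # local generator, so the seed does not leak globally
    return [
        tuple(rng.uniform(a, b) for a, b in zip(lower, upper))
        for _ in range(n)
    ]

pts = random_sample([0.0, -1.0], [1.0, 1.0], 5, seed=0)
```

Any `n` works here, but nothing stops two draws from landing next to each other.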

Uniform Projection

We want the points, when projected onto any single dimension, to be spread uniformly. To implement this, we grid up each dimension and shuffle the ordering of each dimension individually. Then, we read off the coordinates to create the points:

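import random

# in 3 dimensions...
axis_min, axis_max = 0, 10  # example grid bounds (assumed); use your own

seq = list(range(axis_min, axis_max))

# random.shuffle works in place and returns None,
# so draw an independently shuffled copy for each dimension instead
d1 = random.sample(seq, len(seq))
d2 = random.sample(seq, len(seq))
d3 = random.sample(seq, len(seq))

# each grid value appears exactly once along every dimension
sampling_points = list(zip(d1, d2, d3))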

Stratified Sampling

perform Uniform Projection
within each grid cell, make a smaller grid and perform Uniform Projection again inside it

Space-Filling Metrics

Pairwise Distances

This requires each set to have the same number of points

compute the Euclidean distance between every pair of points
for each set, find the two points that are closest together, and call their distance the “pairwise distance” of the set

Limitation: if just two points are close together, this metric scores the whole set poorly, no matter how well the remaining points are spread. So maybe Morris-Mitchell.
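The metric above can be sketched as follows, assuming points are tuples of floats. The helper names `pairwise_distances` and `closest_pair_distance` are hypothetical.

```python
import itertools
import math

def pairwise_distances(points):
    """Sorted Euclidean distances between every pair of points."""
    return sorted(math.dist(a, b) for a, b in itertools.combinations(points, 2))

def closest_pair_distance(points):
    """The smallest pairwise distance; larger means better spread."""
    return pairwise_distances(points)[0]

# Two sets with the same number of points, as the comparison requires.
spread = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
clumped = [(0.0, 0.0), (0.1, 0.0), (0.0, 1.0), (1.0, 1.0)]

print(closest_pair_distance(spread) > closest_pair_distance(clumped))
```

Here `clumped` differs from `spread` in only one point, yet its score collapses because a single close pair decides the result.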

Morris-Mitchell

We have a hyperparameter \(q > 0\), and we check a range of values for it. Consider \(d_{i}\) to be the \(i\)-th pairwise distance between the points, measured with the \(p\)-norm of your choice. Then, define:

\begin{equation} \Phi_{q}(X) = \qty(\sum_{i}^{}d_{i}^{-q})^{\frac{1}{q}} \end{equation}

and we try to solve for our set of points \(X\) such that:

\begin{equation} \min_{X} \max_{q \in \{1,2,5,10,20,50,100\}} \Phi_{q}(X) \end{equation}

“minimize \(\Phi_{q}\) at the worst possible \(q\)”
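A minimal sketch of the criterion, assuming Euclidean pairwise distances (that is, \(p = 2\)) and distinct points. The function names are hypothetical. Note that for very small distances and large \(q\), `d ** (-q)` can overflow a float, so this is only suitable for modestly scaled inputs.

```python
import itertools
import math

Q_VALUES = (1, 2, 5, 10, 20, 50, 100)

def morris_mitchell(points, q):
    """Phi_q(X) = (sum_i d_i^(-q))^(1/q) over all pairwise distances d_i."""
    dists = (math.dist(a, b) for a, b in itertools.combinations(points, 2))
    return sum(d ** (-q) for d in dists) ** (1.0 / q)

def worst_case_phi(points, qs=Q_VALUES):
    """The inner max over q; a sampling plan is better when this is smaller."""
    return max(morris_mitchell(points, q) for q in qs)

spread = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
clumped = [(0.0, 0.0), (0.1, 0.0), (0.0, 1.0), (1.0, 1.0)]

print(worst_case_phi(spread) < worst_case_phi(clumped))
```

Because every pairwise distance contributes to the sum, one close pair raises the score without completely determining it.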

Space-Filling Subset

A Space-Filling Subset is a subset \(S\) of a point set \(X\) which minimizes the maximum distance between a point in \(X\) and its closest point in \(S\) (i.e. making \(S\) a good representative of \(X\)).

\begin{equation} d_{\max}(X,S) = \max_{x \in X} \min_{s \in S} \|s - x\|_{p} \end{equation}

We can choose any \(p\)-norm we’d like.
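The metric above can be sketched directly, assuming points are tuples of floats and \(p \geq 1\) is finite. The names `minkowski` and `d_max` are hypothetical.

```python
def minkowski(a, b, p=2):
    """The p-norm of a - b."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def d_max(X, S, p=2):
    """Largest distance from any point of X to its nearest point in S."""
    return max(min(minkowski(s, x, p) for s in S) for x in X)

X = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]

print(d_max(X, [(0.0, 0.0)]))              # one point at the end of the line
print(d_max(X, [(1.0, 0.0), (2.0, 0.0)]))  # two points in the middle
```

A smaller value means every point of `X` has a representative in `S` nearby.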

greedy local search

Start from one point of \(X\), then choose the single best point to add to \(S\), i.e. the one which minimizes \(d_{\max}\), and then choose another point, and another one, …
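A minimal sketch of the greedy procedure, assuming \(X\) holds distinct points, the subset size `m` is at most `len(X)`, and the first point is picked at random. The names `greedy_subset`, `m`, and `seed` are hypothetical.

```python
import random

def minkowski(a, b, p=2):
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def d_max(X, S, p=2):
    return max(min(minkowski(s, x, p) for s in S) for x in X)

def greedy_subset(X, m, p=2, seed=None):
    """Grow S one point at a time, always adding the point that gives the lowest d_max."""
    rng = random.Random(seed)
    S = [rng.choice(X)]  # assumed: random starting point
    while len(S) < m:
        candidates = [x for x in X if x not in S]
        best = min(candidates, key=lambda x: d_max(X, S + [x], p))
        S.append(best)
    return S

X = [(float(i), float(j)) for i in range(4) for j in range(4)]
S = greedy_subset(X, 4, seed=0)
print(S, d_max(X, S))
```

Each step evaluates \(d_{\max}\) once per remaining candidate, so this is cheap but can get stuck with early choices it never revisits.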

exchange algorithm

randomly initialize \(S\), then repeatedly swap a point in \(S\) with a point that is only in \(X\) (not already in \(S\)) whenever the swap lowers \(d_{\max}\)
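A minimal sketch of the exchange idea, assuming \(X\) holds distinct points and that we stop once no single swap lowers \(d_{\max}\). The stopping rule and the names `exchange_subset`, `m`, and `seed` are assumptions made for illustration.

```python
import random

def minkowski(a, b, p=2):
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def d_max(X, S, p=2):
    return max(min(minkowski(s, x, p) for s in S) for x in X)

def exchange_subset(X, m, p=2, seed=None):
    """Start from a random S and keep any swap that strictly lowers d_max."""
    rng = random.Random(seed)
    S = rng.sample(X, m)
    best = d_max(X, S, p)
    improved = True
    while improved:  # assumed stopping rule: a full pass with no improving swap
        improved = False
        for i in range(m):
            for x in X:
                if x in S:
                    continue  # only swap in points that are in X but not in S
                trial = S[:i] + [x] + S[i + 1:]
                score = d_max(X, trial, p)
                if score < best:
                    S, best, improved = trial, score, True
    return S

X = [(float(i), float(j)) for i in range(4) for j in range(4)]
S = exchange_subset(X, 4, seed=1)
print(S, d_max(X, S))
```

Since only strictly improving swaps are accepted, the loop terminates, and the result is never worse than the random starting subset.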