SU-CS161 OCT092025

Motivating Questions

Can we do better than \(O\qty(n \log n)\) sort?

  • What’s our model of computation?
  • What are some reasonable models of computation?

comparison based sorting

merge sort, quicksort, insertion sort.

  • you can’t access values directly
  • you can only compare two items (i.e. if something is better / smaller than another)

Any deterministic, comparison-based sorting algorithm must take \(\Omega\qty(n \log\qty(n))\) steps of computation.

Proof idea—any comparison based sorting algorithm gives rise to a decision tree. The decision tree must have this many steps. The structure of the tree is as follows: nodes are each comparison decision, each nodes has 2 children (one for each ordering between two items), leaf nodes represents the output. The deepest leaf gives the lower bound.

For input with length \(n\), it should have \(n!\) leafs depending on the choice of input ordering. The shallowest such tree would have \(\log \qty(n!)\) depth (i.e. a perfectly balanced tree). \(n!\) is about \(\qty(\frac{n}{e})^{n}\). Taking the log of that gives that our depth is likely \(n \log \qty(n)\).

heuristic based sorting

Many times the content of the thing we are sorting has well ordering (i.e. they are numbers). There are then \(O\qty(n)\) sorting algorithms.

counting sort

Find the maximum of input list is \(O\qty(n)\). Then allocate \(1…\max\) buckets, each a linked list. FIFO insert items into the buckets by iterating through the list again. Then dump the buckets in sorted order.

This is rather silly if you have a very large number (so you would initialize way to many sparsely-populated buckets).

radix sort

Insight: numbers that are “bigger” than each other has each digit equal or bigger than the other digit.

  1. run counting sort on the least significant digit
  2. run counting sort on the second least significant digit
  3. …. repeat ad nausium ….
  4. your list is sorted

And so, proof:

  • IH: after the kth iteration, the array is sorted by the first k significant digits.
  • base case: after 0th digit
  • induction: for two elements \(x,y \in A\) such that \(x<y\), we want to show that up to \(k\) digit is sorted
    • case 1: \(x_1 < y_1\), we desire \(x_2x_1 < y_2y_1\) (x would be in an earlier bucket than y after counting sort, so pulling out the list is just good)
    • case 2: \(x_1 = y_1\), we desire \(x_2 x_1 < y_2 y_1\) (because of FIFO condition, if the first digit was inserted based on \(x_1 < y_1\), then it will be pulled out ditto)
  • improving efficiency

    You can just start encoding the inputs in increasingly smaller bases.

  • general running time

    For \(n\) integers, \(M\) maximum value in input list, is base \(r\).

    • number of iterations: \(d = \lfloor \log_{r}\qty(M) \rfloor + 1\)
    • time per iterations \(O\qty(n+r)\) (\(r\) buckets, \(n\) items into each)
    • total time: \(O\qty(d\cdot\qty(n+r)) = O\qty(\qty(\lfloor \log_{r}\qty(M)\rfloor +1)\cdot\qty(n+r))\)

    We typically choose \(r = n\), which gives:

    \begin{equation} O\qty(n \cdot\qty( \lfloor \log_{n}\qty(M) \rfloor +1 )) \end{equation}

    • if \(M \leq n^{c}\) for constant \(c\), this would be \(O\qty(n)\)
    • if \(M = 2^{n}\), this is \(O\qty(n^{2})\)