Models of computation

Today

Why do we care about models?
Big-\(O\) notation
The complexity of sorting
Properties of data structures
The word RAM model
- Variations
- Limitations

Motivation

We want to design fast algorithms, but when is an algorithm fast?

Implement it, and measure the time.
- Extrapolate experiments to predict performance on larger \(n\).
- May or may not give understanding.
- Does not provide lower bounds.
Analyse exactly what happens on the CPU.
- Unwieldy; they are massive black boxes.
Instead: use a simplified abstract model of the CPU.
- Allows exact lower and upper bounds.

Big-O notation (recap)

Big-\(O\), or \(\mathcal O\) notation relates the asymptotic growth of two abstract functions to each other.

Definition 1 Big-O.

Given two functions \(f, g: \mathbb N \to \mathbb R_{\geq 0}\), we say that

\begin{align*} f(n) &= O(g(n)), \text{ or }\\ f(n)&\in O(g(n)) \end{align*}

when there exist constants \(n_0\in \mathbb N\) and \(M\in \mathbb R_{>0}\) such that for all \(n\geq n_0\): \[ f(n) \leq M\cdot g(n).\]
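As a worked instance of the definition (the function \(3n+5\) is an arbitrary illustration, not taken from the lecture): let \(f(n) = 3n+5\) and \(g(n) = n\). For \(n\geq 5\) we have \(5\leq n\), so

\[ f(n) = 3n + 5 \leq 3n + n = 4n = 4\cdot g(n). \]

The constants \(n_0 = 5\) and \(M = 4\) therefore witness \(3n+5 = O(n)\). Other choices work equally well, for example \(n_0 = 1\) and \(M = 8\).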

Example 1.

\begin{align*} n &= O(n) & \ln n &= O(\sqrt n) \\ n &= O(n^2) & (\lg n)^{100}\cdot n &= O(n^{1.0001}) \\ 1/n &= O(1) & \ln n! &= n \ln n - n + O(\ln n) = O(n \lg n) \end{align*}

Definition 2 Extended notation.

\(f(n) = o(g(n))\) is like \(<\): \[\lim_{n\to\infty} f(n)/g(n) = 0.\]
\(f(n) = O(g(n))\) is like \(\leq\): \[f(n) \leq M\cdot g(n) \quad \forall n\geq n_0.\]
\(f(n) = \Theta(g(n))\) is like \(=\): \[f(n) = O(g(n)) \text{ and } g(n) = O(f(n)).\]
\(f(n) \sim g(n)\) is a stronger \(\Theta\): \[\lim_{n\to\infty} f(n)/g(n) = 1.\]
\(f(n) = \Omega(g(n))\) is like \(\geq\): \[f(n) \geq M\cdot g(n) \quad \forall n\geq n_0, \text{ for some constant } M > 0.\]
\(f(n) = \omega(g(n))\) is like \(>\): \[\lim_{n\to\infty} f(n)/g(n) = \infty.\]

Quiz time!

What is the complexity of sorting?

\(O(n \lg n)\)?

\(O(n \lg w)\)?
\(O(n \sqrt {\lg n})\)?
\(O(n \sqrt {\lg \lg n})\)?
Linear \(O(n)\)?
Constant \(O(1)\)?
Something else?

All of the above!

\(O(n \lg n)\)
- Minimum number of comparisons is \(\Omega(n \lg n)\).
- Merge sort indeed uses \(O(n \lg n)\) of them.
- Or QuickSort, with \(O(n \lg n)\) comparisons in expectation.
\(O(n \lg w)\)
- Expected complexity of sorting with van Emde Boas trees
  (van Emde Boas 1975).
\(O(n \sqrt {\lg n})\)
- Fusion trees (Fredman and Willard 1993)
\(O(n \sqrt {\lg \lg n})\), \(O(n\sqrt{\lg w})\)
- Fastest randomized algorithm
  (Han and Thorup 2002)
Linear \(O(n)\)
- Linear space? Sure!
- Counting sort when all values are \(O(n)\).
- Radix sort on 64-bit integers.
- Randomized when \(w \geq (\lg n)^{2+\varepsilon}\)
  (Andersson et al. 1995)
Constant \(O(1)\)
- \(n \leq 2^{64}\) makes \(f(n)\) bounded.
- There’s only so many atoms in the universe.
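The linear-time case for small values can be sketched as follows. This is a minimal counting sort in Python, assuming the inputs are non-negative integers bounded by a known `max_value`; the function name and parameters are illustrative, not from the lecture.

```python
def counting_sort(values, max_value):
    """Sort non-negative integers in the range [0, max_value]."""
    counts = [0] * (max_value + 1)
    for v in values:
        counts[v] += 1              # one pass over the input: O(n)
    out = []
    for v, c in enumerate(counts):
        out.extend([v] * c)         # one pass over the value range: O(max_value + n)
    return out


data = [3, 0, 5, 3, 1, 5, 2]
# All values are at most len(data), so the total work is O(n).
print(counting_sort(data, max_value=len(data)))
```

No two elements are ever compared with each other, which is why the \(\Omega(n \lg n)\) comparison lower bound does not apply here.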

Figure 1: Algorithm 64: Quicksort (Hoare 1961). RIP Tony Hoare.

“Complexity”

Lower or upper bound?
- Complexity of a problem is at least \(\Omega(\dots)\).
- Complexity of an algorithm is at most \(O(\dots)\).
- “The complexity”: matching lower and upper bound.
Which metric?
- “Operations”
- Comparisons
- Memory accesses
- Memory touched
- Wall time
- Space usage
What input type?
- Integers
- Floats
- Abstract “objects”
What input properties?
- (Uniform?) random
- Worst-case
What algorithmic properties?
- Deterministic
- Randomized
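To make one combination of these choices concrete, the sketch below fixes the metric (comparisons), the input property (worst case over all orders), and the algorithm (a deterministic top-down merge sort). It brute-forces all \(n!\) inputs for a small \(n\) and compares the result with the information-theoretic lower bound \(\lceil \lg n! \rceil\). The helper names are hypothetical.

```python
import math
from itertools import permutations


def merge_sort(xs, counter):
    """Return a sorted copy of xs; counter[0] accumulates key comparisons."""
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left = merge_sort(xs[:mid], counter)
    right = merge_sort(xs[mid:], counter)
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        counter[0] += 1             # the only place where keys are compared
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def worst_case_comparisons(n):
    """Maximum number of comparisons over all n! input orders (small n only)."""
    worst = 0
    for perm in permutations(range(n)):
        counter = [0]
        assert merge_sort(perm, counter) == list(range(n))
        worst = max(worst, counter[0])
    return worst


n = 5
# Any comparison sort needs at least ceil(lg n!) comparisons in the worst case.
lower_bound = math.ceil(math.log2(math.factorial(n)))
print(n, lower_bound, worst_case_comparisons(n))
```

Changing any one of the three choices (for example counting memory accesses, or averaging over random inputs) gives a different number for the same code.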

Properties of data structures

Space usage

Let \(S\) be the minimum space required to represent some data.

Definition 3 Compact data structure.

A compact data structure uses \(O(S)\) bits of space.

Definition 4 Succinct data structure.

A succinct data structure uses \(S + o(S)\) bits of space.

Example: many Rank & Select data structures are succinct.

Definition 5 Implicit data structure.

An implicit data structure uses \(S + O(1)\) bits of space.

Example: bit arrays.

Dynamic data structures

Definition 6 Dynamic data structure.

Allows mutating the structure.

Example: hash tables.

Definition 7 Static data structure.

Build once, then read-only queries.

Example: minimal perfect hash functions.

Definition 8 Incremental data structure.

Insert-only; deletions not allowed.

Example: Bloom filters.
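A small sketch of why a Bloom filter is naturally insert-only: every item sets a few bits, and different items may share bits, so clearing the bits of one item could silently remove others. The class below is illustrative only; the parameters `num_bits` and `num_hashes` and the use of SHA-256 as the hash function are assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Insert-only approximate set: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0               # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by hashing the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "little") % self.num_bits

    def insert(self, item):
        for p in self._positions(item):
            self.bits |= 1 << p

    def might_contain(self, item):
        return all((self.bits >> p) & 1 for p in self._positions(item))


f = BloomFilter()
f.insert("rank")
f.insert("select")
print(f.might_contain("rank"), f.might_contain("select"))
```

There is deliberately no `delete` method: supporting it would need extra state such as per-position counters.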

Running time

From stronger to weaker:

Definition 9 Worst-case running time.

An algorithm has worst-case running time \(O(f(n))\) when for every possible input, it is guaranteed to finish in \(O(f(n))\) time.

Definition 10 Expected running time.

An algorithm has expected running time \(O(f(n))\) when for every possible input, the expected running time is \(O(f(n))\).

Definition 11 Running time with high probability.

An algorithm has running time \(O(f(n))\) with high probability (w.h.p.) when for every possible input, the running time is \(O(f(n))\) with probability \(1-o(1)\).

Definition 12 Amortized running time.

An operation on a data structure has amortized running time \(O(f(n))\) when, for every possible sequence of operations in which the \(i\)-th operation has amortized complexity \(f_i\), the total running time is \(O(\sum_i f_i)\).

Example: pushing on a vector is worst-case \(O(n)\), but amortized \(O(1)\).
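The vector example can be checked with a small simulation. This is a sketch assuming the common capacity-doubling strategy (real implementations may use other growth factors); the `Vector` class and its `copies` counter are hypothetical names for illustration.

```python
class Vector:
    """Growable array that doubles its capacity when full (illustrative only)."""

    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.slots = [None]
        self.copies = 0             # total elements moved during reallocations

    def push(self, x):
        if self.size == self.capacity:
            # Worst case for a single push: reallocate and copy everything, O(n).
            self.capacity *= 2
            new_slots = [None] * self.capacity
            for i in range(self.size):
                new_slots[i] = self.slots[i]
                self.copies += 1
            self.slots = new_slots
        self.slots[self.size] = x
        self.size += 1


v = Vector()
n = 1000
for i in range(n):
    v.push(i)
# Copies form the geometric sum 1 + 2 + 4 + ..., which stays below 2n.
print(v.copies, v.copies / n)
```

A single push can copy up to \(n\) elements, but the total over \(n\) pushes is less than \(2n\) copies, so the amortized cost per push is \(O(1)\).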

Models of computation

The RAM model

Definition 13 Random-access machine (RAM) model.

Models a machine that has an infinite list of registers.

Each register can store an unbounded (!) natural number!
Each register has an address.
Retrieving the register corresponding to an address takes constant time.

Question 1.

Any problems with this?

Answer 1.

Infinite memory, unbounded register contents, and constant-time access are strong assumptions.

The real RAM model

Definition 14.

Like the RAM model, but each register can store a real number.

The word RAM model

Definition 15.

Like the RAM model, but operates on words of \(w\) bits.

Registers are also \(w\) bits.
Constant-time \(w\)-bit arithmetic operations:
- bit-operations, addition, multiplication.

Evolved with CPUs to include: https://en.wikipedia.org/wiki/X86_Bit_manipulation_instruction_set

Popcount (popcnt), count trailing zeros (tzcnt, BMI1)
Parallel bit deposit (pdep), parallel bit extract (pext, BMI2)

Most commonly used model.
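For readers unfamiliar with these instructions, the following Python functions sketch their semantics on 64-bit words. They are reference implementations for illustration only: they loop over bits, whereas the hardware instructions (and the word RAM model) treat each as a single constant-time operation.

```python
MASK64 = (1 << 64) - 1


def popcount(x):
    """Number of 1-bits in the 64-bit word x."""
    return bin(x & MASK64).count("1")


def tzcnt(x):
    """Number of trailing zero bits; 64 when x is zero."""
    x &= MASK64
    if x == 0:
        return 64
    return (x & -x).bit_length() - 1    # x & -x isolates the lowest set bit


def pext(x, mask):
    """Gather the bits of x selected by mask into the low bits of the result."""
    result, out_pos = 0, 0
    mask &= MASK64
    while mask:
        lowest = mask & -mask
        if x & lowest:
            result |= 1 << out_pos
        out_pos += 1
        mask ^= lowest                  # clear the bit we just handled
    return result


def pdep(x, mask):
    """Scatter the low bits of x to the positions selected by mask."""
    result, in_pos = 0, 0
    mask &= MASK64
    while mask:
        lowest = mask & -mask
        if (x >> in_pos) & 1:
            result |= lowest
        in_pos += 1
        mask ^= lowest
    return result


print(popcount(0b1011), tzcnt(0b1000), pext(0b10110, 0b00110), pdep(0b1010, 0b11110000))
```

`pdep` and `pext` are inverse to each other on the masked bits: `pdep(pext(x, m), m) == x & m`.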

Question 2.

Any remaining problems with this?

Answer 2.

Word RAM still assumes constant-time access. Constant time \(w\)-bit multiplication?

We must talk about \(w\)

In practice, \(w=64\) is constant.
But the problem size \(n\) grows to infinity.
What if \(n > 2^w\)? Then we can’t even represent \(n\) in a single word.

Solution: we assume \[ \lg n \leq w. \] This has far-reaching implications!

The complexity of pop-counting \(n\) bits is sub-linear \(O(n / \lg n)\)!
Fusion trees can binary search on \(O(\sqrt w)\) \(w\)-bit integers in constant \(O(1)\) time!
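The pop-counting claim follows from processing one word per operation: \(n\) bits occupy \(\lceil n/w \rceil\) words, and \(\lg n \leq w\) gives \(n/w \leq n/\lg n\). A minimal sketch, assuming \(w = 64\) and using `bin(word).count("1")` as a stand-in for a constant-time popcount instruction (the function names are illustrative):

```python
W = 64


def pack_bits(bits):
    """Pack a list of 0/1 values into W-bit words, least significant bit first."""
    words = []
    for start in range(0, len(bits), W):
        word = 0
        for offset, b in enumerate(bits[start:start + W]):
            word |= b << offset
        words.append(word)
    return words


def count_ones(words):
    """Return (number of 1-bits, number of word operations used)."""
    total = 0
    for word in words:
        total += bin(word).count("1")   # models one constant-time popcnt
    return total, len(words)


bits = [1, 0, 1] * 100                  # n = 300 bits
ones, operations = count_ones(pack_bits(bits))
print(ones, operations)
```

Packing the bits is only needed to set up the example; in the model the input is assumed to be stored in words already.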

The pointer model

Definition 16 Pointer model.

RAM-model without the RAM:

Registers can only be accessed via direct pointers.

Used in the analysis of some heap-based priority queues.

The cell-probe model

Definition 17 Cell-probe model.

Like the RAM-model, but all operations apart from accessing memory are free.

Typically, an infinite cache is assumed, so that each address only has to be read once.
Used for lower bounds on the number of addresses that must be read to solve some task, and thus a lower bound on RAM-model complexity.

The RAM-model is unphysical

Infinite memory with constant time access is fundamentally not possible:
- Each bit takes some small volume \(v\) to store.
- Signals can only travel at the speed of light \(c\).
- In time \(t\), we can read from a sphere of radius \(tc\), containing \[\frac 43 \pi (tc)^3/v = \Theta(t^3)\quad \text{ bits}.\]

Observation 1 Latency.

The latency of a uniform random access into a memory of size \(n\) is at least \[\Omega(\sqrt[3]{n}).\]
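One way to fill in the step from the sphere volume to this bound, as a sketch with unspecified constants: suppose that within time \(t\) we can reach at most \(C t^3\) bits, for some constant \(C > 0\) depending on \(v\) and \(c\). The \(n/2\) bits closest to the processor then occupy a sphere that takes time at least

\[ t_{1/2} = \left(\frac{n}{2C}\right)^{1/3} \]

to cover. A uniform random access hits one of the remaining \(n/2\) bits with probability at least \(1/2\), and each of those has latency at least \(t_{1/2}\), so the expected latency is at least

\[ \frac 12 \left(\frac{n}{2C}\right)^{1/3} = \Omega(\sqrt[3]{n}). \]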

The RAM-model is unphysical (2)

Black holes have only quadratic information capacity: their entropy scales with the surface area, \(\Theta(r^2)\)!
- Filling a volume with cubic information is impossible.
- But in practice we are far away from this limit.
If each bit of memory requires some energy, we need to dissipate \(\Theta(r^3)\) heat through \(\Theta(r^2)\) of surface area.
- This is an actual bottleneck for CPUs! They are mostly flat (2.5D) for a reason!
  - AMD now has “3D V-Cache” which vertically stacks caches.

Observation 2 Black holes and cooling (handwavy).

We can store/cool at most \(O(r^2)\) bits in a sphere of radius \(r\), and so the latency of a uniform random access into a memory of size \(n\) is at least \[\Omega(\sqrt{n}).\]

See this blog: The Myth of RAM

The RAM-model is unrealistic

Green: latency of array indexing.
- Clearly not \(O(1)\)!
- More like \(\sqrt[3] n\) (dashed blue)
Black: latency of binary search.
- Clearly not \(O(\lg n)\) (solid blue)
Red: binary search with heap-layout
- \(<3\times\) slower than just indexing?!

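With the heap layout, successive steps of the binary search access regions of size \(1, 2, 4, \dots, n\), so the total latency is a geometric sum dominated by its last term: \(\sqrt[3]{1} + \sqrt[3]{2} + \sqrt[3]{4} + \dots+\sqrt[3]{n} = \Theta(\sqrt[3]n)\).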

TODO: \(\sqrt n\)-complexity model!

The I/O-complexity

Introduced by Aggarwal and Vitter (1988).

Definition 18 I/O-complexity.

Like the cell-probe model, the I/O-complexity only counts I/O operations.

Fast and free internal memory of \(M\) words.
Slow (infinite) external memory.
Transfers between them happen in blocks of \(B\) words; we count the number of these I/O-operations.

Theorem 1 I/O-complexity of sorting.

Sorting \(n\) word-sized integers has an I/O-complexity lower-bound of \[ \Omega\left(\frac n B \log_{M/B} \frac nB\right), \] and external merge-sort achieves this.
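To get a feel for the bound, the sketch below counts block transfers of a textbook external merge sort: one pass that sorts memory-sized chunks, followed by multiway merge passes. It is an illustrative cost model under stated assumptions (one block buffer per input run plus one for the output, so \(M \geq 3B\)), not the exact algorithm analysed by Aggarwal and Vitter; the function name is hypothetical.

```python
def external_merge_sort_ios(n, M, B):
    """Count block transfers of a simple external merge sort (cost model only).

    n: number of words to sort
    M: internal memory size in words
    B: block size in words
    """
    if M < 3 * B:
        raise ValueError("this sketch assumes M >= 3 * B")
    blocks = -(-n // B)             # ceil(n / B) blocks hold the input
    runs = -(-n // M)               # pass 0: sort chunks of M words in memory
    ios = 2 * blocks                # pass 0 reads and writes every block once
    fan_in = M // B - 1             # one buffer per input run, one for output
    while runs > 1:
        runs = -(-runs // fan_in)   # merge fan_in runs at a time
        ios += 2 * blocks           # each merge pass reads and writes all blocks
    return ios


print(external_merge_sort_ios(n=1_000_000, M=1000, B=10))
```

The number of merge passes is logarithmic with base roughly \(M/B\), and each pass costs \(\Theta(n/B)\) transfers, which is where the shape \(\frac nB \log_{M/B} \frac nB\) comes from.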

Takeaways

Observation 3.

We need precisely defined models to theoretically analyse algorithms.

Observation 4.

Models approximate reality; some do so better than others.

The rest of this course will mostly use the word RAM model to analyse both theoretically and practically efficient algorithms.

Next week: Bitvectors, Rank, & Select

References

Aggarwal, Alok, and Jeffrey S. Vitter. 1988. “The Input/Output Complexity of Sorting and Related Problems.” Communications of the ACM 31 (9): 1116–27. https://doi.org/10.1145/48529.48535.

Andersson, Arne, Torben Hagerup, Stefan Nilsson, and Rajeev Raman. 1995. “Sorting in Linear Time?” In Proceedings of the Twenty-Seventh Annual ACM Symposium on Theory of Computing - STOC ’95, 427–36. STOC ’95. ACM Press. https://doi.org/10.1145/225058.225173.

Emde Boas, P. van. 1975. “Preserving Order in a Forest in Less than Logarithmic Time.” In 16th Annual Symposium on Foundations of Computer Science (SFCS 1975). IEEE. https://doi.org/10.1109/sfcs.1975.26.

Fredman, Michael L., and Dan E. Willard. 1993. “Surpassing the Information Theoretic Bound with Fusion Trees.” Journal of Computer and System Sciences 47 (3): 424–36. https://doi.org/10.1016/0022-0000(93)90040-4.

Han, Yijie, and M. Thorup. 2002. “Integer Sorting in O(n√(log log n)) Expected Time and Linear Space.” In The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings., 135–44. SFCS-02. IEEE Comput. Soc. https://doi.org/10.1109/sfcs.2002.1181890.

Hoare, C. A. R. 1961. “Algorithm 64: Quicksort.” Communications of the ACM 4 (7): 321. https://doi.org/10.1145/366622.366644.

Khuong, Paul-Virak, and Pat Morin. 2017. “Array Layouts for Comparison-Based Searching.” ACM Journal of Experimental Algorithmics 22 (May): 1–39. https://doi.org/10.1145/3053370.

Possible exam questions

What is big-O notation? Name some of the different variants.
What are some problems with saying “the complexity of sorting”?
Give an example of a precisely specified complexity of an algorithm.
What is the difference between compact, succinct, and implicit data structures? Give an example for each of them.
What is the difference between worst-case, expected, and amortized running time?
Name 5 different models of computation.
What are some problems with the word RAM model? Why do we use it anyway? Is there a model that solves these issues?
What can you say about \(w\)?