Succinct trees
Previous slides: Rank & Select, Elias-Fano Link to heading
- Rank: counting 1 bits
- 2-levels of blocks of length \(\log n\) and \(\log^2 n\)
- Select: finding 1 bits
- 2-levels of blocks with \(\log^2 n\) 1-bits and \(\sqrt{\log n}\) 1-bits.
- There exist succinct \(o(n)\) space data structures for rank and select.
- Elias-Fano: efficient encoding of sorted list of integers
- Split in high and \(\ell\)-bit low part; \(\approx 0.5\) bit/elem overhead.
- Encode high parts using stars & bars:
1 for each element, 0 each time a multiple of \(2^\ell\) is crossed.
Today: Succinct representation of trees Link to heading
A.k.a.: applications of rank and select.
- What is a tree?
- How many trees on \(n\) nodes are there?
- How do we encode the tree?
- 3 options: LOUDS (BFS),
- balanced parentheses (in-order DFS),
- DFUDS (pre-order DFS)
- Operations on trees
- degree, parent, \(i\)’th child, subtree size
Application: in-place json parsing
Ordered trees Link to heading
An ordered tree is a tree with:
- \(n\) nodes, \(n-1\) edges,
- a single root,
- an order on the children of each node.
How many ordered trees are there at least? At most?
Bijection to balanced parentheses Link to heading
There is a bijection between ordered trees on \(n\) nodes and balanced parentheses (BP) sequences of \(n-1\) “(” and “)”, where each “(” has a matching “)”.
“Trace” the outline of the tree in DFS-order: each time an edge goes down, write “(”, and each time an edge goes up, write “)”.
Upper bound Link to heading
There are \(o(4^n)\) ordered trees on \(n\) nodes.
The number of BP sequences is at most \[\textsf{#trees} \leq \binom{2(n-1)}{n-1} = \Theta\left(\frac{2^{2n}}{\sqrt n}\right)\]
Lower bound Link to heading
There are \(\Omega(4^n/n^{1.5})\) ordered trees on \(n\) nodes.
Partition the \(\binom{2(n-1)}{n-1}\) possible BP sequences in equivalence classes under rotation of the string. Take a random element of any class. Find the prefix with the largest value of \(\mathsf{count}_)(i) - \mathsf{count}_((i)\), and consider the rotation starting after this prefix. This is a BP. Thus, each class has size \(\leq 2(n-1)\), and contains at least one BP:
\[\textsf{#trees} \geq \frac 1{2(n-1)}\binom{2(n-1)}{n-1} = \Omega\left(\frac{2^{2n}}{n^{1.5}}\right).\]
Space lower bound Link to heading
We need at least \[\frac 1n\log_2(4^n/n^{1.5}) = 2 - o(1)\] bits per node, and this is tight.
Represent trees using \(2 + o(1)\) bits per node, while supporting common operations.
Aside: Catalan numbers Link to heading
The exact number of ordered trees on \(n\) nodes is the Catalan number: \[ C_{n-1} := \frac1{n} \binom{2(n-1)}{n-1} \sim \frac{4^{n-1}}{n^{3/2} \sqrt \pi}. \]
Aside: Bijection to binary trees Link to heading
- Bijection between ordered trees and binary trees.
- Given a binary tree, transform it to an ordered tree like this:
- a left child of a node is a child in the ordered tree.
- a right child of a node is a sibling in the ordered tree.

Level Ordered Unary Degree Sequence (LOUDS) Link to heading
Start with 10.
Then append the degree of each node in level/BFS order in unary:
1 for each child, and then a 0.
10111100110011001100000a_bchiABdeCHjkIDfgEJKFG
- lowercase: introduce node as child.
- uppercase: end the list of children of the node.
\(2n+1\) bits:
- a 1 to add the node as child,
- an ’end’ marker 0 for each node,
- an extra 0.

Operations Link to heading
How to implement operations:
- degree,
- \(i\)’th child,
- parent,
- subtree size.
Advanced:
- depth,
- lowest common ancestor (LCA),
- rank (pre- or post-order),
- …
\[ \newcommand{\rank}{\mathsf{rank}} \newcommand{\select}{\mathsf{select}} \]
Implementing the operations Link to heading
10111100110011001100000a_bchiABdeCHjkIDfgEJKFG

Number nodes \(0\) to \(n-1\) in level/BFS-order.
- Position of 0 of \(u\): \(p_0(u):=\select_0(u+1)\)
- Position of 1 of \(u\): \(p_1(u):=\select_1(u)\)
- Children of \(u\): position \(\select_0(u)+1\) to \(\select_0(u+1)\)
- Degree of \(u\): \(\select_0(u+1) - \select_0(u) - 1\).
- \(i\)’th child of \(u\): (number of children before \(u) + i = \rank_1(\select_0(u)) + 1 + i\)
- Parent of \(u\): the node containing the \(u\)’th 1: \(\rank_0(\select_1(u))-1\)
- Roughly inverse of \(\mathsf{child}\)
- Subtree size: hard; data is all over the place
Balanced Parentheses (BP) Link to heading
Traverse tree in DFS-order.
Write a ( each time a node/subtree is first entered, and a ) a node/subtree
is left.
(()(()(()()))()(()()))1101101101000101101000 10 instead of ()abBcdDefFgGEChHijJkKIA
- lowercase: enter subtree
- uppercase: leave subtree
\(2n\) bits: a ( and ) for each node.
Note: the children of a are all over the place!
\(\rank\) and \(\select\) won’t work here.

Additional operations on BP Link to heading
- \(\mathsf{findclose(i)}\): given
(at pos \(i\), get the position of the matching). - \(\mathsf{findopen(i)}\): given
)at pos \(i\), get the position of the matching(. - \(\mathsf{enclose(i)}\): given
(at pos \(i\), get the position of the closest(left of it enclosing it.
Can all be done in \(O(1)\) time and \(o(n)\) space
Implementing the BP operations Link to heading
(()(()(()()))()(()()))1101101101000101101000 10 instead of ()abBcdDefFgGEChHijJkKIA

Number nodes \(0\) to \(n-1\) in DFS-order.
- Position of ( of \(u\): \(p=\select_((u)\)
- Subtree size of \(u\): \((\mathsf{findclose}(p) - p + 1)/2\)
- Parent of \(u\): \(\mathsf{enclose}(p)\)
- Degree, \(i\)’th child: hard (Navarro and Sadakane 2014)
Depth-First Unary Degree Sequence (DFUDS) Link to heading
Start with 1.
Then traverse the nodes in DFS order, and append for each its degree in unary:
a 1 for each child and then a 0.
Then convert 1 to ( and 0 to ) to get a BP sequence.
((((())(())(())))(()))1111100110011000011000aihcbABedCDgfEFGHkjIJK
- lowercase: introduce node as child.
- uppercase: end the list of children of the node.
\(2n\) bits: 1 to introduce each node,
0 to end the unary degree encoding of each node.

Implementing the DFUDS operations Link to heading
((((())(())(())))(()))1111100110011000011000aihcbABedCDgfEFGHkjIJK

- Position of 0 or ) of \(u\): \(q=\select_0(u)\)
- Degree of \(u\): \(\select_0(u) - \select_0(u-1) - 1\). (Edge case for \(u=0\) )
- \(i\)’th child (position of first 1-bit): next after the matching
): \(\mathsf{findclose}(q-1-i)+1\) - Parent of \(u\): get preceding ), then matching (, then succeeding ): \(\select_0(\rank_0(\mathsf{findopen}(\select_0(u-1))))\)
- Subtree size of \(u\), starting at first bit \(p\): find succeeding
)that leaves the subtree: \((\mathsf{findclose}(\mathsf{enclose}(p)) - p)/2 + 1\)
Implementing findclose and enclose Link to heading
Uses a segment tree (next week) over \(\mathsf{excess}(i) = \rank_((i) - \rank_)(i)\).
- matches depth in the tree
\(\mathsf{findclose}(i) = \min\{j > i: \mathsf{excess}(j)=\mathsf{excess}(i-1)\}\)
\(\mathsf{findopen}(i) = \max\{j < i: \mathsf{excess}(i)=\mathsf{excess}(j-1)\}\)
\(\mathsf{enclose}(i) = \max\{j < i: \mathsf{excess}(i)=\mathsf{excess}(j-1)+2\}\)
Each node of the segment tree stores the difference of ( and ) over its range, and the minimum depth node it contains.
Further reading / sources Link to heading
- Florian Kurpicz’ slides from 2024
- Wikipedia on Catalan numbers
- Wikipedia on binary trees
- Wikipedia on binary tree equivalence
- Paper on succinct trees: Navarro, Gonzalo, and Kunihiko Sadakane. 2014. “Fully Functional Static and Dynamic Succinct Trees.” Acm Transactions on Algorithms 10 (3): 1–39. https://doi.org/10.1145/2601073.
Bibliography Link to heading
References Link to heading
Possible exam questions Link to heading
- How many bit are needed to succinctly encode an ordered tree?
- Why is this sufficient?
- Why is this required?
- What are three different ways to succinctly encode an ordered tree?
- How do they relate?
- How do they differ?
- Which operations are (not) easily supported by each variant?
- Why is it hard to support both ‘subtree size’ and ‘\(i\)’th child’ queries at the same time?