The (lost?) art of suffix trees of repetitive texts
Suffixient arrays, Cenzato+
Goal: \(O(r)\) space, \(O(d+m/B+\mathit{occ})\) I/Os
STPD: \(O(r)\) space, \(O(d\color{red}{\cdot \log m}+m/B+\mathit{occ})\) I/Os
Locate one 19x chr19 Locate all
Locate one 19x chr19 Locate all
Substring x for which at least two extensions xA and xB exist in \(T\).
Aka: suffix-tree nodes.
Substring x for which at least two extensions xA and xB exist in \(T\).
Aka: suffix-tree nodes.
Extension xA of an RMS x.
Aka: suffix-tree edges.
A set of text positions \(S\) sorted in co-lex order, such that for each RME xA, there is \(i\in S\) s.t.
\(T[.. i)\) ends in xA.
\(|S|\leq 2r\) is possible. Note that access to the (compressed) text is needed.
Start greedily matching at start of \(T\). Say that prefix \(P[..j)\) of \(P\) matches.
If the extension \(P[..j]\) occurs in \(T\), it is an RME, and we can binary search a prefix from \(S\) that ends in it. Jump there and repeat.
Start greedily matching at start of \(T\). Say that prefix \(P[..j)\) of \(P\) matches.
If the extension \(P[..j]\) occurs in \(T\), it is an RME, and we can binary search a prefix from \(S\) that ends in it. Jump there and repeat.
Start greedily matching at start of \(T\). Say that prefix \(P[..j)\) of \(P\) matches.
If the extension \(P[..j]\) occurs in \(T\), it is an RME, and we can binary search a prefix from \(S\) that ends in it. Jump there and repeat.
Start greedily matching at start of \(T\). Say that prefix \(P[..j)\) of \(P\) matches.
If the extension \(P[..j]\) occurs in \(T\), it is an RME, and we can binary search a prefix from \(S\) that ends in it. Jump there and repeat.
A set of text positions \(S\) sorted in co-lex order, such that for each RME xA, there is \(i\in S\) s.t.
\(T[.. i)\) ends in xA.
Note that access to the (compressed) text is needed.
The set of text positions \(S\), such that for each RME xA
that is not the lefmost occurrence of
x, there is \(i\in S\) s.t. \(T[.. i)\) is the lefmost occurrence of xA.
| (in millions) | influenza | cere | escherichia |
| n | 155.0 | 461.0 | 112.0 |
| BWT runs \(r\) | 3.0 | 11.6 | 15.0 |
| suffixient set \(\chi\leq 2r\) | 2.2 | 9.9 | 13.1 |
| STPD: leftmost | 1.9 | 8.9 | 11.7 |
| STPD: lex min \(\leq r\) | 1.8 | 7.5 | 9.8 |
Outlook: