Lablog on CuriousCoding

[WIP] Progress on fast suffix array searching

Tue, 01 Oct 2024 00:00:00 +0000

Here’s a lablog.

Background

Compare with "Suffix arrays with a twist": https://www.cai.sk/ojs/index.php/cai/article/view/2019_3_555
Compare with https://github.com/mranisz/sa, which is based on "Compact and hash based variants of the suffix array":
- https://journals.pan.pl/dlibra/publication/121376/edition/105762/content

Here’s a bike

A figure of a bike.

Binary searching

Eytzinger

Btrees

Multithreading

Mod-minimizers and other minimizers

Thu, 18 Jan 2024 00:00:00 +0100

\[ \newcommand{\d}{\mathrm{d}} \newcommand{\L}{\mathcal{L}} \]

This post introduces some background on minimizers and some experiments for a new minimizer variant. That new variant is now called the mod-minimizer and was published at WABI24 (Groot Koerkamp and Pibiri 2024). The post also includes a review of existing methods, with pseudocode for most of the methods covered below.

One Billion Row Challenge

Wed, 03 Jan 2024 00:00:00 +0100

Table of Contents

External links
The problem
Initial solution: 105s
First flamegraph
Bytes instead of strings: 72s
Manual parsing: 61s
Inline hash keys: 50s
Faster hash function: 41s
A new flame graph
Perf it is
Something simple: allocating the right size: 41s
memchr for scanning: 47s
memchr crate: 29s
get_unchecked: 28s
Manual SIMD: 29s
Profiling
Revisiting the key function: 23s
PtrHash perfect hash function: 17s
Larger masks: 15s
Reduce pattern matching: 14s
Memory map: 12s
Parallelization: 2.0s
Branchless parsing: 1.7s
Purging all branches: 1.67s
Some more attempts
Faster perfect hashing: 1.55s
Bug time: Back up to 1.71s
Temperatures less than 100: 1.62s
Computing min as a max: 1.50s
Intermezzo: Hyperthreading: 1.34s
Not parsing negative numbers: 1.48s
More efficient parsing: 1.44s
Fixing undefined behaviour: back to 1.56s
Lazily subtracting b'0': 1.52s
Min/max without parsing: 1.55s
Parsing using a single multiplication: doesn’t work
Parsing using a single multiplication does work after all! 1.48s
A side note: ASCII
Skip parsing using PDEP: 1.42s
- Improved
- A further note
Branchy min/max: 1.37s
No counting: 1.34s
Arbitrary long city names: 1.34s
4 entries in parallel: 1.23s
Mmap per thread
Reordering some operations: 1.19s
Reordering more: 1.11s
Even more ILP: 1.05s
Compliance 1, OK I’ll count: 1.06s
TODO
Postscript

Since everybody is doing it, I’m also going to take a stab at the One Billion Row Challenge.

Notes on implementing Longest Common Repeat (LCR)

Wed, 06 Dec 2023 00:00:00 +0100

Table of Contents

Notes
Discussion / TODOs
- Evals

These are my running notes on implementing an algorithm for Longest Common Repeat using minimizers.

Notes

Coloured Tree Problem

See Lemma 3 here.

Generic sparse suffix array

For random strings and $b \leq n / \log n$, direct radix sort on $2\log n + \log\log n$-bit prefixes is sufficient for $O(n)$ runtime. In fact, since computer word size $w\geq \log n$, we only need at most $2$ rounds of radix sort! (See simple-saca.)
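The prefix-sorting idea can be sketched as follows. This is a minimal illustration under my own assumptions, not the code from simple-saca: the names `prefix_key` and `sparse_suffix_array` are hypothetical, a fixed 8-byte prefix stands in for the $2\log n + \log\log n$-bit prefix, and the radix sort uses four 16-bit passes rather than the two rounds mentioned above. Suffixes whose prefixes collide (rare for random strings) fall back to a full comparison, so the result is correct for any input.

```rust
/// Pack the first 8 bytes of the suffix starting at `pos` into a big-endian
/// u64, zero-padding past the end of the text. (Hypothetical helper; the
/// fixed 8-byte prefix length is an assumption of this sketch.)
fn prefix_key(text: &[u8], pos: usize) -> u64 {
    let mut key = 0u64;
    for i in 0..8 {
        let byte = text.get(pos + i).copied().unwrap_or(0);
        key = (key << 8) | byte as u64;
    }
    key
}

/// Sort the suffixes of `text` that start at the given `positions`.
/// Assumes every position is at most `text.len()`.
fn sparse_suffix_array(text: &[u8], positions: &[usize]) -> Vec<usize> {
    let mut items: Vec<(u64, usize)> = positions
        .iter()
        .map(|&p| (prefix_key(text, p), p))
        .collect();
    let mut buf = vec![(0u64, 0usize); items.len()];

    // LSD radix sort: four stable counting-sort passes over 16-bit digits,
    // from the least significant digit to the most significant one.
    for pass in 0..4 {
        let shift = 16 * pass;
        let mut counts = vec![0usize; (1 << 16) + 1];
        for &(key, _) in &items {
            counts[((key >> shift) & 0xFFFF) as usize + 1] += 1;
        }
        // Prefix sums: counts[d] becomes the first output slot for digit d.
        for d in 0..(1 << 16) {
            counts[d + 1] += counts[d];
        }
        for &item in &items {
            let d = ((item.0 >> shift) & 0xFFFF) as usize;
            buf[counts[d]] = item;
            counts[d] += 1;
        }
        std::mem::swap(&mut items, &mut buf);
    }

    // Fallback: runs of identical prefix keys are ordered by comparing the
    // full suffixes. For random text these runs are expected to be rare.
    let mut start = 0;
    while start < items.len() {
        let mut end = start + 1;
        while end < items.len() && items[end].0 == items[start].0 {
            end += 1;
        }
        if end - start > 1 {
            items[start..end].sort_by(|a, b| text[a.1..].cmp(&text[b.1..]));
        }
        start = end;
    }

    items.into_iter().map(|(_, p)| p).collect()
}
```

Calling `sparse_suffix_array(text, &sampled_positions)` returns the sampled positions in suffix order. The fallback step is what keeps the sketch correct on repetitive inputs, at the cost of losing the linear-time guarantee there.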

PTRHash: Notes on adapting PTHash in Rust

Thu, 21 Sep 2023 00:00:00 +0200

Table of Contents

Questions and remarks on PTHash paper
Ideas for improvement
Implementation log
PtrHash, part 2
- Phobic
  - TODO for PTRhash

\[ %\newcommand{\mm}{\,\%\,} \newcommand{\mm}{\bmod} \newcommand{\lxor}{\oplus} \newcommand{\K}{\mathcal K} \]