<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Highlight on CuriousCoding</title><link>https://curiouscoding.nl/tags/highlight/</link><description>Recent content in Highlight on CuriousCoding</description><generator>Hugo</generator><language>en</language><lastBuildDate>Sun, 25 Jan 2026 00:00:00 +0100</lastBuildDate><atom:link href="https://curiouscoding.nl/tags/highlight/index.xml" rel="self" type="application/rss+xml"/><item><title>CI for Rust testing and releasing</title><link>https://curiouscoding.nl/posts/release-flow/</link><pubDate>Sun, 25 Jan 2026 00:00:00 +0100</pubDate><guid>https://curiouscoding.nl/posts/release-flow/</guid><description>&lt;div class="ox-hugo-toc toc has-section-numbers"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1&lt;/span&gt; &lt;a href="#testing" &gt;Testing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2&lt;/span&gt; &lt;a href="#releasing-libraries" &gt;Releasing libraries&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.1&lt;/span&gt; &lt;a href="#changelog" &gt;Changelog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.2&lt;/span&gt; &lt;a href="#cargo-release" &gt;&lt;code&gt;cargo release&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.3&lt;/span&gt; &lt;a href="#cratesio" &gt;crates.io&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3&lt;/span&gt; &lt;a href="#releasing-binaries" &gt;Releasing binaries&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.1&lt;/span&gt; &lt;a href="#cratesio-bin" &gt;Crates.io&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.2&lt;/span&gt; &lt;a href="#avx2" &gt;The pain of AVX2&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.2.1&lt;/span&gt; &lt;a href="#ensure-simd" &gt;&lt;code&gt;ensure_simd&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.3&lt;/span&gt; &lt;a href="#profile-selection" &gt;Profile selection&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.4&lt;/span&gt; &lt;a href="#github" &gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.5&lt;/span&gt; &lt;a href="#binstall" &gt;&lt;code&gt;cargo binstall&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.6&lt;/span&gt; &lt;a href="#pypi" &gt;PyPI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.7&lt;/span&gt; &lt;a href="#bioconda" &gt;Bioconda&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4&lt;/span&gt; &lt;a href="#conclusion" &gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;This post will collect template files for setting up
GitHub CI for testing and releasing Rust libraries and binaries to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;crates.io (&lt;a href="#cratesio" &gt;libraries&lt;/a&gt;, &lt;a href="#cratesio-bin" &gt;binaries&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;a href="#github" &gt;GitHub releases&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#binstall" &gt;cargo binstall&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#pypi" &gt;PyPI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#bioconda" &gt;Bioconda&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We also have some workarounds for dealing with &lt;a href="#avx2" &gt;AVX2 SIMD instructions&lt;/a&gt;.&lt;/p&gt;</description></item><item><title>Trying to understand DDR memory</title><link>https://curiouscoding.nl/posts/ddr/</link><pubDate>Tue, 20 Jan 2026 00:00:00 +0100</pubDate><guid>https://curiouscoding.nl/posts/ddr/</guid><description>&lt;div class="ox-hugo-toc toc has-section-numbers"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1&lt;/span&gt; &lt;a href="#questions" &gt;Questions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2&lt;/span&gt; &lt;a href="#a-load-of-articles-blogs-pages-to-read" &gt;A load of articles/blogs/pages to read&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.1&lt;/span&gt; &lt;a href="#wikipedia-articles" &gt;Wikipedia articles&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.2&lt;/span&gt; &lt;a href="#more-posts" &gt;More posts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.3&lt;/span&gt; &lt;a href="#notes" &gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.4&lt;/span&gt; &lt;a href="#my-own-ram" &gt;My own RAM&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.5&lt;/span&gt; &lt;a href="#continued-notes" &gt;Continued notes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.6&lt;/span&gt; &lt;a href="#address-mapping-notation" &gt;Address mapping notation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.7&lt;/span&gt; &lt;a href="#intel-spec" &gt;Intel spec&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.8&lt;/span&gt; &lt;a href="#rank-interleaving" &gt;Rank interleaving&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.9&lt;/span&gt; &lt;a href="#nontemporal-reads-writes" &gt;Nontemporal reads/writes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3&lt;/span&gt; &lt;a href="#remap-using-performance-counters" &gt;reMap: using Performance counters&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4&lt;/span&gt; &lt;a href="#sudoku" &gt;Sudoku&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.1&lt;/span&gt; &lt;a href="#step-1-dram-addressing-functions" &gt;Step 1: DRAM addressing functions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.2&lt;/span&gt; &lt;a href="#step-2-row-column-bits" &gt;Step 2: row/column bits&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.3&lt;/span&gt; &lt;a href="#step-3-validation" &gt;Step 3: validation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.4&lt;/span&gt; &lt;a href="#step-4-which-function-is-what" &gt;Step 4: which function is what?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.5&lt;/span&gt; &lt;a href="#refreshes" &gt;Refreshes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.6&lt;/span&gt; &lt;a href="#consecutive-accesses" &gt;Consecutive Accesses&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5&lt;/span&gt; &lt;a href="#sudoku-now-with-only-1-dimm" &gt;Sudoku, now with only 1 DIMM&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.1&lt;/span&gt; &lt;a href="#setup" &gt;setup&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.2&lt;/span&gt; &lt;a href="#1-dot-reverse-functions" &gt;1. reverse functions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.3&lt;/span&gt; &lt;a href="#2-dot-identify-bits" &gt;2. identify bits&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.4&lt;/span&gt; &lt;a href="#3-dot-validate-mapping" &gt;3. validate mapping&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.5&lt;/span&gt; &lt;a href="#4-dot-decompose-functions" &gt;4. decompose functions&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6&lt;/span&gt; &lt;a href="#results" &gt;Final results&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;7&lt;/span&gt; &lt;a href="#decode-dimms" &gt;&lt;code&gt;decode-dimms&lt;/code&gt;&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;7.1&lt;/span&gt; &lt;a href="#bank-groups" &gt;Bank groups&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;7.2&lt;/span&gt; &lt;a href="#refresh" &gt;Refresh&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;7.3&lt;/span&gt; &lt;a href="#random-access-throughput" &gt;Random access throughput&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;8&lt;/span&gt; &lt;a href="#cpu-benchmarks" &gt;CPU benchmarks&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;8.1&lt;/span&gt; &lt;a href="#cpu-benchmarks" &gt;cpu-benchmarks&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;8.1.1&lt;/span&gt; &lt;a href="#random-access-throughput-1-dimm" &gt;random access throughput 1 DIMM&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;8.1.2&lt;/span&gt; &lt;a href="#random-access-throughput-2-dimm" &gt;random access throughput 2 DIMM&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;8.2&lt;/span&gt; &lt;a href="#memory-read-experiment" &gt;memory-read-experiment&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;8.2.1&lt;/span&gt; &lt;a href="#strided-reading-1-dimm" &gt;strided reading 1 DIMM&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;8.2.2&lt;/span&gt; &lt;a href="#strided-reading-2-dimm" &gt;strided reading 2 DIMM&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;9&lt;/span&gt; &lt;a href="#tinymembench" &gt;&lt;code&gt;tinymembench&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;10&lt;/span&gt; &lt;a href="#remaining-questions" &gt;Remaining questions&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;These are chronological (and thus, only lightly organized) notes on my attempt to
understand how DDR4 and DDR5 memory work.&lt;/p&gt;</description></item><item><title>Asymptotic elevators</title><link>https://curiouscoding.nl/posts/asymptotic-elevators/</link><pubDate>Mon, 22 Dec 2025 00:00:00 +0100</pubDate><guid>https://curiouscoding.nl/posts/asymptotic-elevators/</guid><description>&lt;p&gt;I was listening to an episode of the &lt;em&gt;Well There&amp;rsquo;s Your Problem&lt;/em&gt; podcast about
pencil towers (&lt;a href="https://www.youtube.com/watch?v=BvMYplJ59TE&amp;amp;t=11297s" class="external-link" target="_blank" rel="noopener"&gt;youtube&lt;/a&gt;), and it had a section on how elevators are a problem because they
require a lot of space. So here&amp;rsquo;s a mathematical version of that.&lt;/p&gt;
&lt;h3 id="problem-statement"&gt;
 Problem statement
 &lt;a class="heading-link" href="#problem-statement"&gt;
 &lt;i class="fa-solid fa-link" aria-hidden="true" title="Link to heading"&gt;&lt;/i&gt;
 &lt;span class="sr-only"&gt;Link to heading&lt;/span&gt;
 &lt;/a&gt;
&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Given are \(n\) people that need to go to floors \(1, 2, \dots, n\).&lt;/li&gt;
&lt;li&gt;Elevators have constant acceleration, and must be standing still to
enter/exit.&lt;sup id="fnref:1"&gt;&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref"&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
&lt;li&gt;Possible elevator configurations:
&lt;ol&gt;
&lt;li&gt;single elevator over entire height&lt;/li&gt;
&lt;li&gt;partition the height in disjoint intervals, and then one elevator per interval&lt;/li&gt;
&lt;li&gt;double-deck: a single elevator that is \(h\) floors high&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Not&lt;/em&gt; allowed: two free-moving elevators above each other that make sure to
never bump into each other.&lt;/li&gt;
&lt;li&gt;Elevators have infinite capacity.&lt;/li&gt;
&lt;li&gt;There are \(k\) elevator shafts.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Question:&lt;/strong&gt; How much total travel time do you need to get everyone home if
everybody arrives at the same time?&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Harder(?):&lt;/strong&gt; What if the people arrive in a random permutation (1 per time
step), and their clock starts ticking as soon as they arrive?&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="observations"&gt;
 Observations
 &lt;a class="heading-link" href="#observations"&gt;
 &lt;i class="fa-solid fa-link" aria-hidden="true" title="Link to heading"&gt;&lt;/i&gt;
 &lt;span class="sr-only"&gt;Link to heading&lt;/span&gt;
 &lt;/a&gt;
&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Going \(n\) floors up takes \(\Omega(\sqrt n)\) time: with constant acceleration, the distance covered grows quadratically with time.&lt;/li&gt;
&lt;li&gt;Total travel time is therefore at least \(\sum_{i=1}^n \Omega(\sqrt i) = \Omega(n \sqrt n)\).&lt;/li&gt;
&lt;li&gt;\(1\) elevator carrying everyone going 1 step at a time: \(O(n^2)\) total time&lt;/li&gt;
&lt;li&gt;\(2\)-elevator sqrt-decomposition: one elevator stops every \(\sqrt n\) floors,
followed by a set of second elevators that cover the final (up to) \(\sqrt n\) floors.
&lt;ul&gt;
&lt;li&gt;first elevator: \(n\) people times \(n/(\sqrt n)/2\) &amp;lsquo;big steps&amp;rsquo; on average
times \(\sqrt{\sqrt n}\) time per big step is \(O(n^{7/4})\)&lt;/li&gt;
&lt;li&gt;second elevator: \(n\) people times \(\sqrt n\) &amp;lsquo;small steps&amp;rsquo; on average
times \(1\) per small step is \(O(n^{3/2})\)&lt;/li&gt;
&lt;li&gt;so overall the big steps dominate&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;\(2\)-elevator with block size \(B\): one elevator that stops every \(B\) floors, and then
second elevators that each cover \(B\) floors and stop at every floor:
&lt;ul&gt;
&lt;li&gt;the first one: \(n \cdot (n/B) \cdot \sqrt B = n^2/\sqrt B\)&lt;/li&gt;
&lt;li&gt;the second one: \(n \cdot B = nB\).&lt;/li&gt;
&lt;li&gt;solve for equality: \(nB = n^2/\sqrt B\), i.e. \(B = n/\sqrt B\), so \(B = n^{2/3}\)&lt;/li&gt;
&lt;li&gt;so this gives an \(O(n^{1+2/3}) = O(n^{5/3})\) solution&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;\(\lg n\) elevator binary tree:
&lt;ul&gt;
&lt;li&gt;Let \(k\) be such that \(2^k \leq n &amp;lt; 2^{k+1}\).&lt;/li&gt;
&lt;li&gt;Take the first elevator to floor \(2^k\) if needed: time \(\sqrt{2^k} = O(\sqrt n)\).&lt;/li&gt;
&lt;li&gt;There, take a second elevator that goes up \(2^{k-1}\) floors if needed.&lt;/li&gt;
&lt;li&gt;Then go \(2^{k-2}\) floors up if needed,&lt;/li&gt;
&lt;li&gt;and so on until the last floor.&lt;/li&gt;
&lt;li&gt;Total time per person averages \(\sqrt{2^k}/2 + \sqrt{2^{k-1}}/2 + \dots + \sqrt{2}/2 + \sqrt{1}/2 =
O(\sqrt{2^k}) = O(\sqrt n)\), so up-to-a-constant optimal total travel
time \(O(n \sqrt n)\).&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
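&lt;p&gt;The estimates above can be checked numerically. The following is a minimal Python sketch and not code from the original post: the function names are hypothetical, constant factors (such as the \(1/2\) from averaging) are dropped, and travelling \(d\) floors is assumed to cost \(\sqrt d\) time.&lt;/p&gt;

```python
import math


def two_level_cost(n, b):
    """Total time of the two-level scheme with block size b (constants dropped).

    Returns (express, local):
      express = n people * (n / b) big steps * sqrt(b) time per big step
      local   = n people * b unit-time small steps
    """
    express = n * (n / b) * math.sqrt(b)
    local = n * b
    return express, local


def balanced_block(n):
    """Block size B = n^(2/3), at which the express and local costs are equal."""
    return n ** (2 / 3)


def tree_time_per_person(k):
    """Average per-person time in the binary-tree scheme:
    sqrt(2^k)/2 + sqrt(2^(k-1))/2 + ... + sqrt(1)/2."""
    return sum(math.sqrt(2 ** j) / 2 for j in range(k + 1))


if __name__ == "__main__":
    n = 10 ** 6
    for name, b in [("sqrt", math.sqrt(n)), ("balanced", balanced_block(n))]:
        express, local = two_level_cost(n, b)
        print(name, "express:", f"{express:.3g}", "local:", f"{local:.3g}")
    k = 20
    # Ratio of the per-person time to sqrt(2^k); it stays bounded by a constant.
    print("tree ratio:", tree_time_per_person(k) / math.sqrt(2 ** k))
```

&lt;p&gt;With block size \(\sqrt n\) the express term dominates at \(n^{7/4}\), while the balanced choice \(B = n^{2/3}\) makes both terms equal to \(n^{5/3}\). For the binary tree, the per-person sum is a geometric series, so its ratio to \(\sqrt{2^k}\) stays below a constant of about \(1.71\).&lt;/p&gt;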
&lt;p&gt;&lt;strong&gt;Open questions:&lt;/strong&gt;&lt;/p&gt;</description></item><item><title>Overview of static data structures</title><link>https://curiouscoding.nl/posts/static-data-structures/</link><pubDate>Wed, 17 Dec 2025 00:00:00 +0100</pubDate><guid>https://curiouscoding.nl/posts/static-data-structures/</guid><description>&lt;div class="ox-hugo-toc toc has-section-numbers"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1&lt;/span&gt; &lt;a href="#classification-of-static-data-structures" &gt;Classification of static data structures&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2&lt;/span&gt; &lt;a href="#space-lower-bounds-and-practical-approaches" &gt;Space lower bounds and practical approaches&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.1&lt;/span&gt; &lt;a href="#rank" &gt;Rank&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.2&lt;/span&gt; &lt;a href="#rank-plus-select" &gt;Rank + Select&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.3&lt;/span&gt; &lt;a href="#minimal-perfect-hash-function--mphf" &gt;Minimal perfect hash function (MPHF)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.4&lt;/span&gt; &lt;a href="#monotone-mphf" &gt;&lt;span class="org-todo todo TODO"&gt;TODO&lt;/span&gt; Monotone MPHF&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.5&lt;/span&gt; &lt;a href="#order-preserving-mphf" &gt;&lt;span class="org-todo todo TODO"&gt;TODO&lt;/span&gt; Order-preserving MPHF&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.6&lt;/span&gt; &lt;a href="#static-retrieval-static-function-with-static-values" &gt;Static retrieval: Static function with static values&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.7&lt;/span&gt; &lt;a href="#updatable-retrieval-static-function-with-mutable-values" &gt;Updatable retrieval: Static function with mutable values&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.8&lt;/span&gt; &lt;a href="#static-set--membership" &gt;Static set (membership)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.9&lt;/span&gt; &lt;a href="#static-ordered-set" &gt;Static ordered set&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.10&lt;/span&gt; &lt;a href="#static-dictionary-static-keys-and-values" &gt;Static dictionary: static keys and values&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.11&lt;/span&gt; &lt;a href="#updatable-dictionary-with-mutable-values" &gt;Updatable dictionary with mutable values&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.12&lt;/span&gt; &lt;a href="#dynamic-dictionary-with-mutable-keys-and-values" &gt;&lt;span class="org-todo todo TODO"&gt;TODO&lt;/span&gt; Dynamic dictionary with mutable keys and values&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.13&lt;/span&gt; &lt;a href="#static-filter" &gt;&lt;span class="org-todo todo TODO"&gt;TODO&lt;/span&gt; Static filter&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.14&lt;/span&gt; &lt;a href="#ordered-static-updatable-dynamic-dictionary" &gt;&lt;span class="org-todo todo TODO"&gt;TODO&lt;/span&gt; Ordered static/updatable/dynamic dictionary?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3&lt;/span&gt; &lt;a href="#summary" &gt;Summary table&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;\[
\newcommand{\K}{\mathbb K}
\newcommand{\V}{\mathbb V}
\newcommand{\c}[1]{\mathbf{\mathsf{#1}}}
\]&lt;/p&gt;</description></item><item><title>Three log scientist</title><link>https://curiouscoding.nl/posts/three-log-scientist/</link><pubDate>Tue, 12 Aug 2025 00:00:00 +0200</pubDate><guid>https://curiouscoding.nl/posts/three-log-scientist/</guid><description>&lt;p&gt;A rating system for theoretical computer scientists.
The more logarithms there are (i.e. the more &amp;ldquo;\(\log\)&amp;rdquo; before your variables),
the higher your reputation will be.
No-log theoretical computer scientists are virtually non-existent, as virtually
all non-trivial algorithms require use of logarithms.
Most are one-log scientists.
In the old times (well, I&amp;rsquo;m young, so these look like old times to me at least), one would occasionally find a piece of code done by a three-log scientist and shiver with awe.&lt;/p&gt;</description></item><item><title>Beyond Global Alignment</title><link>https://curiouscoding.nl/posts/mapping/</link><pubDate>Mon, 24 Mar 2025 00:00:00 +0100</pubDate><guid>https://curiouscoding.nl/posts/mapping/</guid><description>&lt;div class="ox-hugo-toc toc has-section-numbers"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1&lt;/span&gt; &lt;a href="#semi-global-variants" &gt;Variants of semi-global alignment&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2&lt;/span&gt; &lt;a href="#text-searching" &gt;Fast text searching&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.1&lt;/span&gt; &lt;a href="#skip-cost-for-overlap-alignments" &gt;Skip-cost for overlap alignments&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.2&lt;/span&gt; &lt;a href="#search-results" &gt;Results&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3&lt;/span&gt; &lt;a href="#mapping" &gt;Mapping using A*Map&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.1&lt;/span&gt; &lt;a href="#seeding" &gt;Seeding&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.2&lt;/span&gt; &lt;a href="#chaining" &gt;Chaining&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.3&lt;/span&gt; &lt;a href="#aligning" &gt;Aligning&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.4&lt;/span&gt; &lt;a href="#a-map" &gt;A*Map&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.5&lt;/span&gt; &lt;a href="#results" &gt;Results&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;This is Chapter 5 of my thesis (&lt;a href="#citeproc_bib_item_11"&gt;Groot Koerkamp 2025&lt;/a&gt;).&lt;/p&gt;
&lt;hr&gt;
&lt;div class="notice summary"&gt;
 &lt;div class="notice-title"&gt;
 &lt;i class="fa-solid " aria-hidden="true"&gt;&lt;/i&gt;Summary
 &lt;/div&gt;
 &lt;div class="notice-content"&gt;
&lt;p&gt;So far, we have considered only algorithms for &lt;em&gt;global&lt;/em&gt; alignment.
In this chapter, we consider &lt;em&gt;semi-global&lt;/em&gt; alignment and its variants instead,
where a pattern (query) is searched in a longer string (reference).
There are many flavours of semi-global alignment, depending on the
(relative) sizes of the inputs. We list these variants, and introduce
some common approaches to solve this problem.&lt;/p&gt;</description></item><item><title>SimdSketch: a fast bucket sketch</title><link>https://curiouscoding.nl/posts/simd-sketch/</link><pubDate>Sun, 09 Mar 2025 00:00:00 +0100</pubDate><guid>https://curiouscoding.nl/posts/simd-sketch/</guid><description>&lt;div class="ox-hugo-toc toc has-section-numbers"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1&lt;/span&gt; &lt;a href="#jaccard-similarity" &gt;Jaccard similarity&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2&lt;/span&gt; &lt;a href="#hash-schemes" &gt;Hash schemes&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.1&lt;/span&gt; &lt;a href="#minhash" &gt;MinHash&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.2&lt;/span&gt; &lt;a href="#s-mins-sketch" &gt;$s$-mins sketch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.3&lt;/span&gt; &lt;a href="#bottom-s" &gt;Bottom-\(s\) sketch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.4&lt;/span&gt; &lt;a href="#fracminhash" &gt;FracMinHash&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.5&lt;/span&gt; &lt;a href="#bucket-sketch" &gt;Bucket sketch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.6&lt;/span&gt; &lt;a href="#mod-bucket-hash--new" &gt;Mod-bucket hash (new?)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.7&lt;/span&gt; &lt;a href="#variants" &gt;Variants&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3&lt;/span&gt; &lt;a href="#compressing-sketches" &gt;Compressing sketches&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.1&lt;/span&gt; &lt;a href="#b-bit-hashing" &gt;$b$-bit hashing&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.1.1&lt;/span&gt; &lt;a href="#accounting-for-collisions" &gt;Accounting for collisions&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.2&lt;/span&gt; &lt;a href="#hyperminhash" &gt;HyperMinHash&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4&lt;/span&gt; &lt;a href="#densification-strategies" &gt;Densification strategies&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5&lt;/span&gt; &lt;a href="#simdsketch" &gt;SimdSketch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6&lt;/span&gt; &lt;a href="#evaluation" &gt;Evaluation&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.1&lt;/span&gt; &lt;a href="#setup" &gt;Setup&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.1.1&lt;/span&gt; &lt;a href="#tools" &gt;Tools&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.1.2&lt;/span&gt; &lt;a href="#inputs" &gt;Inputs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.1.3&lt;/span&gt; &lt;a href="#parameters" &gt;Parameters&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.1.4&lt;/span&gt; &lt;a href="#metrics" &gt;Metrics&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.2&lt;/span&gt; &lt;a href="#raw-results" &gt;Raw results&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.3&lt;/span&gt; &lt;a href="#correlation" &gt;Correlation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.4&lt;/span&gt; &lt;a href="#comparison-speed" &gt;Comparison speed&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.5&lt;/span&gt; &lt;a href="#low-similarity-data" &gt;Low-similarity data&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;7&lt;/span&gt; &lt;a href="#discussion" &gt;Discussion&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;8&lt;/span&gt; &lt;a href="#future-work" &gt;&lt;span class="org-todo todo TODO"&gt;TODO&lt;/span&gt; / Future work&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;\[
\newcommand{\sketch}{\mathsf{sketch}}
\]&lt;/p&gt;</description></item><item><title>Thesis: Optimal Throughput Bioinformatics</title><link>https://curiouscoding.nl/posts/thesis/</link><pubDate>Sun, 23 Feb 2025 00:00:00 +0100</pubDate><guid>https://curiouscoding.nl/posts/thesis/</guid><description>&lt;div class="ox-hugo-toc toc has-section-numbers"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#abstract" &gt;Abstract&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1&lt;/span&gt; &lt;a href="#introduction" &gt;Introduction&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.1&lt;/span&gt; &lt;a href="#part-1-pairwise-alignment" &gt;Part 1: Pairwise Alignment&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.2&lt;/span&gt; &lt;a href="#part-2-low-density-minimizers" &gt;Part 2: Low Density Minimizers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.3&lt;/span&gt; &lt;a href="#part-3-high-throughput-bioinformatics" &gt;Part 3: High Throughput Bioinformatics&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2&lt;/span&gt; &lt;a href="#discussion" &gt;Discussion&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.1&lt;/span&gt; &lt;a href="#pairwise-alignment" &gt;Pairwise Alignment&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.2&lt;/span&gt; &lt;a href="#low-density-minimizers" &gt;Low Density Minimizers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.3&lt;/span&gt; &lt;a href="#high-throughput-bioinformatics" &gt;High Throughput Bioinformatics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.4&lt;/span&gt; &lt;a href="#propositions" &gt;Propositions&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;This post contains the abstract, introduction, and conclusion of my thesis (&lt;a href="#citeproc_bib_item_3"&gt;Groot Koerkamp 2025a&lt;/a&gt;).
Individual chapters are based either on blog posts or papers and linked from the introduction.&lt;/p&gt;</description></item><item><title>A History of Pairwise Alignment</title><link>https://curiouscoding.nl/posts/pairwise-alignment/</link><pubDate>Sat, 22 Feb 2025 00:00:00 +0100</pubDate><guid>https://curiouscoding.nl/posts/pairwise-alignment/</guid><description>&lt;div class="ox-hugo-toc toc has-section-numbers"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1&lt;/span&gt; &lt;a href="#a-brief-history" &gt;A Brief History&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.1&lt;/span&gt; &lt;a href="#a-pa" &gt;A*PA&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.2&lt;/span&gt; &lt;a href="#a-pa2" &gt;A*PA2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.3&lt;/span&gt; &lt;a href="#overview" &gt;Overview&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2&lt;/span&gt; &lt;a href="#problem-statement" &gt;Problem Statement&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3&lt;/span&gt; &lt;a href="#alignment-types" &gt;Alignment types&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4&lt;/span&gt; &lt;a href="#cost-models" &gt;Cost Models&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.1&lt;/span&gt; &lt;a href="#minimizing-cost-versus-maximizing-score" &gt;Minimizing Cost versus Maximizing Score&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5&lt;/span&gt; &lt;a href="#dp" &gt;The Classic DP Algorithms&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6&lt;/span&gt; &lt;a href="#linear-memory-using-divide-and-conquer" &gt;Linear Memory using Divide and Conquer&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;7&lt;/span&gt; &lt;a href="#graphs" &gt;Dijkstra&amp;rsquo;s Algorithm and A*&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;8&lt;/span&gt; &lt;a href="#computational-volumes" &gt;Computational Volumes and Band Doubling&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;9&lt;/span&gt; &lt;a href="#diagonal-transition" &gt;Diagonal Transition&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;10&lt;/span&gt; &lt;a href="#parallelism" &gt;Parallelism&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;11&lt;/span&gt; &lt;a href="#lcs-and-contours" &gt;LCS and Contours&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;12&lt;/span&gt; &lt;a href="#some-tools" &gt;Some Tools&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;13&lt;/span&gt; &lt;a href="#subquadratic-methods-and-lower-bounds" &gt;Subquadratic Methods and Lower Bounds&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;14&lt;/span&gt; &lt;a href="#summary" &gt;Summary&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;This is Chapter 2 of my thesis (&lt;a href="#citeproc_bib_item_27"&gt;Groot Koerkamp 2025&lt;/a&gt;), to introduce the first part on Pairwise Alignment.&lt;/p&gt;</description></item><item><title>Low Density Minimizers</title><link>https://curiouscoding.nl/posts/minimizers/</link><pubDate>Fri, 21 Feb 2025 00:00:00 +0100</pubDate><guid>https://curiouscoding.nl/posts/minimizers/</guid><description>&lt;div class="ox-hugo-toc toc has-section-numbers"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1&lt;/span&gt; &lt;a href="#theory-of-sampling-schemes" &gt;Theory of Sampling Schemes&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.1&lt;/span&gt; &lt;a href="#introduction" &gt;Introduction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.2&lt;/span&gt; &lt;a href="#overview" &gt;Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.3&lt;/span&gt; &lt;a href="#theory-of-sampling-schemes" &gt;Theory of sampling schemes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.4&lt;/span&gt; &lt;a href="#notation" &gt;Notation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.5&lt;/span&gt; &lt;a href="#types-of-sampling-schemes" &gt;Types of sampling schemes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.6&lt;/span&gt; &lt;a href="#computing-the-density" &gt;Computing the density&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.7&lt;/span&gt; &lt;a href="#random-mini-density" &gt;The density of random minimizers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.8&lt;/span&gt; &lt;a href="#universal-hitting-sets" &gt;Universal hitting sets&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.9&lt;/span&gt; &lt;a href="#asymptotic-results" &gt;Asymptotic results&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.10&lt;/span&gt; &lt;a href="#variants" &gt;Variants&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2&lt;/span&gt; &lt;a href="#lower-bounds" &gt;Lower Bounds on Sampling Scheme Density&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.1&lt;/span&gt; &lt;a href="#schleimer-et-al-dot-s-bound" &gt;Schleimer et al.&amp;rsquo;s bound&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.2&lt;/span&gt; &lt;a href="#mar%c3%a7ais-et-al-dot-s-bound" &gt;Marçais et al.&amp;rsquo;s bound&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.3&lt;/span&gt; &lt;a href="#improving-and-extending-mar%c3%a7ais-et-al-dot-s-bound" &gt;Improving and extending Marçais et al.&amp;rsquo;s bound&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.4&lt;/span&gt; &lt;a href="#near-tight-lb" &gt;A near-tight lower bound on the density of forward sampling schemes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.5&lt;/span&gt; &lt;a href="#lower-bound-eval" &gt;Discussion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3&lt;/span&gt; &lt;a href="#sampling-schemes" &gt;Practical Sampling Schemes&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.1&lt;/span&gt; &lt;a href="#lexmin" &gt;Variants of lexicographic minimizers&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#lex-eval" &gt;Evaluation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.2&lt;/span&gt; &lt;a href="#uhs-inspired-schemes" &gt;UHS-inspired schemes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.3&lt;/span&gt; &lt;a href="#syncmer-based-schemes" &gt;Syncmer-based schemes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.4&lt;/span&gt; &lt;a href="#open-closed-minimizer" &gt;Open-closed minimizer&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#oc-eval" &gt;Evaluation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.5&lt;/span&gt; &lt;a href="#modmini" &gt;Mod-minimizer&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#theoretical-density" &gt;Theoretical density&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#modmini-eval" &gt;Evaluation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.6&lt;/span&gt; &lt;a href="#sampling-schemes-discussion" &gt;Discussion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4&lt;/span&gt; &lt;a href="#selection-schemes" &gt;Towards Optimal Selection Schemes&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.1&lt;/span&gt; &lt;a href="#bd-anchors" &gt;Bidirectional anchors&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.2&lt;/span&gt; &lt;a href="#sus-anchors" &gt;Sus-anchors&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#sus-anchor-eval" &gt;Evaluation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.3&lt;/span&gt; &lt;a href="#selection-schemes-discussion" &gt;Discussion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;This is Part 2 of my thesis (&lt;a href="#citeproc_bib_item_15"&gt;Groot Koerkamp 2025&lt;/a&gt;), containing chapters 6 to 9 on Low Density Minimizers.&lt;/p&gt;</description></item><item><title>High Throughput Bioinformatics</title><link>https://curiouscoding.nl/posts/throughput/</link><pubDate>Thu, 20 Feb 2025 00:00:00 +0100</pubDate><guid>https://curiouscoding.nl/posts/throughput/</guid><description>&lt;div class="ox-hugo-toc toc has-section-numbers"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1&lt;/span&gt; &lt;a href="#introduction" &gt;Introduction&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.1&lt;/span&gt; &lt;a href="#overview" &gt;Overview&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2&lt;/span&gt; &lt;a href="#compute-bound" &gt;Optimizing Compute Bound Code: Random Minimizers&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.1&lt;/span&gt; &lt;a href="#avoiding-branch-misses" &gt;Avoiding Branch Misses&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.2&lt;/span&gt; &lt;a href="#simd-processing-in-parallel" &gt;SIMD: Processing In Parallel&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.3&lt;/span&gt; &lt;a href="#instruction-level-parallelism" &gt;Instruction Level Parallelism&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.4&lt;/span&gt; &lt;a href="#input-format" &gt;Input Format&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3&lt;/span&gt; &lt;a href="#memory-bound" &gt;Optimizing Memory Bound Code: Minimal Perfect Hashing&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.1&lt;/span&gt; &lt;a href="#using-less-memory" &gt;Using Less Memory&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.2&lt;/span&gt; &lt;a href="#reducing-memory-accesses" &gt;Reducing Memory Accesses&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.3&lt;/span&gt; &lt;a href="#interleaving-memory-accesses" &gt;Interleaving Memory Accesses&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.4&lt;/span&gt; &lt;a href="#batching-streaming-and-prefetching" &gt;Batching, Streaming, and Prefetching&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;This is Chapter 10 of my thesis (Groot Koerkamp 2025a), to introduce the last part on High Throughput Bioinformatics.&lt;/p&gt;</description></item><item><title>PtrHash: Minimal Perfect Hashing at RAM Throughput</title><link>https://curiouscoding.nl/posts/ptrhash/</link><pubDate>Mon, 03 Feb 2025 00:00:00 +0100</pubDate><guid>https://curiouscoding.nl/posts/ptrhash/</guid><description>&lt;div class="ox-hugo-toc toc has-section-numbers"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#abstract" &gt;Abstract&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1&lt;/span&gt; &lt;a href="#sec:orgebb9721" &gt;Introduction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2&lt;/span&gt; &lt;a href="#sec:orgfe4e2e9" &gt;Related work&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3&lt;/span&gt; &lt;a href="#sec:orgce4a522" &gt;PtrHash&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.1&lt;/span&gt; &lt;a href="#sec:org06ce748" &gt;Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.2&lt;/span&gt; &lt;a href="#sec:construction" &gt;Construction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.3&lt;/span&gt; &lt;a href="#sec:bucket-fn" &gt;Bucket Assignment Functions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.4&lt;/span&gt; &lt;a href="#remapping" &gt;Remapping using CacheLineEF&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4&lt;/span&gt; &lt;a href="#sec:orgbf28892" &gt;Results&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.1&lt;/span&gt; &lt;a href="#construction-eval" &gt;Construction&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.1.1&lt;/span&gt; &lt;a href="#sec:orge11d60c" &gt;Bucket Functions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.1.2&lt;/span&gt; &lt;a href="#sec:org9f908d8" &gt;Tuning Parameters for Construction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.1.3&lt;/span&gt; &lt;a href="#sec:orgece074a" &gt;Remap&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.2&lt;/span&gt; &lt;a href="#sec:comparison" &gt;Comparison to Other Methods&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5&lt;/span&gt; &lt;a href="#sec:org9f032dd" &gt;Conclusions and Future Work&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#sec:throughput" &gt;Appendix A: Query Throughput&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#sec:orgabb5dd4" &gt;Batching and Streaming&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#throughput-evaluation" &gt;Evaluation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#multi-threaded-throughput." &gt;Multi-threaded Throughput.&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#sec:sharding" &gt;Appendix B: Sharding&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#sec:sharding-eval" &gt;Evaluation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;This is the HTML version of my SEA 2025 paper on PtrHash (&lt;a href="https://doi.org/10.48550/arXiv.2502.15539" class="external-link" target="_blank" rel="noopener"&gt;DOI&lt;/a&gt;, &lt;a href="https://curiouscoding.nl/papers/ptrhash.pdf" &gt;PDF&lt;/a&gt;).
The original development-log can be found &lt;a href="../ptrhash-log" &gt;here&lt;/a&gt;.&lt;/p&gt;</description></item><item><title>SimdMinimizers: Computing random minimizers, fast</title><link>https://curiouscoding.nl/posts/simd-minimizers/</link><pubDate>Fri, 12 Jul 2024 00:00:00 +0200</pubDate><guid>https://curiouscoding.nl/posts/simd-minimizers/</guid><description>&lt;div class="ox-hugo-toc toc has-section-numbers"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1&lt;/span&gt; &lt;a href="#introduction" &gt;Introduction&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.1&lt;/span&gt; &lt;a href="#intro-results" &gt;Results&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2&lt;/span&gt; &lt;a href="#random-minimizers" &gt;Random minimizers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3&lt;/span&gt; &lt;a href="#algorithms" &gt;Algorithms&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.1&lt;/span&gt; &lt;a href="#problem-statement" &gt;Problem statement&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#problem-a-only-the-set-of-minimizers" &gt;Problem A: Only the set of minimizers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#problem-b-the-minimizer-of-each-window" &gt;Problem B: The minimizer of each window&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#problem-c-super-k-mers" &gt;Problem C: Super-k-mers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#which-problem-to-solve" &gt;Which problem to solve&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#canonical-k-mers" &gt;Canonical k-mers&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.2&lt;/span&gt; &lt;a href="#the-naive-algorithm" &gt;The naive algorithm&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#naive-performance" &gt;Performance characteristics&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.3&lt;/span&gt; &lt;a href="#rephrasing-as-sliding-window-minimum" &gt;Rephrasing as sliding window minimum&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.4&lt;/span&gt; &lt;a href="#the-queue" &gt;The queue&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#queue-performance" &gt;Performance characteristics&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.5&lt;/span&gt; &lt;a href="#jumping-away-with-the-queue" &gt;Jumping: Away with the queue&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#jumping-performance" &gt;Performance characteristics&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.6&lt;/span&gt; &lt;a href="#re-scan" &gt;Re-scan&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#rescan-performance" &gt;Performance characteristics&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.7&lt;/span&gt; &lt;a href="#split-windows" &gt;Split windows&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#split-perfomance" &gt;Performance characteristics&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4&lt;/span&gt; &lt;a href="#analysing-what-we-have-so-far" &gt;Analysing what we have so far&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.1&lt;/span&gt; &lt;a href="#counting-comparisons" &gt;Counting comparisons&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#open-problem-theoretical-lower-bounds" &gt;Open problem: Theoretical lower bounds&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.2&lt;/span&gt; &lt;a href="#setting-up-benchmarking" &gt;Setting up benchmarking&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#adding-criterion" &gt;Adding criterion&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#making-criterion-fast" &gt;Making criterion fast&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#a-note-on-cpu-frequency" &gt;A note on CPU frequency&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.3&lt;/span&gt; &lt;a href="#runtime-comparison-with-other-implementations" &gt;Runtime comparison with other implementations&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.4&lt;/span&gt; &lt;a href="#deeper-inspection-using-perf-stat" &gt;Deeper inspection using &lt;code&gt;perf stat&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.5&lt;/span&gt; &lt;a href="#a-first-optimization-pass" &gt;A first optimization pass&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#optimizing-buffered-reducing-branch-misses" &gt;Optimizing &lt;code&gt;Buffered&lt;/code&gt;: reducing branch misses&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#queue-is-hopelessly-branchy" &gt;&lt;code&gt;Queue&lt;/code&gt; is hopelessly branchy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#jumping-is-already-very-efficient" &gt;&lt;code&gt;Jumping&lt;/code&gt; is already very efficient&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#optimizing-rescan" &gt;Optimizing &lt;code&gt;Rescan&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#optimizing-split" &gt;Optimizing &lt;code&gt;Split&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.6&lt;/span&gt; &lt;a href="#a-new-performance-comparison" &gt;A new performance comparison&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5&lt;/span&gt; &lt;a href="#rolling-our-own-hash" &gt;Rolling our own hash&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.1&lt;/span&gt; &lt;a href="#fxhash" &gt;FxHash&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#wyhash" &gt;WyHash&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.2&lt;/span&gt; &lt;a href="#nthash-a-rolling-hash" &gt;NtHash: a rolling hash&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#the-nthash-crate" &gt;The &lt;code&gt;nthash&lt;/code&gt; crate&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#buffered-hash-values" &gt;Buffered hash values&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.3&lt;/span&gt; &lt;a href="#making-nthash-fast-going-branchless" &gt;Making ntHash fast: going branchless&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#drop-sanity-checks" &gt;Drop sanity checks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#drop-bound-checks" &gt;Drop bound checks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#efficiently-collecting-to-a-vector" &gt;Efficiently collecting to a vector&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.4&lt;/span&gt; &lt;a href="#rolling-a-bit-less" &gt;Rolling a bit less&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#analysing-the-assembly-code" &gt;Analysing the assembly code&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.5&lt;/span&gt; &lt;a href="#parallel-it-is" &gt;Parallel it is&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#more-parallel" &gt;More parallel&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.6&lt;/span&gt; &lt;a href="#actual-simd-at-last" &gt;Actual SIMD, at last&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#simd-table-lookups" &gt;SIMD table lookups&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#32-bit-hashes" &gt;32-bit hashes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#shared-offsets" &gt;Shared offsets&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.7&lt;/span&gt; &lt;a href="#simd-the-gathering" &gt;SIMD: The Gathering&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#gathering-4-characters-at-a-time" &gt;Gathering 4 characters at a time&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#gathering-8-characters-at-a-time" &gt;Gathering 8 characters at a time&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#gathering-32-characters-at-a-time" &gt;Gathering 32 characters at a time&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#reusing-the-gathers" &gt;Reusing the gathers&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.8&lt;/span&gt; &lt;a href="#cached-vec" &gt;Fixing the benchmark&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#one-last-branch" &gt;One last branch&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.9&lt;/span&gt; &lt;a href="#analysis-machine-code-analysis" &gt;Analysis: Machine code analysis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.10&lt;/span&gt; &lt;a href="#finals-thoughts" &gt;Finals thoughts&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#doubling-down-again" &gt;Doubling down again&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#16-bit-hashes" &gt;16-bit hashes?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#what-about-a-simple-multiply-hash" &gt;What about a simple multiply hash&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6&lt;/span&gt; &lt;a href="#simd-sliding-window" &gt;SIMD sliding window&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.1&lt;/span&gt; &lt;a href="#sliding-window-results" &gt;Results&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#human-genome-results" &gt;Human genome results&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;7&lt;/span&gt; &lt;a href="#extending-into-something-useful" &gt;Extending into something useful&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;7.1&lt;/span&gt; &lt;a href="#collecting-minimizer-positions" &gt;Collecting minimizer positions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;7.2&lt;/span&gt; &lt;a href="#deduplicating-the-minimizer-positions" &gt;Deduplicating the minimizer positions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;7.3&lt;/span&gt; &lt;a href="#super-k-mers" &gt;Super-k-mers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;7.4&lt;/span&gt; &lt;a href="#canonical-k-mers" &gt;Canonical k-mers&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#nthash" &gt;NtHash&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#leftmost-rightmost-sliding-min" &gt;Leftmost-rightmost sliding min&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#tiebreaking" &gt;Tiebreaking&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#further-reusing-iterated-bases" &gt;Further reusing iterated bases&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;7.5&lt;/span&gt; &lt;a href="#antilex-hash" &gt;AntiLex hash&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;8&lt;/span&gt; &lt;a href="#conclusion" &gt;Conclusion&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;8.1&lt;/span&gt; &lt;a href="#future-work" &gt;Future work&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;A 90 min recording of a talk I gave on this post can be found &lt;a href="https://curiouscoding.nl/talks/minimizer-talk.mp4" &gt;here&lt;/a&gt;.&lt;/p&gt;</description></item><item><title>A*PA2: Up to 19x faster exact global alignment</title><link>https://curiouscoding.nl/posts/astarpa2/</link><pubDate>Sat, 23 Mar 2024 00:00:00 +0100</pubDate><guid>https://curiouscoding.nl/posts/astarpa2/</guid><description>&lt;div class="ox-hugo-toc toc has-section-numbers"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#abstract" &gt;Abstract&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1&lt;/span&gt; &lt;a href="#introduction" &gt;Introduction&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.1&lt;/span&gt; &lt;a href="#contributions" &gt;Contributions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.2&lt;/span&gt; &lt;a href="#previous-work" &gt;Previous work&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.2.1&lt;/span&gt; &lt;a href="#needleman-wunsch" &gt;Needleman-Wunsch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.2.2&lt;/span&gt; &lt;a href="#graph-algorithms" &gt;Graph algorithms&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.2.3&lt;/span&gt; &lt;a href="#computational-volumes" &gt;Computational volumes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.2.4&lt;/span&gt; &lt;a href="#parallelism" &gt;Parallelism&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.2.5&lt;/span&gt; &lt;a href="#tools" &gt;Tools&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2&lt;/span&gt; &lt;a href="#preliminaries" &gt;Preliminaries&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3&lt;/span&gt; &lt;a href="#methods" &gt;Methods&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.1&lt;/span&gt; &lt;a href="#band-doubling" &gt;Band-doubling&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.2&lt;/span&gt; &lt;a href="#blocks" &gt;Blocks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.3&lt;/span&gt; &lt;a href="#memory" &gt;Memory&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.4&lt;/span&gt; &lt;a href="#simd" &gt;SIMD&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.5&lt;/span&gt; &lt;a href="#simd-friendly-sequence-profile" &gt;SIMD-friendly sequence profile&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.6&lt;/span&gt; &lt;a href="#traceback" &gt;Traceback&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.7&lt;/span&gt; &lt;a href="#a" &gt;A*&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.7.1&lt;/span&gt; &lt;a href="#bulk-contours-update" &gt;Bulk-contours update&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.7.2&lt;/span&gt; &lt;a href="#pre-pruning" &gt;Pre-pruning&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.8&lt;/span&gt; &lt;a href="#determining-the-rows-to-compute" &gt;Determining the rows to compute&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.8.1&lt;/span&gt; &lt;a href="#sparse-heuristic-invocation" &gt;Sparse heuristic invocation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.9&lt;/span&gt; &lt;a href="#incremental-doubling" &gt;Incremental doubling&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4&lt;/span&gt; &lt;a href="#results" &gt;Results&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.1&lt;/span&gt; &lt;a href="#setup" &gt;Setup&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.2&lt;/span&gt; &lt;a href="#comparison-with-other-aligners" &gt;Comparison with other aligners&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.3&lt;/span&gt; &lt;a href="#effects-of-methods" &gt;Effects of methods&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5&lt;/span&gt; &lt;a href="#discussion" &gt;Discussion&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#acknowledgements" &gt;Acknowledgements&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#conflict-of-interest" &gt;Conflict of interest&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6&lt;/span&gt; &lt;a href="#appendix" &gt;Appendix&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.1&lt;/span&gt; &lt;a href="#bitpacking" &gt;Bitpacking&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.2&lt;/span&gt; &lt;a href="#app-comparison" &gt;Comparison with other aligners&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.3&lt;/span&gt; &lt;a href="#app-effects" &gt;Effects of methods&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;\begin{equation*}
\newcommand{\g}{g^*}
\newcommand{\h}{h^*}
\newcommand{\f}{f^*}
\newcommand{\cgap}{c_{\textrm{gap}}}
\newcommand{\xor}{\ \mathrm{xor}\ }
\newcommand{\and}{\ \mathrm{and}\ }
\newcommand{\st}[2]{\langle #1, #2\rangle}
\newcommand{\matches}{\mathcal M}
\end{equation*}&lt;/p&gt;</description></item><item><title>String algorithm visualizations</title><link>https://curiouscoding.nl/posts/alg-viz/</link><pubDate>Tue, 08 Nov 2022 00:00:00 +0100</pubDate><guid>https://curiouscoding.nl/posts/alg-viz/</guid><description>&lt;ol&gt;
&lt;li&gt;Select the algorithm to visualize&lt;/li&gt;
&lt;li&gt;Click the buttons, or click the canvas and use the indicated keys&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Suffix-array construction is explained &lt;a href="https://curiouscoding.nl/posts/suffix-array-construction/" &gt;here&lt;/a&gt; and BWT is explained &lt;a href="https://curiouscoding.nl/posts/bwt/" &gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Source code is &lt;a href="https://github.com/RagnarGrootKoerkamp/alg-viz" class="external-link" target="_blank" rel="noopener"&gt;on GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;script defer src="https://curiouscoding.nl/js/alg-viz.js" type="module"&gt;&lt;/script&gt;&lt;/head&gt;
&lt;div class="controls"&gt;
&lt;label for="algorithm"&gt;Algorithm&lt;/label&gt;
&lt;select name="algorithm" id="algorithm"&gt;
 &lt;option value="suffix-array"&gt;Suffix Array Construction&lt;/option&gt;
 &lt;option value="bwt"&gt;Burrows-Wheeler Transform&lt;/option&gt;
 &lt;option value="bibwt"&gt;Bidirectional BWT&lt;/option&gt;
&lt;/select&gt;
&lt;br/&gt;
&lt;label for="string"&gt;String&lt;/label&gt; &lt;input type="string" name="string" id="string"/&gt;&lt;br/&gt;
&lt;label for="query"&gt;Query&lt;/label&gt; &lt;input type="string" name="query" id="query"/&gt;&lt;br/&gt;
&lt;button class="button-primary" id="prev"&gt;prev (←/backspace)&lt;/button&gt;
&lt;button class="button-primary" id="next"&gt;next (→/space)&lt;/button&gt;
&lt;br/&gt;
&lt;label for="delay"&gt;Delay (s)&lt;/label&gt; &lt;input type="number" name="delay" id="delay" value="0.8"/&gt;&lt;br/&gt;
&lt;button class="button-primary" id="faster"&gt;faster (↑/+/f)&lt;/button&gt;
&lt;button class="button-primary" id="slower"&gt;slower (↓/-/s)&lt;/button&gt;
&lt;button class="button-primary" id="pauseplay"&gt;pause/play (p/return)&lt;/button&gt;
&lt;/div&gt;
&lt;div class="canvas"&gt;
&lt;canvas id="canvas" tabindex='1' width="1600" height="1200"&gt;&lt;/canvas&gt;
&lt;/div&gt;</description></item><item><title>28000x speedup with Numba.CUDA</title><link>https://curiouscoding.nl/posts/numba-cuda-speedup/</link><pubDate>Mon, 24 May 2021 00:00:00 +0200</pubDate><guid>https://curiouscoding.nl/posts/numba-cuda-speedup/</guid><description>&lt;div class="ox-hugo-toc toc"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#cuda-overview" &gt;CUDA Overview&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#profiling" &gt;Profiling&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#optimizing-tensor-sketch" &gt;Optimizing Tensor Sketch&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#cpu-code" &gt;CPU code&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#v0-original-python-code" &gt;V0: Original python code&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#v1-numba" &gt;V1: Numba&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#v2-multithreading" &gt;V2: Multithreading&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#gpu-code" &gt;GPU code&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#v3-a-first-gpu-version" &gt;V3: A first GPU version&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#v4-parallel-kernel-invocations" &gt;V4: Parallel kernel invocations&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#v5-single-kernel-with-many-blocks" &gt;V5: Single kernel with many blocks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#v6-detailed-profiling-kernel-compute" &gt;V6: Detailed profiling: Kernel Compute&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#v7-detailed-profiling-kernel-latency" &gt;V7: Detailed profiling: Kernel Latency&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#v8-detailed-profiling-shared-memory-access-pattern" &gt;V8: Detailed profiling: Shared Memory Access Pattern&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#v9-more-work-per-thread" &gt;V9: More work per thread&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#v10-cache-seq-to-shared-memory" &gt;V10: Cache seq to shared memory&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#v11-hashes-and-signs-in-shared-memory" &gt;V11: Hashes and signs in shared memory&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#v12-revisiting-blocks-per-kernel" &gt;V12: Revisiting blocks per kernel&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#v13-passing-a-tuple-of-sequences" &gt;V13: Passing a tuple of sequences&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#v14-better-hardware" &gt;V14: Better hardware&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#v15-dynamic-shared-memory" &gt;V15: Dynamic shared memory&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#wrap-up" &gt;Wrap up&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;&lt;strong&gt;Xrefs:&lt;/strong&gt; &lt;a href="https://www.reddit.com/r/CUDA/comments/mq1yrm/28000x_speedup_with_numbacuda/" class="external-link" target="_blank" rel="noopener"&gt;r/CUDA&lt;/a&gt;, &lt;a href="https://numba.discourse.group/t/blog-28000x-speedup-with-numba-cuda/667" class="external-link" target="_blank" rel="noopener"&gt;Numba discourse&lt;/a&gt;&lt;/p&gt;</description></item></channel></rss>