<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Minimizers on CuriousCoding</title><link>https://curiouscoding.nl/tags/minimizers/</link><description>Recent content in Minimizers on CuriousCoding</description><generator>Hugo</generator><language>en</language><lastBuildDate>Tue, 17 Feb 2026 00:00:00 +0100</lastBuildDate><atom:link href="https://curiouscoding.nl/tags/minimizers/index.xml" rel="self" type="application/rss+xml"/><item><title>Understanding GreedyMini</title><link>https://curiouscoding.nl/posts/greedymini-analysis/</link><pubDate>Sun, 27 Apr 2025 00:00:00 +0200</pubDate><guid>https://curiouscoding.nl/posts/greedymini-analysis/</guid><description>&lt;div class="ox-hugo-toc toc"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#greedymini-results" &gt;GreedyMini Results&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#a-first-look" &gt;A first look&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#comparison-with-optimal-ilp-values" &gt;Comparison with optimal ILP values&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#large-alphabets" &gt;Large alphabets&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#analysing-greedymini-at-w-3" &gt;Analysing GreedyMini at \(w=3\)&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#w-3-k-3" &gt;\(w=3\), \(k=3\)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#w-7-k-3" &gt;\(w=7\), \(k=3\)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#w-3-k-4" &gt;\(w=3\), \(k=4\)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#w-3-k-5" &gt;\(w=3\), \(k=5\)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#w-3-k-6" &gt;\(w=3\), \(k=6\)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#w-3-k-7" &gt;\(w=3\), \(k=7\)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#looking-at-fixed-k-5" &gt;Looking at fixed \(k=5\)&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#k-5-w-4" &gt;\(k=5\), \(w=4\)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#k-5-w-5" &gt;\(k=5\), \(w=5\)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#k-5-w-6" &gt;\(k=5\), \(w=6\)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#k-5-w-7" &gt;\(k=5\), \(w=7\)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#k-5-w-8" &gt;\(k=5\), \(w=8\)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#k-5-w-12" &gt;\(k=5\), \(w=12\)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#investigating-w-5" &gt;Investigating \(w=5\)&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#w-5-k-8" &gt;\(w=5\), \(k=8\)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#what-about-k-w-plus-1" &gt;What about \(k = w+1\)?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;In this post, we will look at the minimizer schemes generated by the greedy
minimizer (Golan et al. 2025).&lt;/p&gt;</description></item><item><title>Low Density Minimizers</title><link>https://curiouscoding.nl/posts/minimizers/</link><pubDate>Tue, 08 Apr 2025 00:00:00 +0200</pubDate><guid>https://curiouscoding.nl/posts/minimizers/</guid><description>&lt;div class="ox-hugo-toc toc has-section-numbers"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1&lt;/span&gt; &lt;a href="#theory-of-sampling-schemes" &gt;Theory of Sampling Schemes&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.1&lt;/span&gt; &lt;a href="#introduction" &gt;Introduction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.2&lt;/span&gt; &lt;a href="#overview" &gt;Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.3&lt;/span&gt; &lt;a href="#theory-of-sampling-schemes" &gt;Theory of sampling schemes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.4&lt;/span&gt; &lt;a href="#notation" &gt;Notation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.5&lt;/span&gt; &lt;a href="#types-of-sampling-schemes" &gt;Types of sampling schemes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.6&lt;/span&gt; &lt;a href="#computing-the-density" &gt;Computing the density&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.7&lt;/span&gt; &lt;a href="#random-mini-density" &gt;The density of random minimizers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.8&lt;/span&gt; &lt;a href="#universal-hitting-sets" &gt;Universal hitting sets&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.9&lt;/span&gt; &lt;a href="#asymptotic-results" &gt;Asymptotic results&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.10&lt;/span&gt; &lt;a href="#variants" &gt;Variants&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2&lt;/span&gt; &lt;a href="#lower-bounds" &gt;Lower Bounds on Sampling Scheme Density&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.1&lt;/span&gt; &lt;a href="#schleimer-et-al-dot-s-bound" &gt;Schleimer et al.&amp;rsquo;s bound&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.2&lt;/span&gt; &lt;a href="#mar%c3%a7ais-et-al-dot-s-bound" &gt;Marçais et al.&amp;rsquo;s bound&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.3&lt;/span&gt; &lt;a href="#improving-and-extending-mar%c3%a7ais-et-al-dot-s-bound" &gt;Improving and extending Marçais et al.&amp;rsquo;s bound&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.4&lt;/span&gt; &lt;a href="#near-tight-lb" &gt;A near-tight lower bound on the density of forward sampling schemes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.5&lt;/span&gt; &lt;a href="#lower-bound-eval" &gt;Discussion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3&lt;/span&gt; &lt;a href="#sampling-schemes" &gt;Practical Sampling Schemes&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.1&lt;/span&gt; &lt;a href="#lexmin" &gt;Variants of lexicographic minimizers&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#lex-eval" &gt;Evaluation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.2&lt;/span&gt; &lt;a href="#uhs-inspired-schemes" &gt;UHS-inspired schemes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.3&lt;/span&gt; &lt;a href="#syncmer-based-schemes" &gt;Syncmer-based schemes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.4&lt;/span&gt; &lt;a href="#open-closed-minimizer" &gt;Open-closed minimizer&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#oc-eval" &gt;Evaluation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.5&lt;/span&gt; &lt;a href="#modmini" &gt;Mod-minimizer&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#theoretical-density" &gt;Theoretical density&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#modmini-eval" &gt;Evaluation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.6&lt;/span&gt; &lt;a href="#sampling-schemes-discussion" &gt;Discussion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4&lt;/span&gt; &lt;a href="#selection-schemes" &gt;Towards Optimal Selection Schemes&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.1&lt;/span&gt; &lt;a href="#bd-anchors" &gt;Bidirectional anchors&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.2&lt;/span&gt; &lt;a href="#sus-anchors" &gt;Sus-anchors&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#sus-anchor-eval" &gt;Evaluation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.3&lt;/span&gt; &lt;a href="#selection-schemes-discussion" &gt;Discussion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;This is Part 2 of my &lt;a href="https://curiouscoding.nl/posts/thesis/" &gt;thesis&lt;/a&gt; (&lt;a href="#citeproc_bib_item_15"&gt;Groot Koerkamp 2025&lt;/a&gt;), containing chapters 6 to 9 on Low Density Minimizers.
Please cite the thesis instead of this post.&lt;/p&gt;</description></item><item><title>Near-optimal sampling schemes</title><link>https://curiouscoding.nl/slides/minimizers-dsb25-text/</link><pubDate>Thu, 27 Feb 2025 00:00:00 +0100</pubDate><guid>https://curiouscoding.nl/slides/minimizers-dsb25-text/</guid><description>
&lt;h2 id="warmup"&gt;
 &lt;span class="section-num"&gt;1&lt;/span&gt; Warming up: A cute prolblem
 &lt;a class="heading-link" href="#warmup"&gt;
 &lt;i class="fa-solid fa-link" aria-hidden="true" title="Link to heading"&gt;&lt;/i&gt;
 &lt;span class="sr-only"&gt;Link to heading&lt;/span&gt;
 &lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Given a string, choose one character.
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;CABAACBD&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Given a rotation, choose one character.
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;ACBDCABA&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Can we always choose &lt;em&gt;the same&lt;/em&gt; character?&lt;/li&gt;
&lt;li&gt;Yes: e.g. the smallest rotation (bd-anchor):
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;CAB&lt;/code&gt;​&lt;font color="red"&gt;&lt;code&gt;A&lt;/code&gt;&lt;/font&gt;​&lt;code&gt;ACBD&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ACBDCAB&lt;/code&gt;​&lt;font color="red"&gt;&lt;code&gt;A&lt;/code&gt;&lt;/font&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
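&lt;p&gt;The warm-up above can be checked with a small sketch. It assumes the anchor is the start of the lexicographically smallest rotation, as the slide suggests; the function name &lt;code&gt;smallest_rotation_start&lt;/code&gt; is hypothetical, and the quadratic-time approach is chosen for clarity rather than speed.&lt;/p&gt;

```python
def smallest_rotation_start(s):
    """Return the start index of the lexicographically smallest rotation of s."""
    # Naive sketch: build every rotation and compare them as strings.
    # min() keeps the first minimum, so ties resolve to the leftmost start.
    return min(range(len(s)), key=lambda i: s[i:] + s[:i])


original = "CABAACBD"
shift = 4
rotated = original[shift:] + original[:shift]  # "ACBDCABA", the rotation on the slide

i = smallest_rotation_start(original)
j = smallest_rotation_start(rotated)

# Map the index found in the rotation back to a position in the original string.
j_in_original = (j + shift) % len(original)

print(i, original[i])
print(j, rotated[j], j_in_original)
```

&lt;p&gt;Under these assumptions, both calls select the same character of the underlying cyclic string, which is what the highlighted characters in the two examples show.&lt;/p&gt;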
&lt;h3 id="hidden"&gt;
 &lt;span class="section-num"&gt;1.1&lt;/span&gt; This talk: what if one character is &lt;em&gt;hidden&lt;/em&gt;?
 &lt;a class="heading-link" href="#hidden"&gt;
 &lt;i class="fa-solid fa-link" aria-hidden="true" title="Link to heading"&gt;&lt;/i&gt;
 &lt;span class="sr-only"&gt;Link to heading&lt;/span&gt;
 &lt;/a&gt;
&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Given a string (length \(w\)), choose one character.
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;CABAACBD&lt;/code&gt;​​&lt;font color="lightgrey"&gt;&lt;code&gt;X&lt;/code&gt;&lt;/font&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Given a rotation (of the hidden \(w+1\) string), choose one character.
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;ACBDXCAB&lt;/code&gt;​&lt;font color="lightgrey"&gt;&lt;code&gt;A&lt;/code&gt;&lt;/font&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Can we always choose &lt;em&gt;the same&lt;/em&gt; character?&lt;/li&gt;
&lt;li&gt;Maybe?
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;CAB&lt;/code&gt;​&lt;font color="red"&gt;&lt;code&gt;A&lt;/code&gt;&lt;/font&gt;​&lt;code&gt;ACBD&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ACBDXCAB&lt;/code&gt; 🤔&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="lb"&gt;
 &lt;span class="section-num"&gt;1.2&lt;/span&gt; The answer is no!
 &lt;a class="heading-link" href="#lb"&gt;
 &lt;i class="fa-solid fa-link" aria-hidden="true" title="Link to heading"&gt;&lt;/i&gt;
 &lt;span class="sr-only"&gt;Link to heading&lt;/span&gt;
 &lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;C​ABAACBDX&lt;/code&gt; rotations:&lt;/p&gt;</description></item><item><title>Minimizer papers</title><link>https://curiouscoding.nl/posts/minimizer-papers/</link><pubDate>Mon, 17 Feb 2025 00:00:00 +0100</pubDate><guid>https://curiouscoding.nl/posts/minimizer-papers/</guid><description>&lt;div class="ox-hugo-toc toc has-section-numbers"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1&lt;/span&gt; &lt;a href="#overview" &gt;Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2&lt;/span&gt; &lt;a href="#introduction" &gt;Introduction&lt;/a&gt;
- &lt;a href="#previous-reviews" &gt;Previous reviews&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3&lt;/span&gt; &lt;a href="#theory-of-sampling-schemes" &gt;Theory of sampling schemes&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.1&lt;/span&gt; &lt;a href="#questions" &gt;Questions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.2&lt;/span&gt; &lt;a href="#types-of-schemes" &gt;Types of schemes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.3&lt;/span&gt; &lt;a href="#parameter-regimes" &gt;Parameter regimes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.4&lt;/span&gt; &lt;a href="#different-perspectives" &gt;Different perspectives&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.5&lt;/span&gt; &lt;a href="#uhs-vs-minimizer-scheme" &gt;UHS vs minimizer scheme&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.6&lt;/span&gt; &lt;a href="#asymptotic--bounds" &gt;(Asymptotic) bounds&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.7&lt;/span&gt; &lt;a href="#lower-bounds" &gt;Lower bounds&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4&lt;/span&gt; &lt;a href="#minimizer-schemes" &gt;Minimizer schemes&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.1&lt;/span&gt; &lt;a href="#orders" &gt;Orders&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.2&lt;/span&gt; &lt;a href="#uhs-based-and-search-based-schemes" &gt;UHS-based and search-based schemes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.3&lt;/span&gt; &lt;a href="#pure-schemes" &gt;Pure schemes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.4&lt;/span&gt; &lt;a href="#other-variants" &gt;Other variants&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#selection-schemes" &gt;Selection schemes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#canonical-minimizers" &gt;Canonical minimizers&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.5&lt;/span&gt; &lt;a href="#non-overlapping-string-sets" &gt;Non-overlapping string sets&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;This post is simply a list of brief comments on many papers related to
minimizers, and forms the basis of &lt;a href="https://curiouscoding.nl/posts/minimizers/" &gt;/posts/minimizers/&lt;/a&gt;.&lt;/p&gt;</description></item><item><title>Comments on Brisk</title><link>https://curiouscoding.nl/posts/brisk/</link><pubDate>Fri, 29 Nov 2024 00:00:00 +0100</pubDate><guid>https://curiouscoding.nl/posts/brisk/</guid><description>&lt;div class="ox-hugo-toc toc"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#overview" &gt;Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#detailed-comments" &gt;Detailed comments&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#general" &gt;General&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#abstract" &gt;Abstract&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#1-dot-introduction" &gt;1. Introduction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#2-dot-methods" &gt;2. Methods&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#2-dot-1-outline" &gt;2.1 Outline&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#2-dot-2-indexing-super-k-mers" &gt;2.2 Indexing super-k-mers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#2-dot-3-lazy-encoding" &gt;2.3 Lazy encoding&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#2-dot-4-probing" &gt;2.4 Probing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#2-dot-5-superbuckets" &gt;2.5 Superbuckets&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#2-dot-6-implementation-details" &gt;2.6 Implementation details&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#3-dot-results" &gt;3. Results&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#3-dot-1-parameters" &gt;3.1 Parameters&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#3-dot-2-multicore" &gt;3.2 Multicore&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#3-dot-4-comparison" &gt;3.4 Comparison&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#3-dot-5-query-times" &gt;3.5 Query times&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#4-dot-conclusion" &gt;4. Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;These are some (biased) comments on Brisk,
a dynamic k-mer dictionary (&lt;a href="#citeproc_bib_item_5"&gt;Smith et al. 2024&lt;/a&gt;).&lt;/p&gt;</description></item><item><title>Comments on GreedyMini</title><link>https://curiouscoding.nl/posts/greedymini/</link><pubDate>Mon, 04 Nov 2024 00:00:00 +0100</pubDate><guid>https://curiouscoding.nl/posts/greedymini/</guid><description>&lt;div class="ox-hugo-toc toc"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#overview" &gt;Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#detailed-comments" &gt;Detailed comments&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#terminology" &gt;Terminology&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#abstract" &gt;Abstract&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#preliminaries" &gt;Preliminaries&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#methods" &gt;Methods&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#3-dot-5-transformations" &gt;3.5 Transformations&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#results" &gt;Results&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#discussion" &gt;Discussion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#comments-on-expected-density-of-random-minimizers" &gt;Comments on &amp;ldquo;Expected density of random minimizers&amp;rdquo;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;These are some (biased) comments on &lt;a href="#citeproc_bib_item_2"&gt;“Greedymini: Generating Low-Density Dna Minimizers”&lt;/a&gt;
(&lt;a href="#citeproc_bib_item_2"&gt;Golan et al. 2024&lt;/a&gt;), which introduces the &lt;code&gt;GreedyMini&lt;/code&gt; minimizer scheme.
(Meanwhile, this has been published as Golan et al. (&lt;a href="#citeproc_bib_item_3"&gt;2025&lt;/a&gt;).)&lt;/p&gt;
&lt;p&gt;At the bottom, there are also some comments on Golan and Shur (&lt;a href="#citeproc_bib_item_4"&gt;2025&lt;/a&gt;).&lt;/p&gt;</description></item><item><title>Comments on 'When Less is More' minimizer review</title><link>https://curiouscoding.nl/posts/minimizer-review-comments/</link><pubDate>Tue, 15 Oct 2024 00:00:00 +0200</pubDate><guid>https://curiouscoding.nl/posts/minimizer-review-comments/</guid><description>&lt;div class="ox-hugo-toc toc"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#the-importance-of-ordering" &gt;The importance of ordering&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#asymptotically-optimal-minimizers" &gt;Asymptotically optimal minimizers&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;These are some (biased) comments on &lt;a href="#citeproc_bib_item_5"&gt;“When Less Is More: Sketching with Minimizers in Genomics”&lt;/a&gt; (&lt;a href="#citeproc_bib_item_5"&gt;Ndiaye et al. 2024&lt;/a&gt;).&lt;/p&gt;
&lt;h2 id="the-importance-of-ordering"&gt;
 The importance of ordering
 &lt;a class="heading-link" href="#the-importance-of-ordering"&gt;
 &lt;i class="fa-solid fa-link" aria-hidden="true" title="Link to heading"&gt;&lt;/i&gt;
 &lt;span class="sr-only"&gt;Link to heading&lt;/span&gt;
 &lt;/a&gt;
&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;the interest lies in constructing a minimizer with a density within a constant
factor, i.e., \(O(1/w)\) for any \(k\). With lexicographic ordering, minimizers can
achieve such density, but with large \(k\) values (\(\geq \log_{|Σ|}(w)-c\) for a
constant \(c\)), which might not be desirable (&lt;a href="#citeproc_bib_item_9"&gt;Zheng, Kingsford, and Marçais 2020&lt;/a&gt;). However, random
ordering can result in a lower density than that of the lexicographic ordering.
Thus, random ordering (implemented with pseudo-random hash functions) is
usually used in practice.&lt;/p&gt;</description></item><item><title>Practical minimizers</title><link>https://curiouscoding.nl/posts/practical-minimizers/</link><pubDate>Thu, 12 Sep 2024 00:00:00 +0200</pubDate><guid>https://curiouscoding.nl/posts/practical-minimizers/</guid><description>&lt;div class="ox-hugo-toc toc has-section-numbers"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1&lt;/span&gt; &lt;a href="#sampling-schemes" &gt;Sampling schemes&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.1&lt;/span&gt; &lt;a href="#definitions" &gt;Definitions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.2&lt;/span&gt; &lt;a href="#miniception" &gt;Miniception&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.3&lt;/span&gt; &lt;a href="#mod-minimizer" &gt;Mod-minimizer&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.4&lt;/span&gt; &lt;a href="#forward-scheme-lower-bound" &gt;Forward scheme lower bound&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.5&lt;/span&gt; &lt;a href="#open-syncmer-minimizer" &gt;Open syncmer minimizer&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.6&lt;/span&gt; &lt;a href="#open-closed-minimizer" &gt;Open-closed minimizer&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.7&lt;/span&gt; &lt;a href="#new-general-mod-minimizer" &gt;New: General mod-minimizer&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.8&lt;/span&gt; &lt;a href="#variant-open-closed-minimizer-using-offsets" &gt;Variant: Open-closed minimizer using offsets&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2&lt;/span&gt; &lt;a href="#selection-schemes" &gt;Selection schemes&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.1&lt;/span&gt; &lt;a href="#definition" &gt;Definition&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.2&lt;/span&gt; &lt;a href="#bd-anchors" &gt;Bd-anchors&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.3&lt;/span&gt; &lt;a href="#new-smallest-unique-substring-anchors" &gt;New: Smallest unique substring anchors&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.4&lt;/span&gt; &lt;a href="#new-anti-lexicographic-sorting" &gt;New: Anti lexicographic sorting&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3&lt;/span&gt; &lt;a href="#more-sampling-schemes" &gt;More sampling schemes&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.1&lt;/span&gt; &lt;a href="#anti-lex-sus-anchors" &gt;Anti-lex sus-anchors&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.2&lt;/span&gt; &lt;a href="#threshold-anchors" &gt;Threshold anchors&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.3&lt;/span&gt; &lt;a href="#the-t-gap-disappears-for-large-alphabets" &gt;The $t$-gap disappears for large alphabets&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4&lt;/span&gt; &lt;a href="#computing-the-density-of-forward-schemes" &gt;Computing the density of forward schemes&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.1&lt;/span&gt; &lt;a href="#wip-anti-lexicographic-sus-anchor-density" &gt;WIP: Anti lexicographic sus-anchor density&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5&lt;/span&gt; &lt;a href="#open-questions" &gt;Open questions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6&lt;/span&gt; &lt;a href="#ideas" &gt;Ideas&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;7&lt;/span&gt; &lt;a href="#optimal-schemes-for-k-in-w-w-plus-1" &gt;Optimal schemes for \(k \in \{w, w+1\}\)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;&lt;strong&gt;Most of the content here has now been absorbed into my &lt;a href="https://curiouscoding.nl/posts/minimizers/" &gt;thesis chapter on minimizers&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;</description></item><item><title>SimdMinimizers: Computing random minimizers, fast</title><link>https://curiouscoding.nl/posts/simd-minimizers/</link><pubDate>Fri, 12 Jul 2024 00:00:00 +0200</pubDate><guid>https://curiouscoding.nl/posts/simd-minimizers/</guid><description>&lt;div class="ox-hugo-toc toc has-section-numbers"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1&lt;/span&gt; &lt;a href="#introduction" &gt;Introduction&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1.1&lt;/span&gt; &lt;a href="#intro-results" &gt;Results&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2&lt;/span&gt; &lt;a href="#random-minimizers" &gt;Random minimizers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3&lt;/span&gt; &lt;a href="#algorithms" &gt;Algorithms&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.1&lt;/span&gt; &lt;a href="#problem-statement" &gt;Problem statement&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#problem-a-only-the-set-of-minimizers" &gt;Problem A: Only the set of minimizers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#problem-b-the-minimizer-of-each-window" &gt;Problem B: The minimizer of each window&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#problem-c-super-k-mers" &gt;Problem C: Super-k-mers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#which-problem-to-solve" &gt;Which problem to solve&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#canonical-k-mers" &gt;Canonical k-mers&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.2&lt;/span&gt; &lt;a href="#the-naive-algorithm" &gt;The naive algorithm&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#naive-performance" &gt;Performance characteristics&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.3&lt;/span&gt; &lt;a href="#rephrasing-as-sliding-window-minimum" &gt;Rephrasing as sliding window minimum&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.4&lt;/span&gt; &lt;a href="#the-queue" &gt;The queue&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#queue-performance" &gt;Performance characteristics&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.5&lt;/span&gt; &lt;a href="#jumping-away-with-the-queue" &gt;Jumping: Away with the queue&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#jumping-performance" &gt;Performance characteristics&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.6&lt;/span&gt; &lt;a href="#re-scan" &gt;Re-scan&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#rescan-performance" &gt;Performance characteristics&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.7&lt;/span&gt; &lt;a href="#split-windows" &gt;Split windows&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#split-perfomance" &gt;Performance characteristics&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4&lt;/span&gt; &lt;a href="#analysing-what-we-have-so-far" &gt;Analysing what we have so far&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.1&lt;/span&gt; &lt;a href="#counting-comparisons" &gt;Counting comparisons&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#open-problem-theoretical-lower-bounds" &gt;Open problem: Theoretical lower bounds&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.2&lt;/span&gt; &lt;a href="#setting-up-benchmarking" &gt;Setting up benchmarking&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#adding-criterion" &gt;Adding criterion&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#making-criterion-fast" &gt;Making criterion fast&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#a-note-on-cpu-frequency" &gt;A note on CPU frequency&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.3&lt;/span&gt; &lt;a href="#runtime-comparison-with-other-implementations" &gt;Runtime comparison with other implementations&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.4&lt;/span&gt; &lt;a href="#deeper-inspection-using-perf-stat" &gt;Deeper inspection using &lt;code&gt;perf stat&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.5&lt;/span&gt; &lt;a href="#a-first-optimization-pass" &gt;A first optimization pass&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#optimizing-buffered-reducing-branch-misses" &gt;Optimizing &lt;code&gt;Buffered&lt;/code&gt;: reducing branch misses&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#queue-is-hopelessly-branchy" &gt;&lt;code&gt;Queue&lt;/code&gt; is hopelessly branchy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#jumping-is-already-very-efficient" &gt;&lt;code&gt;Jumping&lt;/code&gt; is already very efficient&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#optimizing-rescan" &gt;Optimizing &lt;code&gt;Rescan&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#optimizing-split" &gt;Optimizing &lt;code&gt;Split&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4.6&lt;/span&gt; &lt;a href="#a-new-performance-comparison" &gt;A new performance comparison&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5&lt;/span&gt; &lt;a href="#rolling-our-own-hash" &gt;Rolling our own hash&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.1&lt;/span&gt; &lt;a href="#fxhash" &gt;FxHash&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#wyhash" &gt;WyHash&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.2&lt;/span&gt; &lt;a href="#nthash-a-rolling-hash" &gt;NtHash: a rolling hash&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#the-nthash-crate" &gt;The &lt;code&gt;nthash&lt;/code&gt; crate&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#buffered-hash-values" &gt;Buffered hash values&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.3&lt;/span&gt; &lt;a href="#making-nthash-fast-going-branchless" &gt;Making ntHash fast: going branchless&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#drop-sanity-checks" &gt;Drop sanity checks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#drop-bound-checks" &gt;Drop bound checks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#efficiently-collecting-to-a-vector" &gt;Efficiently collecting to a vector&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.4&lt;/span&gt; &lt;a href="#rolling-a-bit-less" &gt;Rolling a bit less&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#analysing-the-assembly-code" &gt;Analysing the assembly code&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.5&lt;/span&gt; &lt;a href="#parallel-it-is" &gt;Parallel it is&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#more-parallel" &gt;More parallel&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.6&lt;/span&gt; &lt;a href="#actual-simd-at-last" &gt;Actual SIMD, at last&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#simd-table-lookups" &gt;SIMD table lookups&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#32-bit-hashes" &gt;32-bit hashes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#shared-offsets" &gt;Shared offsets&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.7&lt;/span&gt; &lt;a href="#simd-the-gathering" &gt;SIMD: The Gathering&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#gathering-4-characters-at-a-time" &gt;Gathering 4 characters at a time&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#gathering-8-characters-at-a-time" &gt;Gathering 8 characters at a time&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#gathering-32-characters-at-a-time" &gt;Gathering 32 characters at a time&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#reusing-the-gathers" &gt;Reusing the gathers&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.8&lt;/span&gt; &lt;a href="#cached-vec" &gt;Fixing the benchmark&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#one-last-branch" &gt;One last branch&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.9&lt;/span&gt; &lt;a href="#analysis-machine-code-analysis" &gt;Analysis: Machine code analysis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5.10&lt;/span&gt; &lt;a href="#finals-thoughts" &gt;Final thoughts&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#doubling-down-again" &gt;Doubling down again&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#16-bit-hashes" &gt;16-bit hashes?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#what-about-a-simple-multiply-hash" &gt;What about a simple multiply hash&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6&lt;/span&gt; &lt;a href="#simd-sliding-window" &gt;SIMD sliding window&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.1&lt;/span&gt; &lt;a href="#sliding-window-results" &gt;Results&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#human-genome-results" &gt;Human genome results&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;7&lt;/span&gt; &lt;a href="#extending-into-something-useful" &gt;Extending into something useful&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;7.1&lt;/span&gt; &lt;a href="#collecting-minimizer-positions" &gt;Collecting minimizer positions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;7.2&lt;/span&gt; &lt;a href="#deduplicating-the-minimizer-positions" &gt;Deduplicating the minimizer positions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;7.3&lt;/span&gt; &lt;a href="#super-k-mers" &gt;Super-k-mers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;7.4&lt;/span&gt; &lt;a href="#canonical-k-mers" &gt;Canonical k-mers&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#nthash" &gt;NtHash&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#leftmost-rightmost-sliding-min" &gt;Leftmost-rightmost sliding min&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#tiebreaking" &gt;Tiebreaking&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#further-reusing-iterated-bases" &gt;Further reusing iterated bases&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;7.5&lt;/span&gt; &lt;a href="#antilex-hash" &gt;AntiLex hash&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;8&lt;/span&gt; &lt;a href="#conclusion" &gt;Conclusion&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;8.1&lt;/span&gt; &lt;a href="#future-work" &gt;Future work&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;SimdMinimizers has been published as a paper: &lt;a href="https://doi.org/10.1101/2025.01.27.634998" class="external-link" target="_blank" rel="noopener"&gt;DOI&lt;/a&gt;, &lt;a href="https://curiouscoding.nl/papers/simd-minimizers.pdf" &gt;PDF&lt;/a&gt;:&lt;/p&gt;</description></item><item><title>A near-tight lower bound on minimizer density</title><link>https://curiouscoding.nl/posts/minimizer-lower-bound/</link><pubDate>Tue, 25 Jun 2024 00:00:00 +0200</pubDate><guid>https://curiouscoding.nl/posts/minimizer-lower-bound/</guid><description>&lt;div class="ox-hugo-toc toc"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#succinct-background" &gt;Succinct background&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#definitions" &gt;Definitions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#lower-bounds" &gt;Lower bounds&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#a-new-lower-bound" &gt;A new lower bound&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#discussion" &gt;Discussion&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#post-scriptum" &gt;Post scriptum&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#acknowledgement" &gt;Acknowledgement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;&lt;strong&gt;The results of this post are now published in Bioinformatics: &lt;a href="https://doi.org/10.1093/bioinformatics/btae736" class="external-link" target="_blank" rel="noopener"&gt;&lt;strong&gt;DOI&lt;/strong&gt;&lt;/a&gt;, &lt;a href="https://curiouscoding.nl/papers/sampling-lower-bound.pdf" &gt;&lt;strong&gt;PDF&lt;/strong&gt;&lt;/a&gt;:&lt;/strong&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Kille, Bryce, Ragnar Groot Koerkamp, Drake McAdams, Alan Liu, and Todd J Treangen. 2024. “A near-tight Lower Bound on the Density of Forward Sampling Schemes.” Edited by Yann Ponty. &lt;i&gt;Bioinformatics&lt;/i&gt;, December. &lt;a href="https://doi.org/10.1093/bioinformatics/btae736" class="external-link" target="_blank" rel="noopener"&gt;https://doi.org/10.1093/bioinformatics/btae736&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This content has also been absorbed into my &lt;a href="https://curiouscoding.nl/posts/minimizers/" &gt;&lt;strong&gt;thesis chapter on minimizers&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;</description></item><item><title>Mod-minimizers and other minimizers</title><link>https://curiouscoding.nl/posts/mod-minimizers/</link><pubDate>Thu, 18 Jan 2024 00:00:00 +0100</pubDate><guid>https://curiouscoding.nl/posts/mod-minimizers/</guid><description>&lt;div class="ox-hugo-toc toc"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#applications" &gt;Applications&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#background" &gt;Background&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#minimizers" &gt;Minimizers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#density-bounds" &gt;Density bounds&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#robust-minimizers" &gt;Robust minimizers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#pasha" &gt;PASHA&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#miniception" &gt;Miniception&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#closed-syncmers" &gt;Closed syncmers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#bd-anchors" &gt;Bd-anchors&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#new-mod-minimizers" &gt;New: Mod-minimizers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#experiments" &gt;Experiments&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#conclusion" &gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#small-k-experiments" &gt;Small k experiments&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#search-methods" &gt;Search methods&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#directed-minimizer" &gt;Directed minimizer&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#k-1-w-2" &gt;\(k=1\), \(w=2\)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#k-1-w-4" &gt;\(k=1\), \(w=4\)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#k-1-w-5" &gt;\(k=1\), \(w=5\)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#k-2-w-2" &gt;\(k=2\), \(w=2\)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#k-2-w-4" &gt;\(k=2\), \(w=4\)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#notes" &gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#reading-list" &gt;Reading list&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;\[
\newcommand{\d}{\mathrm{d}}
\newcommand{\L}{\mathcal{L}}
\]&lt;/p&gt;
&lt;p&gt;This post introduces some background on minimizers and some
experiments with a new minimizer variant. That variant is now called the
&lt;em&gt;mod-minimizer&lt;/em&gt; and was published at WABI 2024 (&lt;a href="https://doi.org/10.4230/LIPIcs.WABI.2024.11" class="external-link" target="_blank" rel="noopener"&gt;&lt;strong&gt;DOI&lt;/strong&gt;&lt;/a&gt;, &lt;a href="https://curiouscoding.nl/papers/modmini.pdf" &gt;&lt;strong&gt;PDF&lt;/strong&gt;&lt;/a&gt;) (&lt;a href="#citeproc_bib_item_5"&gt;Groot Koerkamp and Pibiri 2024&lt;/a&gt;). The paper
also includes a review of existing methods, including pseudocode for
most of the methods covered below.&lt;/p&gt;</description></item><item><title>Notes on bidirectional anchors</title><link>https://curiouscoding.nl/posts/bd-anchors/</link><pubDate>Mon, 15 Jan 2024 00:00:00 +0100</pubDate><guid>https://curiouscoding.nl/posts/bd-anchors/</guid><description>&lt;div class="ox-hugo-toc toc"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#paper-overview" &gt;Paper overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#remarks-on-the-paper" &gt;Remarks on the paper&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#thoughts" &gt;Thoughts&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;\[
\newcommand{\A}{\mathcal{A}_\ell}
\newcommand{\T}{\mathcal{T}_\ell}
\]&lt;/p&gt;
&lt;p&gt;These are some notes on &lt;em&gt;Bidirectional String Anchors&lt;/em&gt; (&lt;a href="#citeproc_bib_item_2"&gt;Loukides, Pissis, and Sweering 2023&lt;/a&gt;), also
called &lt;em&gt;bd-anchors&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Resources:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Loukides and Pissis (&lt;a href="#citeproc_bib_item_3"&gt;2021&lt;/a&gt;): the preceding conference paper, containing a subset of the content.&lt;/li&gt;
&lt;li&gt;Loukides, Pissis, and Sweering (&lt;a href="#citeproc_bib_item_2"&gt;2023&lt;/a&gt;): the paper discussed here.&lt;/li&gt;
&lt;li&gt;Ayad, Loukides, and Pissis (&lt;a href="#citeproc_bib_item_1"&gt;2023&lt;/a&gt;): follow-up/second paper containing
&lt;ul&gt;
&lt;li&gt;a faster average-case \(O(n)\) construction algorithm;&lt;/li&gt;
&lt;li&gt;a more memory-efficient construction algorithm for the index.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/solonas13/bd-anchors" class="external-link" target="_blank" rel="noopener"&gt;https://github.com/solonas13/bd-anchors&lt;/a&gt;: code for first paper&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/lorrainea/BDA-index" class="external-link" target="_blank" rel="noopener"&gt;https://github.com/lorrainea/BDA-index&lt;/a&gt;: code for follow-up paper&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The remainder of this post is split into &lt;a href="#paper-overview" &gt;an overview of the paper&lt;/a&gt;, &lt;a href="#remarks-on-the-paper" &gt;remarks on the paper&lt;/a&gt;, and further &lt;a href="#thoughts" &gt;thoughts&lt;/a&gt;.&lt;/p&gt;</description></item><item><title>Perfect NtHash for Robust Minimizers</title><link>https://curiouscoding.nl/posts/nthash/</link><pubDate>Sun, 31 Dec 2023 00:00:00 +0100</pubDate><guid>https://curiouscoding.nl/posts/nthash/</guid><description>&lt;div class="ox-hugo-toc toc"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#nthash" &gt;NtHash&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#minimizers" &gt;Minimizers&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#robust-minimizers" &gt;Robust minimizers&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#is-nthash-injective-on-kmers" &gt;Is NtHash injective on kmers?&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#searching-for-a-collision" &gt;Searching for a collision&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#proving-perfection" &gt;Proving perfection&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="#alternatives" &gt;Alternatives&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#smhasher-results" &gt;SmHasher results&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;h2 id="nthash"&gt;
 NtHash
 &lt;a class="heading-link" href="#nthash"&gt;
 &lt;i class="fa-solid fa-link" aria-hidden="true" title="Link to heading"&gt;&lt;/i&gt;
 &lt;span class="sr-only"&gt;Link to heading&lt;/span&gt;
 &lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;NtHash (&lt;a href="#citeproc_bib_item_3"&gt;Mohamadi et al. 2016&lt;/a&gt;) is a rolling hash that was originally designed for DNA but is suitable for hashing any kind of text.
For a string of length \(k\), it is a \(64\)-bit value computed as:&lt;/p&gt;</description></item></channel></rss>