<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Sketching on CuriousCoding</title><link>https://curiouscoding.nl/tags/sketching/</link><description>Recent content in Sketching on CuriousCoding</description><generator>Hugo</generator><language>en</language><lastBuildDate>Sun, 09 Mar 2025 00:00:00 +0100</lastBuildDate><atom:link href="https://curiouscoding.nl/tags/sketching/index.xml" rel="self" type="application/rss+xml"/><item><title>SimdSketch: a fast bucket sketch</title><link>https://curiouscoding.nl/posts/simd-sketch/</link><pubDate>Sun, 09 Mar 2025 00:00:00 +0100</pubDate><guid>https://curiouscoding.nl/posts/simd-sketch/</guid><description>&lt;div class="ox-hugo-toc toc has-section-numbers"&gt;
&lt;div class="heading"&gt;Table of Contents&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;1&lt;/span&gt; &lt;a href="#jaccard-similarity" &gt;Jaccard similarity&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2&lt;/span&gt; &lt;a href="#hash-schemes" &gt;Hash schemes&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.1&lt;/span&gt; &lt;a href="#minhash" &gt;MinHash&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.2&lt;/span&gt; &lt;a href="#s-mins-sketch" &gt;$s$-mins sketch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.3&lt;/span&gt; &lt;a href="#bottom-s" &gt;Bottom-\(s\) sketch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.4&lt;/span&gt; &lt;a href="#fracminhash" &gt;FracMinHash&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.5&lt;/span&gt; &lt;a href="#bucket-sketch" &gt;Bucket sketch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.6&lt;/span&gt; &lt;a href="#mod-bucket-hash--new" &gt;Mod-bucket hash (new?)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;2.7&lt;/span&gt; &lt;a href="#variants" &gt;Variants&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3&lt;/span&gt; &lt;a href="#compressing-sketches" &gt;Compressing sketches&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.1&lt;/span&gt; &lt;a href="#b-bit-hashing" &gt;$b$-bit hashing&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.1.1&lt;/span&gt; &lt;a href="#accounting-for-collisions" &gt;Accounting for collisions&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;3.2&lt;/span&gt; &lt;a href="#hyperminhash" &gt;HyperMinHash&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;4&lt;/span&gt; &lt;a href="#densification-strategies" &gt;Densification strategies&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;5&lt;/span&gt; &lt;a href="#simdsketch" &gt;SimdSketch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6&lt;/span&gt; &lt;a href="#evaluation" &gt;Evaluation&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.1&lt;/span&gt; &lt;a href="#setup" &gt;Setup&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.1.1&lt;/span&gt; &lt;a href="#tools" &gt;Tools&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.1.2&lt;/span&gt; &lt;a href="#inputs" &gt;Inputs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.1.3&lt;/span&gt; &lt;a href="#parameters" &gt;Parameters&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.1.4&lt;/span&gt; &lt;a href="#metrics" &gt;Metrics&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.2&lt;/span&gt; &lt;a href="#raw-results" &gt;Raw results&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.3&lt;/span&gt; &lt;a href="#correlation" &gt;Correlation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.4&lt;/span&gt; &lt;a href="#comparison-speed" &gt;Comparison speed&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;6.5&lt;/span&gt; &lt;a href="#low-similarity-data" &gt;Low-similarity data&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;7&lt;/span&gt; &lt;a href="#discussion" &gt;Discussion&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="section-num"&gt;8&lt;/span&gt; &lt;a href="#future-work" &gt;&lt;span class="org-todo todo TODO"&gt;TODO&lt;/span&gt; / Future work&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;!--endtoc--&gt;
&lt;p&gt;\[
\newcommand{\sketch}{\mathsf{sketch}}
\]&lt;/p&gt;</description></item></channel></rss>