Sassy 1 & 2: Fuzzy searching DNA using SIMD
Previously Link to heading


Previously Link to heading



Long-read global alignment is just not a problem people have… Link to heading
… Sassy: 100bp semi-global alignment Link to heading
Takeaway 1: Math yay, Bio nay Link to heading
Takeaway 2: Do simple stuff fast Link to heading
Approximate String Matching Link to heading
- Find pattern \(P\) in text \(T\).
- \(m := |P|\), \(n := |T|\)
- Allow up to \(k\) unit-cost errors.
- Find all matches.
- IUPAC support: Handle
ACTG,N,YR...
- Not semi-global alignment
- Not \(\Theta(nm/w)\)
- Not new: Lots and LOTS of old literature
Intricacies of ASM: What is a “match”? Link to heading
- The end position?
- A start and end position?
- An alignment?
Intricacies of ASM: Which “matches” to report? Link to heading
The end position?
- One endpoint? (Semi-global)
- All endpoints? (Navarro'01)
- Local minima! (Sassy)
A start and end position?
- One pair per endpoint?
- All pairs?
An alignment?
- All alignments?
- One per endpoint?
- But which one??????
- In-progress PR by Fulcrum Genomics:
iterate over all “reasonable” alignments
ASM is not reverse-complement invariant! Link to heading
Overhang: when \(P\) extends beyond \(T\) Link to heading
ASM in \(O(\lceil n/w\rceil k)\) Link to heading
- Myers’ bitpacking on \(w=64\)-bit blocks.
- “Early break” when all states in block have cost \(>k\), after \(\approx 2k\) rows.
- Horizontal tiling because \(k\ll w\) makes \(O(n \lceil k/w\rceil)\) inefficient.
SIMD: Chunking the text Link to heading
- Split the text in 4 chunks.
- Process 1 chunk per SIMD-lane.
- The blue blocks are all evaluated at the same time!
Sassy 1: search long text at \({>}1\) Gbp/s Link to heading
- Params:
- \(20 \leq |P| \leq 1000\), \(n=10^5\)
- 1 thread, fwd search only
- Results:
- k=3: 1.1 Gbp/s
- k=20: 600 Mbp/s
- 10\(\times\) Edlib, 128\(\times\) parasail
Applications:
- Crispr off-target searching
- Many short patterns, fixed reference → Columba (Renders+ 2025)
- Barcode detection for demultiplexing
- Many short patterns, reads → Batching!
CLI: sassy grep -p <pattern> -k 3 <hg>.fa
Link to heading

Sassy 2: batching \(\gg\) chunking Link to heading
- Text chunking requires
gatheringand transposing - Take pattern suffix of len \(w’\in\{16,32\}\)
- Random DNA has edit distance \(\approx 45\%\).
- \(w’>2k+\varepsilon\) ensures few spurious matches.
- One suffix per SIMD lane.
- Post-process if suffix matches with cost \(\leq k\).
Sassy 2: search at 8 Gbp/s Link to heading
- A: 128 patterns of len 23, 1 thread, \(k\in \{0,3,4\}\). B: \(n=10^5\), \(k=3\).
Sassy 2 Application: Crispr off-target & Barbell Link to heading
- Search a 23bp guide RNA in a human genome with \(k=3\) in 30ms or amortized >100 Gbp/s on 16 threads.
- 3.7\(×\) Sassy1, 35\(×\) Edlib
- Scan Nanopore reads for 96 ONT rapid barcodes with \(k=3\) at 1.2 Gbp/s!
- 4.6\(×\) Sassy1, 45\(×\) Edlib
- Barbell (Beeloo+) is a fast & accurate demultiplexer using Sassy2.
- Poster @ RECOMB tomorrow evening!