Software

For all my projects, feel free to create issues and/or reach out for help in using them. My work focuses more on algorithm development than on direct bioinformatics applications, so I appreciate hearing from potential users :)

Tools

Tools building on this

Libraries

  • pa_types: type definitions for pairwise alignment coordinates, and Cigar utilities.

DNA bitpacking

Crates using 2-bit packed sequence representations and SIMD algorithms on top of them. All developed together with Igor Martayan as part of simd-minimizers (Groot Koerkamp and Martayan 2025).

  • packed_seq: Slowly growing library for managing 2-bit encoded ACTG DNA sequences.
    • Also supports 2+1 bit encoding to indicate ambiguous (N) characters.
  • seq-hash: Streams over all nt-hashes (or other hashes) of a packed sequence.
  • simd-minimizers: compute minimizers of a sequence.
    • Supports skipping over ambiguous windows.
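To give a rough idea of what these crates deal with, here is a minimal scalar sketch of 2-bit packing and naive minimizer selection. It is not the API of packed_seq, seq-hash, or simd-minimizers: all function names below are hypothetical, it uses no SIMD, it orders k-mers by their encoded value rather than by a hash, and it does not handle ambiguous (N) characters.

```rust
// Illustrative sketch only. All names here are hypothetical and are NOT the
// API of packed_seq, seq-hash, or simd-minimizers.

/// Encode one ASCII base as 2 bits: A=0, C=1, T=2, G=3 (upper or lower case).
/// Other bytes (such as N) are not validated and silently alias to a base.
fn encode_base(b: u8) -> u8 {
    (b >> 1) & 3
}

/// Pack a DNA string into bytes, 4 bases per byte, lowest bits first.
fn pack(seq: &[u8]) -> Vec<u8> {
    let mut packed = vec![0u8; (seq.len() + 3) / 4];
    for (i, &b) in seq.iter().enumerate() {
        packed[i / 4] |= encode_base(b) << (2 * (i % 4));
    }
    packed
}

/// Read the 2-bit code of the base at position `i`.
fn get(packed: &[u8], i: usize) -> u8 {
    (packed[i / 4] >> (2 * (i % 4))) & 3
}

/// Integer value of the k-mer starting at `pos` (requires k <= 32).
fn kmer_value(packed: &[u8], pos: usize, k: usize) -> u64 {
    (pos..pos + k).fold(0u64, |v, i| (v << 2) | get(packed, i) as u64)
}

/// Naive O(n * w * k) minimizers: for every window of `w` consecutive k-mers,
/// take the start position of the smallest k-mer (leftmost on ties), and
/// report each position once. `len` is the number of bases in `packed`.
/// Assumption: k-mers are ordered by encoded value instead of by a hash.
fn minimizer_positions(packed: &[u8], len: usize, k: usize, w: usize) -> Vec<usize> {
    let mut out = Vec::new();
    if k == 0 || w == 0 || len < k + w - 1 {
        return out;
    }
    let num_kmers = len - k + 1;
    for start in 0..=num_kmers - w {
        let best = (start..start + w)
            .min_by_key(|&p| kmer_value(packed, p, k))
            .unwrap();
        if out.last() != Some(&best) {
            out.push(best);
        }
    }
    out
}
```

For example, on GTCAACGT with k = 2 and w = 3, this sketch reports positions 2, 3, and 4 (the k-mers CA, AA, and AC).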

Dependency graph

digraph {
barbell -> sassy
barbell -> pa_types
sassy -> pa_types
deacon -> simd_minimizers
deacon -> packed_seq
simd_minimizers -> seq_hash
simd_minimizers -> packed_seq
seq_hash -> packed_seq
simd_sketch -> packed_seq
simd_sketch -> seq_hash
"A*PA2" -> pa_types
}

Data structures

Experimental/incomplete

  • Minimizers: reference implementations and experiments for minimizer and sampling schemes.
  • sshash-rs: quick but incomplete reimplementation of SSHash.
  • suffix array searching (blog): static search trees are faster than binary search.

References

Beeloo, Rick, Ragnar Groot Koerkamp, Xiu Jia, Marian J. Broekhuizen-Stins, Lieke van IJken, Els M. Broens, Aldert Zomer, and Bas E. Dutilh. 2025. “Barbell Resolves Demultiplexing and Trimming Issues in Nanopore Data,” October. https://doi.org/10.1101/2025.10.22.683865.
Beeloo, Rick, and Ragnar Groot Koerkamp. 2025. “Sassy: Searching Short DNA Strings in the 2020s,” July. https://doi.org/10.1101/2025.07.22.666207.
Constantinides, Bede, John Lees, and Derrick W Crook. 2025. “Deacon: Fast Sequence Filtering and Contaminant Depletion,” June. https://doi.org/10.1101/2025.06.09.658732.
Groot Koerkamp, Ragnar. 2024. “A*PA2: Up to 19× Faster Exact Global Alignment.” In WABI 2024, 312:17:1–17:25. LIPIcs. https://doi.org/10.4230/LIPIcs.WABI.2024.17.
———. 2025. “PtrHash: Minimal Perfect Hashing at RAM Throughput.” In SEA 2025, 338:21:1–21:21. LIPIcs. https://doi.org/10.4230/LIPIcs.SEA.2025.21.
Groot Koerkamp, Ragnar, and Pesho Ivanov. 2024. “Exact Global Alignment Using A* with Chaining Seed Heuristic and Match Pruning.” Edited by Tobias Marschall. Bioinformatics 40 (3). https://doi.org/10.1093/bioinformatics/btae032.
Groot Koerkamp, Ragnar, and Igor Martayan. 2025. “SimdMinimizers: Computing Random Minimizers, Fast.” In SEA 2025, 338:20:1–20:19. LIPIcs. https://doi.org/10.4230/LIPIcs.SEA.2025.20.
Patro, Rob, Siddhant Bharti, Prajwal Singhania, Rakrish Dhakal, Thomas J. Dahlstrom, and Ragnar Groot Koerkamp. 2025. “Mim: A Lightweight Auxiliary Index to Enable Fast, Parallel, Gzipped Fastq Parsing,” November. https://doi.org/10.1101/2025.11.24.690271.