Software
For all my projects, feel free to create issues and/or reach out for help in using them. My work focuses more on algorithm development than on direct bioinformatics applications, so I appreciate hearing from potential users :)
Foundational Crates:
- packed_seq: Slowly growing library for managing 2-bit encoded ACTG DNA sequences.
- simd-minimizers: SIMD-based implementation of random minimizers.
  - Builds on packed_seq.
  - No good support yet for non-ACTG characters.
- cacheline_ef: Elias-Fano encoding, one cacheline at a time.
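
To illustrate what "2-bit encoded ACTG DNA sequences" means, here is a minimal sketch that packs four bases into each byte. The function names `encode_base`, `pack`, and `get` are hypothetical and are not the packed_seq API; the code assignment A=0, C=1, T=2, G=3 is an assumption chosen to follow the ACTG order used above.

```rust
/// Map an ASCII base to a 2-bit code (assumed mapping: A=0, C=1, T=2, G=3).
/// Returns None for any other character.
fn encode_base(b: u8) -> Option<u8> {
    match b {
        b'A' | b'a' => Some(0),
        b'C' | b'c' => Some(1),
        b'T' | b't' => Some(2),
        b'G' | b'g' => Some(3),
        _ => None,
    }
}

/// Pack a sequence into bytes, four bases per byte, lowest bits first.
/// Returns None if the input contains a non-ACTG character.
fn pack(seq: &[u8]) -> Option<Vec<u8>> {
    let mut packed = vec![0u8; (seq.len() + 3) / 4];
    for (i, &b) in seq.iter().enumerate() {
        let code = encode_base(b)?;
        packed[i / 4] |= code << (2 * (i % 4));
    }
    Some(packed)
}

/// Read base `i` back out of the packed representation as an ASCII character.
fn get(packed: &[u8], i: usize) -> u8 {
    const BASES: [u8; 4] = *b"ACTG";
    let code = (packed[i / 4] >> (2 * (i % 4))) & 3;
    BASES[code as usize]
}
```

With only two bits per base there is no room for a fifth symbol, which is why input containing characters such as N needs separate handling.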
Libraries & Tools:
- A*PA2: Global pairwise alignment based on SIMD, bitpacking, and band-doubling.
  - Reliable, but only supports ACTG input.
- PtrHash: A fast minimal perfect hash function.
  - Reliable, but randomized construction remains slightly annoying.
- simd-sketch: SIMD-based bottom and bucket sketches.
  - Status: basic version done; could use polishing.
  - Builds on simd-minimizers.
- Sassy: SIMD-based approximate string matching.
  - Status: in development.
  - Search short (~32, or up to 1kbp) patterns in long texts.
  - Supports ACTG, IUPAC, and ASCII.
Experimental Research Projects:
- Minimizers: reference implementations and experiments for minimizer and sampling schemes.
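
For readers unfamiliar with random minimizers, the following is a naive reference sketch, not code from any of the projects above. For every window of `w` consecutive k-mers it selects the leftmost k-mer with the smallest hash. The names `hash_kmer` and `minimizer_positions` are hypothetical, and FNV-1a is used only as a simple stand-in for a random order on k-mers.

```rust
/// Hash a k-mer with FNV-1a. This is a stand-in for a random order on
/// k-mers and is an assumption of this sketch.
fn hash_kmer(kmer: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in kmer {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Return the start positions of the selected k-mers, in increasing order.
/// Naive O(n * w) version: each window is scanned from scratch.
fn minimizer_positions(seq: &[u8], k: usize, w: usize) -> Vec<usize> {
    let mut out = Vec::new();
    if k == 0 || w == 0 || seq.len() < k + w - 1 {
        return out;
    }
    // Hash every k-mer once.
    let hashes: Vec<u64> = seq.windows(k).map(hash_kmer).collect();
    for (start, win) in hashes.windows(w).enumerate() {
        // Find the leftmost minimum in this window of w k-mer hashes.
        let mut best = 0;
        for (j, &h) in win.iter().enumerate() {
            if h < win[best] {
                best = j;
            }
        }
        let pos = start + best;
        // Adjacent windows often share a minimizer; record it only once.
        if out.last() != Some(&pos) {
            out.push(pos);
        }
    }
    out
}
```

Because every window contributes a position, consecutive selected positions are at most `w` apart. SIMD implementations aim to produce this kind of output far faster than the per-window scan shown here.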