This is a growing list of ambiguous terms and their definitions. For now, it is more a place to store random remarks than a complete reference.
- diagonal transition
- name introduced by Navarro (2001)
- approximate
- approximate algorithm: an algorithm that does not always give the correct answer.
- \(k\)-approximate string matching: a variant of semi-global alignment that finds all matches of a pattern in a reference with at most \(k\) errors.
- approximate string matching: also used as an alternative name for global pairwise alignment.
- dynamic heuristic
- A heuristic that changes as the A* search progresses. Not: online heuristic.
- heuristic
- A* heuristic: A function \(h\) that provides a lower bound (when admissible) on the distance from the current state to the end.
- heuristic method: A method that approximates the exact answer, without a guarantee of correctness.
- exact
- An exact algorithm is guaranteed to give the correct (e.g. minimal) answer.
- An exact match between strings, without errors.
- optimal
- exact, guaranteed correct.
- optimal performance/complexity: as fast as (theoretically) possible
- complexity
- (asymptotic) runtime complexity: how the runtime of an algorithm scales with input size, as in \(O(n^2)\).
- not simple, but difficult: the informal, everyday meaning.
- vertex / node / state
- All are used interchangeably, but have slightly different meanings:
- vertex: The objects of a graph between which the edges go. The most ‘mathematical’/precise of the three.
- node: Same as vertex, but can additionally mean a node in a data structure or a node in a tree. More general and hence slightly less precise when vertex could also be used.
- state: Usually in relation to a state machine.
In A*PA, we use state exclusively for a vertex of the edit-graph, and use node instead of vertex, mostly because it’s shorter.
- seed
- An arbitrary \(k\)-mer, as in MinHash and spaced \(k\)-mer methods.
- A substring of \(A\) when doing pairwise alignment.
- As in seed-and-extend, a match: a substring of \(A\) that matches a substring of \(B\).
- significant
- a significant result: the result passed a statistical test
- a significant speedup: (informal) much faster
- average / expected / mean
- mean: a statistic of a set of real numbers.
- average: the mean, or ‘common case’ of some sample.
- expected: the expected value of a random variable.
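
As a rough illustration of the two string-matching meanings listed under approximate above, the following Python sketch computes both with the same unit-cost dynamic program. The only difference is the first row: global alignment charges for skipping a prefix of the reference, while the semi-global variant lets a match start anywhere for free. The function names and the convention of reporting 1-based end positions are assumptions made for this example, not taken from any particular tool.

```python
def _last_row(pattern, text, first_row):
    """Fill the unit-cost edit-distance table row by row; return the last row."""
    prev = first_row
    for i, cp in enumerate(pattern, 1):
        cur = [i]  # aligning pattern[:i] against an empty text costs i deletions
        for j, ct in enumerate(text, 1):
            cur.append(min(
                prev[j] + 1,               # skip a pattern character
                cur[j - 1] + 1,            # skip a text character
                prev[j - 1] + (cp != ct),  # match (cost 0) or substitution (cost 1)
            ))
        prev = cur
    return prev


def edit_distance(a, b):
    """Global pairwise alignment: all of a against all of b."""
    return _last_row(a, b, list(range(len(b) + 1)))[-1]


def k_approximate_matches(pattern, text, k):
    """Semi-global alignment: 1-based end positions in text where the
    pattern matches some substring with at most k errors."""
    # An all-zero first row means a match may start anywhere in text for free.
    last = _last_row(pattern, text, [0] * (len(text) + 1))
    return [j for j, d in enumerate(last) if d <= k]


print(edit_distance("abc", "xxabcxx"))             # global: 4
print(k_approximate_matches("abc", "xxabcxx", 0))  # semi-global, k = 0: [5]
print(k_approximate_matches("abc", "xxabcxx", 1))  # semi-global, k = 1: [4, 5, 6]
```

For the pattern abc in the reference xxabcxx, the semi-global variant finds an exact occurrence ending at position 5, whereas the global edit distance between the two full strings is 4, since the four surrounding characters must be paid for.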
TODO
- Terms related to reads:
- ONT
- chemistry
- base caller
- Terms related to Conda:
- conda
- anaconda
- miniconda
- forge
- mambaforge?
- miniforge?
- in silico
- de novo
- seed / match
References
Navarro, Gonzalo. 2001. “A Guided Tour to Approximate String Matching.” ACM Computing Surveys 33 (1): 31–88. https://doi.org/10.1145/375360.375365.