Glossary

This is a growing list of ambiguous terms and their definitions. More of a place to store random remarks than a complete reference for now.

diagonal transition
name introduced by Navarro (2001)
approximate
  1. approximate algorithm: an algorithms that does not always give the correct answer.

  2. $k$-approximate string matching: variant semi-global alignment where we find all matches of a pattern in a reference with at most \(k\) mistakes.

    Also approximate string matching: alternative name for global pairwise alignment.

dynamic heuristic
A heuristic that changes as the A* progresses. Not: online heuristic.
heuristic
  1. A* heuristic: A function \(h\) that provides a lower bound (when admissible) on the distance from the current state to the end.
  2. heuristic method: An approximate method, that approximates the exact answer.
exact
  1. An exact algorithm is guaranteed to give the correct (e.g. minimal) answer.
  2. An exact match between strings, without errors.
optimal
  1. exact, guaranteed correct.
  2. optimal performance/complexity: as fast as (theoretically) possible
complexity
  1. (asymptotic) runtime complexity: how the runtime of an algorithm scales with input size, as in \(O(n^2)\).
  2. not simple, but difficult: a more informal statement.
vertex / node / state
All are used interchangeably, but have slightly different meanings:
  • vertex: The objects of a graph between which the edges go. The most ‘mathematical’/precise of the three.
  • node: Same as vertex, but can additionally mean a node in a data structure or a node in a tree. More general and hence slightly less precise when vertex could also be used.
  • state: Usually in relation to a state machine.

In A*PA, we use state exclusively for a vertex of the edit-graph, and use node instead of vertex, mostly because it’s shorter.

seed
  1. An arbitrary $k$mer, as in MinHash and spaced $k$mer methods.
  2. A substring of \(A\) when doing pairwise alignment.
  3. As in seed-and-extend, a match: a substring of \(A\) that matches somewhere to a substring of \(B\).
significant
  1. a significant result: the result passed a statistical test
  2. a significant speedup: (informal) much faster
average / expected / mean
  • mean: a statistic of a set of real numbers.
  • average: the mean, or ‘common case’ of some sample.
  • expected: the expected value of a random variable.

TODO

  • Terms related to reads:
    • ONT
    • chemistry
    • base caller
  • Terms related to Conda:
    • conda
    • anaconda
    • miniconda
    • forge
    • mambaforge?
    • miniforge?
  • in silico
  • de novo
  • seed / match

References

Navarro, Gonzalo. 2001. “A Guided Tour to Approximate String Matching.” Acm Computing Surveys 33 (1): 31–88. https://doi.org/10.1145/375360.375365.