Tricks with match bonus or how to fool Dijkstra’s limitations Link to heading
The reader is assumed to have basic knowledge about pairwise alignment and graph theory.
In this post we will show how to transform traditional alignment score models that include a match
bonus into an equivalent cost model using only non-negative costs. We generalize the relationship
in Eizenga and Paten (2022) and show that it applies to both affine and asymmetric cost
models. Unlike their method, our method does not increase the base penalties by a factor
Tl;dr: Negative match costs can be transformed into an equivalent non-negative cost model. This allows WFA to solve such instances of global alignment, and may provide up to 2x speedup over an earlier transformation.
Edit graph Link to heading
Figure 1: The edit graph with match cost (M=0), mismatch cost (X), insertion cost (I), and deletion cost (D).
The default global pairwise alignment task can be considered as the task of finding a shortest path in the edit graph, as shown in Figure 1.
Edges in this graph correspond to the cost you need to ‘pay’ for an insertion
(cost
Figure 2: Edit graph with negative match cost (M=-B).
In some cases, a negative match cost
Algorithms Link to heading
Let’s briefly recall what algorithms can be used to compute the shortest path in an
edit graph. (Note:
- Needleman-Wunsch:
approach based on dynamic programming that works for any cost model. - Exponential band:
by doubling the searched band on each iteration. Only for non-negative costs. - Dijkstra:
when used with a bucket-queue. Only works for non-negative costs. - WFA: A more efficient variant of Dijkstra,
worst case and expected case. Only works for non-negative edges.
As you can see, only the Needleman-Wunsch DP can handle the negative cost edges
created by introducing a match bonus.11 That’s because NW visits all vertices
in topologically sorted order, and hence considers all possible path to each
vertex.
Potentials Link to heading
A classic variant of Dijkstra’s algorithm that can handle negative-cost edges is
Johnson’s algorithm (Johnson 1977). The idea is to give a potential Note
that most sources use
It turns out
In our case, the edit graph is weighted, directed, without cycles, and “all roads lead to Rome” in it – any way you can take in the edit graph will end in the target point to which we are finding a shortest path. To get to this terminus we need to cross the graph both vertically and horizontally as we go from the upper left corner to the bottom right one. All paths take the same number of steps down, and the same number of steps to the right, where an insertion is one step down, a deletion is one step right, and a match and mismatch are counted as one step down and one step right.
Figure 3: Edit graph with transformed non-negative edges with (Delta_D + Delta_I = B), so that (M’=0).
We can increase the cost of each horizontal step by
Now, a simple way to get rid of negative match edges is to simply require that
Figure 4: Coordinate grid for states/vertices.
Let
Indeed, the cost of a deletion goes up by
We would like all edges to have a non-negative cost, so our choice of
Multiple variants Link to heading
The above equations give us some flexibility in choosing
There are a few natural choices of
Type | ||||||
---|---|---|---|---|---|---|
Symmetric | ||||||
Expensive deletions | ||||||
Expensive insertions | ||||||
Free deletions | ||||||
Free insertions |
The symmetric option in the first row is the most natural choice, and roughly
corresponds to the transformation suggested in Eizenga and Paten (2022). It differs in
that all costs are divided by
The bottom two rows are even applicable when matches are already free (
Some notes on algorithms Link to heading
WFA Link to heading
For WFA this cost transformation is a life-saver because with
Thus, we expect that the lower the potential of the target state, the faster the WFA algorithm runs.
A* Link to heading
The A* algorithm by itself can not handle negative costs. However, a heuristic
function
This works in our case because it ensures that
Extending to different cost models Link to heading
Affine costs Link to heading
The potentials defined above naturally extend to affine costs. Each state in an affine layer
naturally corresponds to a state
This means that the delete-extend cost increases with
Substitution matrices Link to heading

Figure 5: The BLOSUM matrix. CC BY-SA 4.0 via Wikipedia.
The BLOSUM matrix (Figure 5) specifies a match score for each pair of amino acids, with some entries being positive (indicating similarity) and some being negative, so we are maximizing the score. This can be transformed into a cost model by simply negating all scores, which allows us the previous techniques.
In general, let I’m using
But not local alignment Link to heading
We will stress here that this idea does not work for local alignment.
The reason it works for global alignment is that the cost of each path is
increased by the same amount. Two local alignments can have completely different
lengths and thus span a different number of rows/columns of the table. That
means that the cost increase
Evaluations Link to heading
Now let’s see how WFA performs when using different transformation variants.
We will use the following cost model:
- insertion cost - deletion cost - substitution cost - match bonus
Unequal string length Link to heading
Firstly, let’s make an experiment on sequences where the first sequence is
longer than the second, i.e.
Given the match bonus, we can take different values for The author’s laptop was made in
times of the Second Punic War, so he decided not to take very long sequences to
save his working station. Also, the code is not optimized in the first
place. Look at relative timings only.
[TODO: Also compare to the transformation in Eizenga and Paten (2022).]
Cost | Time (ms) | #expanded | ||||
---|---|---|---|---|---|---|
260 | 199 | -275 | 26 | 92628 | ||
260 | 199 | -275 | 11 | 48895 | ||
260 | 199 | -275 | 6 | 42610 | ||
260 | 199 | -275 | 26 | 97477 |
Note: the last row in the table depictes experiment on cost model with match bonus introduced in Eizenga and Paten (2022), so
The numbers are eloquent enough, so let’s look at the pixels. As you can see the second sequence has a suffix of the first one deleted.
![]() | ![]() | ![]() |
We can clearly see that increasing
Equal string lengths Link to heading
Let’s see how things work out in case of equal length strings, where the total
score will be the same independent of how we choose
N | M | Cost | Time (ms) | #expanded | ||
---|---|---|---|---|---|---|
200 | 200 | -323 | 4 | 45920 | ||
200 | 200 | -323 | 3 | 34466 | ||
200 | 200 | -323 | 4 | 46020 | ||
200 | 200 | -323 | 6 | 68669 |
Note: the last row in the table depictes experiment on cost model with match bonus introduced in Eizenga and Paten (2022), so
In this case, we can see in the experiments that it is preferred to split the
match bonus equally between
![]() | ![]() | ![]() |
The middle picture clearly visits fewer states, and seems to benefit from being pushed towards the end, without needless exploration to the sides.
Conclusion Link to heading
We have shown that alignment score/cost models that include a match bonus can be transformed into an equivalent cost model using only non-negative costs when doing global alignment. This transformation even works for affine and asymmetric costs. It avoids doubling costs (to preserve integral values) by distributing an odd match bonus unevenly over the horizontal and vertical steps in the edit graph.
To summarize: when the cost model is given by match cost
References Link to heading
That’s because NW visits all vertices in topologically sorted order, and hence considers all possible path to each vertex. ↩︎
Note that most sources use
with an opposite sign. We find our choice to be more natural though: think of the potential as potential energy (height). When going up, you pay extra for this, and can use this energy/cost reduction later when going down. ↩︎I’m using
instead of to indicate that this is a score instead of a cost. ↩︎The author’s laptop was made in times of the Second Punic War, so he decided not to take very long sequences to save his working station. ↩︎
Also, the code is not optimized in the first place. Look at relative timings only. ↩︎