This note summarizes some papers I was reading while investigating the history
of A* for pairwise alignment, and related to that the first usage of a *bucket
queue*. Schrijver (2012) provides a nice overview of general shortest path methods.

## Shortest path algorithms ..

### .. in general

Moore (1959) was already presented in 1957. I did not find a PDF of this paper but Schrijver (2012) summarizes it well: For unit-cost graphs it presents an \(O(m)\) BFS algorithm, and for general weighted graphs an \(O(mn)\) algorithm.

Dijkstra (1959) introduces the classic \(O(n^2)\) algorithm of repeatedly

*expanding*the*open*vertex with shortest distance from the start (using more modern terminology), and recommends storing the shortest distance to each node together with the node.This was later improved to \(O(n\lg n)\) using e.g. a priority queue.

### .. for circuit design

- Lee (1961) introduces an algorithm for layout our electrical wires on a
chip. It is very similar to Dijkstra’s classical algorithm (Dijkstra 1959),
but stores distances together with the queue/set of open states, rather than a
separate dictionary. It’s also slightly more general by using a
*vector*of cost functions \((f_1, \dots, f_r)\), and allowing path-dependent costs. - Hart, Nilsson, and Raphael (1968; Hart, Nilsson, and Raphael 1972) introduces A*.
- Rubin (1974, 1976) builds on Lee (1961), and among others:
- It notes that Dijkstra’s algorithm (storing costs in states) can be used to speed up of Lee’s algorithm.
- It uses A* with Manhattan distance heuristic.
- It uses DFS on states with the same $f$-value.
- Somewhat confusingly, this paper lists Hitchner (1968) in its references, but doesn’t actually mention it in the text.

## Bucket queues

Dial (1969) is listed as the first introduction of a bucket queue on Wikipedia and uses linked lists of nodes with the same distance, again wrapping around at the maximum arc length.

Hitchner (1968) writes about storing open states keyed by distance (aka a

*bucket queue*), and attributes it to Peter S. Loubal:Rather than storing temporary label values in the storage locations for each node number, it stores node numbers in a table in which the row number of the table is the value of the label.

Further, he writes

Also, it is possible to save space by using a “wrap-around” table which has as many rows as the number of units in the longest arc length plus 1.

I tried finding some paper by Loubal that explains it, but unsuccessfully.

Hoel (1976) applies the bucket queue of Hitchner (1968) to Lee (1961):

by storing frontier cells in an array of stacks rather than a single list, costly searching operations can be eliminated without significantly increasing storage requirements.

It uses an array of stacks, again wrapping around at the maximum arc length. It also discusses using linked lists instead of arrays for the stacks.

## Shortest path algorithms by Hadlock

It was only this week that I became aware of Hadlock’s papers, apparently having missed citations to his work in (Spouge 1989, 1991). Sadly, some of his papers are hard to find online.

### Grid graphs

Hadlock (1977) introduces a

*detour*based algorithm that is essentially equivalent to A*. He uses the Manhattan distance as a heuristic, and defines the detour of an edge as the detour cost^{1}\(d(u, v) = f(v) - f(u) = edgecost(u,v) + h(v) - h(u)\). Like normal A*, this is equivalent to using the heuristic as a*potential*and updating edge costs accordingly.On grid graphs, the detour cost of any path is integer and all \(d(u,v)\) are either \(0\) (when moving towards the target) or \(2\) (when moving away). Hence it suffices to do a BFS, starting with vertices with \(0\) detour, than \(2\) detour, and so on. This is easily implemented using two stacks or a double ended queue.

Hadlock (1979) extends the

*Minimum Detour Algorithm*to non-unit costs. As an application, it introduces a heuristic for paths with a minimal number of turns.

### Strings

Hadlock (1988b): This was presented at a conference. I have not found in online, but requested the ETH library to scan the physical proceedings.

According to papers that cite it, this seems to be the first application of A* to the alignment graph. The heuristics used are the

*gap-cost*and a heuristic based on character frequencies.Hadlock (1989): This is a preprint/submitted paper cited by Hadlock (1988a) and Spouge (1989), but I have not been able to find any mention of it elsewhere at all.

Hadlock (1988a) uses different substitution costs \(s(a,b) = |ord(a)-ord(b)|\) on the edit graph, where the cost of substituting a character depends on the distance between the letters. It again presents the minimum detour algorithm, this time with an upper bound \(\tau\) on the cost of paths searched. It then introduces a specific new heuristic based on character frequencies for the new setting of substitution costs.

## Spouge’s computational volumes

- Spouge (1989) introduces
*computational volumes*: a subset of states of the edit graph that is guaranteed to contain all shortest paths. The main observation is that A* is slow because of bookkeeping (both the distance to each explored state and the frontier) and maintaining of a queue. Computing the states of a computational volume can be done much faster, since there is a natural order to expand the vertices (by row, column, or anti-diagonal). - Spouge (1991) provides a detailed algorithm implementation of the computational volume method and benchmark it.

## References

*Communications of the Acm*12 (11): 632–33. https://doi.org/10.1145/363269.363610.

*Numerische Mathematik*1 (1): 269–71. https://doi.org/10.1007/bf01386390.

*Networks*7 (4): 323–34. https://doi.org/10.1002/net.3230070404.

*Information Linkage between Applied Mathematics and Industry*, 421–30. Elsevier.

*Proceedings of the First International Conference on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems - Iea/Aie ’88*. https://doi.org/10.1145/55674.55676.

*Congressus Numerantium*61: 263–74.

*Acm Sigart Bulletin*, no. 37 (December): 28–29. https://doi.org/10.1145/1056777.1056779.

*Ieee Transactions on Systems Science and Cybernetics*4 (2): 100–107. https://doi.org/10.1109/tssc.1968.300136.

*Ieee Transactions on Computers*C–25 (1): 19–24. https://doi.org/10.1109/tc.1976.5009200.

*Ieee Transactions on Electronic Computers*EC-10 (3): 346–65. https://doi.org/10.1109/tec.1961.5219222.

*The Shortest Path through a Maze*. Bell Telephone System. Technical Publications. Monograph. Bell Telephone System.

*Ieee Transactions on Computers*C–23 (9): 907–14. https://doi.org/10.1109/t-c.1974.224054.

*Ieee Transactions on Computers*C–25 (2): 208. https://doi.org/10.1109/tc.1976.5009239.

*Optimization Stories*, January, 155–67. https://doi.org/10.4171/dms/6/19.

*Siam Journal on Applied Mathematics*49 (5): 1552–66. https://doi.org/10.1137/0149094.

*Bioinformatics*7 (1): 1–7. https://doi.org/10.1093/bioinformatics/7.1.1.

I’m leaving out a factor \(2\) here. ↩︎