# Neighbor joining

Neighbor joining (NJ, paper) is a phylogeny reconstruction method. It differs from UPGMA in the way it computes the distances between clusters.

This algorithm first assumes that the phylogeny is a star graph. Then it finds the pair of vertices that when merged and split out gives the minimal total edge length $$S_{ij}$$ of the new almost-star graph. (See eq. (4) and figure 2a and 2b in the paper.) $S_{i,j} = \frac1{2(n-2)} \sum_{k\not\in \{i,j\}}(d(i, k)+d(j,k)) + \frac 12 d(i,j)+\frac 1{n-2} \sum_{k<l,\, k, l\not\in\{i,j\}}d(k,l).$ After subtracting the sum of all pairwise distances (which is a constant) and multiplying by $$2(n-2)$$, we obtain the familiar $Q(i, j) = (n-2) d(i, j) - \sum_{k=1}^n d(i, k) - \sum_{k=1}^n d(j, k).$ Thus, we merge the two vertices that minimize $$Q$$. The distance from the merging of vertices $$i$$ and $$j$$ to each other vertex $$k$$ is $$d_{(i-j)k} = (d_{i,k} + d_{j,k})/2$$.

NJ also estimates branch lengths. See the paper for details.

Input
Matrix of pairwise distances
Output
Phylogeny
Algorithm
Repeatedly merge the nearest two clusters using metric $$Q$$ defined above.
Complexity
$$O(n^3)$$. Faster is possible using heuristics.