Huffman's algorithm is a method for building an extended binary tree with a minimum weighted path length from a set of given weights. Initially construct a forest of singleton trees, one associated with each weight. If there are at least two trees, choose the two trees with the least weight associated with their roots and replace them with a new tree, constructed by creating a root node whose weight is the sum of the weights of the roots of the two trees removed, and setting the two trees just removed as this new node's children. This process is repeated until the forest consists of one tree.
Let us work through an example for the set of weights
. In these intermediate steps we will display the temporary weights assigned by the algorithm in the circled nodes. Initially our forest is
During the first step, the two trees with weights 1 and 2 are merged, to create a new tree with a root of weight 3.
We now have three trees with weights of 3 at their roots. It does not matter which two we choose to merge. Let us choose the tree we just created and one of the singleton nodes of weight 3.
Now our two minimum trees are the two singleton nodes of weights 3 and 4. We will combine these to form a new tree of weight 7.
Finally we merge our last two remaining trees.
The result is an extended binary tree of minimum weighted path length 29. In the following diagram each circled node is marked with its weighted path length.
Each iteration of Huffman's algorithm reduces the size of the problem by 1, and so there are exactly iterations. The th iteration consists of locating the two minimum values in a list of length . This is a linear operation, and so Huffman's algorithm clearly has a time complexity of
.
However, it would be faster to sort the weights initially, and then maintain two lists. The first list consists of weights that have not yet been combined, and the second list consists of trees that have been formed by combining weights. This initial ordering is obtained at a cost of
. Obtaining the minimum two trees at each step then consists of two comparisons (compare the heads of the two lists, and then compare the larger to the item after the smaller). The ordering of the second list can be maintained cheaply by using a binary search to insert new elements. Since at step there are elements in the second list,
comparisons are needed for insertion. Over the entire duration of the algorithm the cost of keeping this list sorted is
. Therefore the overall time complexity of Huffman's algorithm is
.
In terms of space complexity, the algorithm constructs a complete binary tree with exactly leaves. Therefore the output can only have at most nodes. Thus Huffman's algorithm requires linear space.
