Uniquely Decodable Codes
Contents
Special Case: Prefix-Free Codes
The Kraft-McMillan Inequality
The Kraft-McMillan inequality provides (1) a necessary condition for unique decodability, and (2) a sufficient condition for the existence of uniquely decodable codes.
- (Necessity.) Let be a -ary uniquely decodable code whose codewords have lengths . Then,
- (Sufficiency.) Let be a sequence of positive integers such that . Then, there exists a uniquely decodable code whose codewords have lengths .
Since all instantaneous/prefix-free codes are uniquely decodable, then the above statements are also true if we replace "uniquely decodable" by "instantaneous." Let us first prove the statements for prefix-free codes, and then we discuss the more general case of uniquely decodable codes.
Proof for Prefix-Free Codes
Prefix-free codes can be visualized as a -ary tree, in which all non-leaf nodes have at most children. In the figure below, we have a binary (2-ary) prefix-free code where each "left turn" corresponds to 0
and each "right turn" corresponds to 1
.
Since no codeword can be a prefix of another codeword, the following (equivalent) properties must be satisfied:
- no codeword can be a descendant of another codeword, and
- no two codewords can share a descendant in the full binary tree.
The first property is a direct consequence of being prefix-free. Essentially, it tells us that once we have assigned a node in the tree as a codeword, all of its descendants in the full binary tree are denied the possibility of being a codeword. The figure illustrates this property using the dashed circles below solid black nodes. The second property requires a bit more thought: if a tree node has codewords A and B are ancestors, then either A is a descendant of B or B is a descendant of A. The proof of this fact is left as an exercise.
We are now ready to prove the necessity condition for prefix-free codes. Let be the maximum codeword length in . Construct a full binary tree with depth . At the lowest level, there are leaf nodes. For example, in the figure we have and , which corresponds to leaf nodes.
Each codeword will have some descendants at the lowest level. In the figure, the codeword 01
is at level 2 and it has 4 descendants at the lowest level. In general, a D-ary codeword at level will have descendants at the lowest level, . (If the codeword is itself at the lowest level, then we count that codeword as its own descendant.) Now, since no two codewords can share a descendant, we can obtain the total number of distinct descendants by simply adding them up. The resulting total must be at most equal to number of leaf nodes, which is gives us Kraft's inequality.
Proof for Uniquely Decodable Codes
The proof of the Kraft-McMillan Inequality for uniquely decodable codes is interesting since it starts with evaluating :
-
(2)
-
Let . Thus, the minimum value of is , when all the codewords are 1 bit long, and the maximum is , when all the codewords have the maximum length. We can then write:
-
(3)
-
Where is the number of combinations of codewords that have a combined length of . Note that the number of distinct codewords of length is . If this code is uniquely decodable, then each sequence can represent one and only one sequence of codewords. Therefore, the number of possible combinations of codewords whose combined length is cannot be greater than , or:
-
(4)
-
We can then write:
-
(5)
-
Thus, we can conclude that since if this were not true, would exceed for large .