Difference between revisions of "161-A2.1"
Revision as of 20:19, 17 September 2020
- Activity: Source Coding
- Instructions: In this activity, you are tasked to
- Walk through the examples.
- Write a short program to compress and decompress a redundant file.
- Should you have any questions, clarifications, or issues, please contact your instructor as soon as possible.
Uniquely Decodable Codes
Let us try to encode a source with just four symbols in its alphabet, i.e. <math>S = \{s_1, s_2, s_3, s_4\}</math>, with probability distribution <math>P = \{0.5, 0.25, 0.125, 0.125\}</math>. We can calculate the entropy of this source as:
<math>H = -\sum_{i=1}^{4} p_i \log_2 p_i = 0.5(1) + 0.25(2) + 0.125(3) + 0.125(3) = 1.75\ \text{bits/symbol}</math>
(1)
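As a quick check, the entropy above can be computed directly from the probability distribution (a minimal sketch in Python; the probabilities are taken from the table below):

```python
# Entropy of the four-symbol source: H = -sum(p * log2(p)).
from math import log2

probs = [0.5, 0.25, 0.125, 0.125]
H = -sum(p * log2(p) for p in probs)
print(H)  # 1.75
```

Since every probability here is a power of two, the result is exactly 1.75 bits per symbol.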
Let us look at a few bit sequence assignments and see if they are uniquely decodable or not.
| Symbol | Probability | Code 1 | Code 2 | Code 3 | Code 4 | Code 5 |
|---|---|---|---|---|---|---|
| <math>s_1</math> | 0.5 | 0 | 0 | 0 | 0 | 00 |
| <math>s_2</math> | 0.25 | 0 | 1 | 10 | 01 | 01 |
| <math>s_3</math> | 0.125 | 1 | 00 | 110 | 011 | 10 |
| <math>s_4</math> | 0.125 | 10 | 11 | 111 | 0111 | 11 |
| Average Code Length: | | 1.125 | 1.25 | 1.75 | 1.875 | 2 |
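The average code lengths in the last row are each a probability-weighted sum of codeword lengths, which we can verify with a short script (codewords copied from the table above):

```python
# Average code length: L = sum(p_i * len(codeword_i)) for each candidate code.
probs = [0.5, 0.25, 0.125, 0.125]
codes = {
    "Code 1": ["0", "0", "1", "10"],
    "Code 2": ["0", "1", "00", "11"],
    "Code 3": ["0", "10", "110", "111"],
    "Code 4": ["0", "01", "011", "0111"],
    "Code 5": ["00", "01", "10", "11"],
}
for name, words in codes.items():
    avg = sum(p * len(w) for p, w in zip(probs, words))
    print(name, avg)  # 1.125, 1.25, 1.75, 1.875, 2.0
```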
Recall that Shannon's Noiseless Coding Theorem states that <math>\bar{L} \geq H</math>, where <math>\bar{L}</math> is the average code length. Thus, codes 1 and 2 are not uniquely decodable since they have average code lengths less than <math>H = 1.75</math> bits.
For code 1, since two symbols share the same codeword (<math>0</math>), we can say that it is not a distinct code, and therefore not uniquely decodable. We can check code 2 by taking a sequence of symbols encoded using this code: <math>0011</math>. We can easily see that this message can be decoded in different ways, either as <math>0, 0, 1, 1</math>, or <math>00, 11</math>.
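The ambiguity in code 2 can be demonstrated by encoding both parses and observing that they produce the same bit string (a sketch; the symbol names s1..s4 are placeholders, not from the original):

```python
# Two different symbol sequences that encode to the same bit string under
# code 2 (0, 1, 00, 11) -- hence the code is not uniquely decodable.
CODE2 = {"s1": "0", "s2": "1", "s3": "00", "s4": "11"}

def encode(msg):
    return "".join(CODE2[s] for s in msg)

msg_a = ["s1", "s1", "s2", "s2"]  # parse as 0, 0, 1, 1
msg_b = ["s3", "s4"]              # parse as 00, 11
print(encode(msg_a), encode(msg_b))  # both print "0011"
```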
Code 3 has one of the most desirable properties of a code: its average length equals the entropy of the source, and according to Shannon, you cannot have a uniquely decodable code with an average length shorter than this. We can also see that no codeword in code 3 is a prefix of any other, so for a sequence of symbols encoded using code 3, there is only one way to decode the message; thus it is uniquely decodable.
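Because no codeword of code 3 is a prefix of another, a greedy left-to-right decoder works: the first codeword that matches at each position is the only possible one. A minimal sketch (symbol names s1..s4 are placeholders):

```python
# Greedy decoder for the prefix-free code 3 (0, 10, 110, 111).
CODE3 = {"0": "s1", "10": "s2", "110": "s3", "111": "s4"}

def decode_prefix(bits, table=CODE3):
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in table:          # a match is guaranteed to be the whole codeword
            out.append(table[buf])
            buf = ""
    if buf:
        raise ValueError("truncated message")
    return out

print(decode_prefix("01011110"))  # ['s1', 's2', 's4', 's2']
```

Each symbol is recognized the instant its last bit arrives, which is why prefix-free codes are also called instantaneous codes.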
Code 4 is also uniquely decodable, but unlike code 3 it is not instantaneous: every codeword begins with <math>0</math>, so a <math>0</math> marks the start of a new codeword, and the decoder must wait for the next <math>0</math> (or the end of the message) before it can emit the current symbol.
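This deferred decoding can be sketched as follows (assuming a well-formed message; symbol names s1..s4 are placeholders):

```python
# Decoder for code 4 (0, 01, 011, 0111). A "0" signals the start of a new
# codeword, so each symbol is only emitted once the next codeword begins --
# uniquely decodable, but not instantaneous.
CODE4 = {"0": "s1", "01": "s2", "011": "s3", "0111": "s4"}

def decode_code4(bits):
    out, buf = [], ""
    for b in bits:
        if b == "0" and buf:      # next codeword starts: emit the buffered one
            out.append(CODE4[buf])
            buf = ""
        buf += b
    out.append(CODE4[buf])        # flush the final codeword at end of message
    return out

print(decode_code4("0011011"))  # ['s1', 's3', 's3']
```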