Difference between revisions of "Channel coding theorem"
Adrian Vidal (talk | contribs) |
Adrian Vidal (talk | contribs) |
||
Line 1: | Line 1: | ||
== Channel capacity == | == Channel capacity == | ||
+ | === Motivation === | ||
+ | Formally, a channel is defined as a stochastic system which takes an input random variable <math>X</math> and an output random variable <math>Y</math> with a specified conditional probability distribution, <math>p(Y|X)</math>. Ideally, in a data transmission or storage systems, channels are noiseless which means that <math>X</math> is completely determined by <math>Y</math>, which implies that <math>H(X|Y) = 0</math>. However, most channels will exhibit some level of randomness and potentially one value <math>Y=y</math> may come from more than one value of <math>X</math>. Thus, it is desired that the mutual information between <math>X</math> and <math>Y</math> is maximized. | ||
+ | |||
+ | === Example === | ||
+ | Consider the following ternary-input ternary-output channel with input and output alphabet <math>\mathcal{X} = \mathcal{Y} = \{0, 1, 2\}</math> with the following conditional probability <math>p(Y|X)</math>: | ||
+ | |||
+ | {| class="wikitable" | ||
+ | |- | ||
+ | | | ||
+ | | <math>Y=0</math> | ||
+ | | <math>Y=1</math> | ||
+ | | <math>Y=2</math> | ||
+ | |- | ||
+ | | <math>X = 0 </math> | ||
+ | | 0.9 | ||
+ | | 0.1 | ||
+ | | 0 | ||
+ | |- | ||
+ | | <math>X = 1</math> | ||
+ | | 0.1 | ||
+ | | 0.9 | ||
+ | | 0 | ||
+ | |- | ||
+ | | <math>X = 2</math> | ||
+ | | 0 | ||
+ | | 0 | ||
+ | | 1 | ||
+ | |- | ||
+ | |} | ||
+ | |||
+ | Here, it is clear that <math>Y=0</math> may have been due to <math>X=0</math> or <math>X = 1</math>. Although we do not have any control over this channel (we cannnot change <math>p(Y|X)</math>), we do have control over <math>p(X)</math>, the input distribution. Our goal then is to select an input distribution (transmission strategy) so that the mutual information between the inputs and outputs is maximized. The maximum mutual information obtained in such a manner is defined as the channel capacity. For discrete memoryless channels, we define the channel capacity <math>C</math> as | ||
+ | |||
+ | <math>C = \max_{p(X)} I(X;Y) </math>. | ||
+ | |||
+ | In our example, our transmission strategy is to assign probabilities to each input symbol <math>x \in \{0, 1, 2\}</math> with complete knowledge of the channel <math>p(Y|X)</math>. (If we do not know <math>p(Y|X)</math>, we need to do some form of ''channel estimation'', which out of our current scope in CoE 161.) | ||
+ | |||
+ | Let us say that this is your first time to do capacity calculations. You see the definition and think that maybe you can guess the right input distribution. After some thought, you came up with three candidate solutions for <math>p_X</math>: | ||
+ | |||
+ | * Always transmit <math>X=2</math>. | ||
+ | * Transmit <math>X=0</math> or <math>X=1</math> with 50-50 probability. | ||
+ | * Transmit all symbols with equal probability, <math>p(x) = 1/3</math> for all <math>x</math>. | ||
+ | |||
+ | Checkpoint: Without calculating <math>I(X;Y)</math>, argue that the first two transmission schemes are obviously suboptimal. | ||
== Channel coding theorem == | == Channel coding theorem == |
Revision as of 11:14, 22 April 2021
Contents
Channel capacity
Motivation
Formally, a channel is defined as a stochastic system which takes an input random variable and an output random variable with a specified conditional probability distribution, . Ideally, in a data transmission or storage systems, channels are noiseless which means that is completely determined by , which implies that . However, most channels will exhibit some level of randomness and potentially one value may come from more than one value of . Thus, it is desired that the mutual information between and is maximized.
Example
Consider the following ternary-input ternary-output channel with input and output alphabet with the following conditional probability :
0.9 | 0.1 | 0 | |
0.1 | 0.9 | 0 | |
0 | 0 | 1 |
Here, it is clear that may have been due to or . Although we do not have any control over this channel (we cannnot change ), we do have control over , the input distribution. Our goal then is to select an input distribution (transmission strategy) so that the mutual information between the inputs and outputs is maximized. The maximum mutual information obtained in such a manner is defined as the channel capacity. For discrete memoryless channels, we define the channel capacity as
.
In our example, our transmission strategy is to assign probabilities to each input symbol with complete knowledge of the channel . (If we do not know , we need to do some form of channel estimation, which out of our current scope in CoE 161.)
Let us say that this is your first time to do capacity calculations. You see the definition and think that maybe you can guess the right input distribution. After some thought, you came up with three candidate solutions for :
- Always transmit .
- Transmit or with 50-50 probability.
- Transmit all symbols with equal probability, for all .
Checkpoint: Without calculating , argue that the first two transmission schemes are obviously suboptimal.