Jointly typical sequences

In this article, we discuss an extension of typical sets to jointly distributed random variables.

Definition

Let $(x^{n},y^{n})$ be an $n$ -sequence of pairs drawn from the set ${\mathcal {X}}\times {\mathcal {Y}}$ . For a fixed $\epsilon >0$ , $(x^{n},y^{n})$ is said to be jointly typical with respect to random variables $X$ and $Y$ if the three conditions below are all satisfied:

$|-{\frac {1}{n}}\log p(x^{n})-H(X)|<\epsilon$
$|-{\frac {1}{n}}\log p(y^{n})-H(Y)|<\epsilon$
$|-{\frac {1}{n}}\log p(x^{n},y^{n})-H(X,Y)|<\epsilon$

The definition shows that joint typicality is a stronger requirement than typicality of the pair sequence alone. If we only required the third condition, on the sequence of pairs $(x_{i},y_{i})$ for $i=1,2,\cdots ,n$ , we could simply invoke the Asymptotic Equipartition Property (AEP) with the random variable $X$ replaced by $(X,Y)$ . The first two conditions additionally constrain the component sequences $(x_{1},x_{2},\cdots ,x_{n})$ and $(y_{1},y_{2},\cdots ,y_{n})$ . Together, the three conditions ensure that, for pairs drawn i.i.d. from $p(x,y)$ , sufficiently long sequences have empirical entropies close to the true joint and marginal entropies with high probability.
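The three conditions can be checked mechanically for a given pair of sequences. The following is a minimal sketch, assuming the pairs are i.i.d. with a known joint pmf and that logarithms are taken in base 2 (the definition above does not fix the base); the function name `is_jointly_typical` and its arguments are hypothetical.

```python
import math
from collections import Counter


def is_jointly_typical(xs, ys, joint, eps):
    """Return True if (xs, ys) satisfies all three typicality conditions.

    `joint` maps (x, y) -> probability. Base-2 logarithms are an assumption.
    """
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("sequences must be non-empty and equally long")

    # Marginals p(x) and p(y), obtained by summing the joint pmf.
    px, py = Counter(), Counter()
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p

    def entropy(pmf):
        return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

    def sample_entropy(seq, pmf):
        # -(1/n) log p(seq) for an i.i.d. sequence;
        # infinite if some symbol has zero probability.
        total = 0.0
        for s in seq:
            p = pmf.get(s, 0.0)
            if p <= 0:
                return math.inf
            total -= math.log2(p)
        return total / n

    checks = [
        (sample_entropy(xs, px), entropy(px)),                     # condition on x^n
        (sample_entropy(ys, py), entropy(py)),                     # condition on y^n
        (sample_entropy(list(zip(xs, ys)), joint), entropy(joint)),  # condition on pairs
    ]
    return all(abs(emp - h) < eps for emp, h in checks)
```

Because the sequence probabilities factor over symbols, the check depends only on how often each symbol and each pair occurs, not on their order.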

Example: binary symmetric channel

For our example, we consider the sequence of input-output pairs $(x_{i},y_{i})$ of a binary symmetric channel (BSC). A BSC takes inputs from the alphabet ${\mathcal {X}}=\{0,1\}$ and produces outputs in the same alphabet, ${\mathcal {Y}}=\{0,1\}$ . If we write the joint distribution as $p(X,Y)=p(X)p(Y|X)$ , we can generate each pair by first drawing a value $x_{i}$ from $p(X)$ and then drawing a value $y_{i}$ from the conditional distribution $p(Y|X=x_{i})$ . For the BSC, $p(Y|X)=p_{e}$ if $Y\neq X$ and $p(Y|X)=1-p_{e}$ if $Y=X$ , where $0\leq p_{e}\leq 1/2$ is called the crossover probability.
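The two-step generation described above can be sketched as follows. This is an illustrative simulation only; the function name `sample_bsc_pairs` and its parameters (`p_x1` for $P(X=1)$ , `p_e` for the crossover probability, `seed` for reproducibility) are assumptions made for the example.

```python
import random


def sample_bsc_pairs(n, p_x1, p_e, seed=None):
    """Draw n input-output pairs (x_i, y_i) of a binary symmetric channel."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        # Step 1: draw the input x_i from p(X).
        x = 1 if rng.random() < p_x1 else 0
        # Step 2: draw y_i from p(Y | X = x_i); the bit flips with probability p_e.
        flip = rng.random() < p_e
        y = 1 - x if flip else x
        pairs.append((x, y))
    return pairs
```

For instance, `sample_bsc_pairs(1000, 2/3, 0.1, seed=0)` would draw 1000 pairs with $P(X=1)=2/3$ and $p_{e}=1/10$ .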

In our calculation, the input $X$ is assumed to be distributed such that $P(X=1)=2/3$ , and the crossover probability is set to $p_{e}=1/10$ . The typicality threshold is set at $\epsilon =0.01$ . From $p(X)$ and $p(Y|X)$ , we obtain the following joint distribution:

	$Y=0$	$Y=1$
$X=0$	3/10	1/30
$X=1$	1/15	3/5
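The entries of this table follow from $p(X,Y)=p(X)p(Y|X)$ . A short sketch using exact rational arithmetic is given below; the helper names `bsc_joint` and `entropy` are hypothetical, and the base-2 logarithm in `entropy` is an assumption.

```python
import math
from fractions import Fraction

# Input distribution and crossover probability from the example.
p_x = {0: Fraction(1, 3), 1: Fraction(2, 3)}
p_e = Fraction(1, 10)


def bsc_joint(p_x, p_e):
    """Joint pmf p(x, y) = p(x) p(y | x) for a binary symmetric channel."""
    joint = {}
    for x, px in p_x.items():
        for y in (0, 1):
            joint[(x, y)] = px * (p_e if y != x else 1 - p_e)
    return joint


def entropy(pmf):
    """Entropy in bits (base 2 is an assumption) of a pmf given as a dict."""
    return -sum(float(p) * math.log2(float(p)) for p in pmf.values() if p > 0)


joint = bsc_joint(p_x, p_e)

# Marginal of Y, obtained by summing the joint pmf over x.
p_y = {y: sum(joint[(x, y)] for x in p_x) for y in (0, 1)}

for (x, y), p in sorted(joint.items()):
    print(f"p(X={x}, Y={y}) = {p}")
```

The same `entropy` helper can be applied to `p_x`, `p_y`, and `joint` to obtain the quantities $H(X)$ , $H(Y)$ , and $H(X,Y)$ that appear in the three typicality conditions.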

The table below shows the probability of producing a jointly typical sequence as the blocklength increases.

$n$	$\mathbb {P} \left((X^{n},Y^{n})\in A_{\epsilon }^{(n)}\right)$
50	0
100	0.003050
500	0.155243
1000	0.317690
5000	0.703099
10000	0.818271
50000	0.982649

To produce the values in the table, we enumerate the relative frequencies of pairs that result in a typical sequence for a fixed $n$ . In our example, each such set of frequencies can be thought of as a "profile" consisting of four numbers, $(k_{00},k_{01},k_{10},k_{11})$ , where $k_{ij}$ is the number of $(i,j)$ pairs in the sequence for $i,j\in \{0,1\}$ . At $n=100$ , there is only one such profile: $(k_{00},k_{01},k_{10},k_{11})=(30,3,7,60)$ . Using the multinomial distribution, we can calculate the probability of this profile as