Difference between revisions of "Jointly typical sequences"

Revision as of 08:59, 20 April 2021

Definition

Let $(x^{n},y^{n})$ be an $n$ -sequence of pairs drawn from the set ${\mathcal {X}}\times {\mathcal {Y}}$ . For a fixed $\epsilon >0$ , $(x^{n},y^{n})$ is said to be jointly typical with respect to random variables $X$ and $Y$ if the three conditions below are all satisfied:

$|-{\frac {1}{n}}\log p(x^{n})-H(X)|<\epsilon$
$|-{\frac {1}{n}}\log p(y^{n})-H(Y)|<\epsilon$
$|-{\frac {1}{n}}\log p(x^{n},y^{n})-H(X,Y)|<\epsilon$

From the definition, we see that the notion of joint typicality is stronger than the previous formulation. If we only needed the sequence of pairs $(x_{i},y_{i})$ for $i=1,2,\cdots ,n$ , we could simply invoke the Asymptotic Equipartition Property with the random variable $X$ replaced by $(X,Y)$ . The first two properties further constrain the component sequences $(x_{1},x_{2},\cdots ,x_{n})$ and $(y_{1},y_{2},...,y_{n})$ beyond the earlier requirement of AEP. Altogether, these three properties guarantee us that sufficiently long sequences behave near their true joint and marginal distributions with very high probability.

Example: binary symmetric channel

For our example, we consider the sequence of input-output pairs $(x_{i},y_{i})$ of a binary symmetric channel (BSC). A BSC takes in inputs from the input alphabet ${\mathcal {X}}=\{0,1\}$ and produces outputs which also belong to the same alphabet ${\mathcal {Y}}=\{0,1\}$ . If we write the joint distribution as $p(X,Y)=p(X)p(Y|X)$ , we can generate the pairs by first generating a value $x_{i}$ using $p(X)$ , and then drawing a value $y_{i}$ using the conditional distribution $p(Y|X=x_{i})$ . For the BSC, $p(Y|X)=p_{e}$ if $Y\neq X$ and $p(Y|X)=1-p_{e}$ if $Y=X$ , where $0\leq p_{e}\leq 1/2$ is called the crossover probability.

In our calculation, the input $X$ is assumed to be distributed such that $P(X=1)=2/3$ , and the crossover probability is set to $p_{e}=1/10$ . The typicality threshold is set at $\epsilon =0.01$ . From $p(X)$ and $p(Y|X)$ , we obtain the following joint distribution:

	$Y=0$	$Y=1$
$X=0$	3/10	1/30
$X=1$	1/15	3/5

The table below shows the probability of producing a jointly typical sequence as the blocklength increases.

$n$	$\mathbb {P} \left(X^{n}\in A_{\epsilon }^{(n)}\right)$
50	0
100	0.003050
500	0.155243
1000	0.317690
5000	0.703099
10000	0.818271
50000	0.982649

To produce the values in the table, we enumerate the relatively frequencies of pairs that will result in a typical sequence for a fixed $n$ . In our example, you can think of this a "profile" consisting of four numbers, $(k_{00},k_{01},k_{10},k_{11})$ , where $k_{ij}$ is the number of $(i,j)$ pairs in the sequence for $i,j\in \{0,1\}$ . At $n=100$ , there is only one such profile: $(k_{00},k_{01},k_{10},k_{11})=(30,3,7,60)$ . Using the multinomial theorem, we can calculate the probability of this profile as

${\frac {100!}{30!3!7!60!}}(3/10)^{30}(1/30)^{3}(1/15)^{7}(3/5)^{60}\approx 0.003050$

For larger $n$ , the number of profiles increases and concentrates the total probability weight near the underlying joint distribution $p(X,Y)$ .

@@ Line 1: / Line 1: @@
-== Definition ==
-== Illustration ==
-== Example: Binary symmetric channel ==
+==Definition==
+Let <math>(x^n,y^n)</math> be an <math>n</math>-sequence of pairs drawn from the set <math>\mathcal{X} \times \mathcal{Y}</math>. For a fixed <math>\epsilon > 0</math>, <math>(x^n, y^n)</math> is said to be jointly typical with respect to random variables <math>X</math> and <math>Y</math> if the three conditions below are all satisfied:
+* <math>|-\frac{1}{n} \log p(x^n) - H(X)| < \epsilon </math>
+* <math>|-\frac{1}{n} \log p(y^n) - H(Y)| < \epsilon </math>
+* <math>|-\frac{1}{n} \log p(x^n,y^n) - H(X,Y)| < \epsilon </math>
+From the definition, we see that the notion of joint typicality is stronger than the previous formulation. If we only needed the sequence of pairs <math>(x_i, y_i)</math> for <math> i = 1, 2, \cdots, n</math>, we could simply invoke the Asymptotic Equipartition Property with the random variable <math>X</math> replaced by <math>(X,Y)</math>. The first two properties further constrain the component sequences <math>(x_1, x_2, \cdots, x_n)</math> and <math>(y_1, y_2, ..., y_n)</math> beyond the earlier requirement of AEP. Altogether, these three properties guarantee us that sufficiently long sequences behave near their true joint and marginal distributions with very high probability.
+== Example: binary symmetric channel ==
+For our example, we consider the sequence of input-output pairs <math>(x_i, y_i)</math> of a binary symmetric channel (BSC). A BSC takes in inputs from the input alphabet <math>\mathcal{X} = \{0,1\}</math> and produces outputs which also belong to the same alphabet <math>\mathcal{Y} = \{0, 1\}</math>. If we write the joint distribution as <math>p(X,Y) = p(X) p(Y|X)</math>, we can generate the pairs by first generating a value <math>x_i</math> using <math>p(X)</math>, and then drawing a value <math>y_i</math> using the conditional distribution <math>p(Y|X=x_i)</math>. For the BSC, <math>p(Y|X) = p_e</math> if <math>Y\ne X</math> and <math>p(Y|X) = 1 - p_e </math> if <math>Y=X</math>, where <math>0 \le p_e \le 1/2</math> is called the crossover probability.
+In our calculation, the input <math>X</math> is assumed to be distributed such that <math>P(X=1) = 2/3</math>, and the crossover probability is set to <math>p_e = 1/10</math>. The typicality threshold is set at <math>\epsilon = 0.01</math>. From <math>p(X)</math> and <math>p(Y|X)</math>, we obtain the following joint distribution:
+{| class="wikitable"
+|-
+! scope="col"|
+! scope="col"| <math>Y=0</math>
+! scope="col"| <math>Y=1</math>
+|-
+! scope="row"| <math>X=0</math>
+| 3/10
+| 1/30
+|-
+! scope="row"| <math>X=1</math>
+| 1/15
+| 3/5
+|-
+|}
+The table below shows the probability of producing a jointly typical sequence as the blocklength increases.
+{| class="wikitable"
+|-
+! scope="col"| <math>n</math>
+! scope="col"| <math>\mathbb{P}\left(X^n \in A_\epsilon^{(n)}\right)</math>
+|-
+| 50
+| 0
+|-
+| 100
+| 0.003050
+|-
+| 500
+| 0.155243
+|-
+| 1000
+| 0.317690
+|-
+| 5000
+| 0.703099
+|-
+| 10000
+| 0.818271
+|-
+| 50000
+| 0.982649
+|-
+|}
+To produce the values in the table, we enumerate the relatively frequencies of pairs that will result in a typical sequence for a fixed <math>n</math>. In our example, you can think of this a "profile" consisting of four numbers, <math>(k_{00}, k_{01}, k_{10}, k_{11})</math>, where <math>k_{ij}</math> is the number of <math>(i,j)</math> pairs in the sequence for <math>i,j\in\{0,1\}</math>. At <math>n=100</math>, there is only one such profile: <math>(k_{00}, k_{01}, k_{10}, k_{11}) = (30, 3, 7, 60)</math>. Using the multinomial theorem, we can calculate the probability of this profile as
+<math>\frac{100!}{30!3!7!60!}(3/10)^{30} (1/30)^3 (1/15)^7 (3/5)^{60} \approx 0.003050</math>
+For larger <math>n</math>, the number of profiles increases and concentrates the total probability weight near the underlying joint distribution <math>p(X,Y)</math>.

Difference between revisions of "Jointly typical sequences"

Revision as of 08:59, 20 April 2021

Definition

Example: binary symmetric channel

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

Tools