Difference between revisions of "The Data Processing Inequality"

Revision as of 11:20, 23 October 2020

Markovity

A Markov chain is a random process describing a sequence of possible events in which the probability of each event depends only on the outcome of the previous event. We say that $X,Y,Z$ form a Markov chain in this order, denoted as:

X\rightarrow Y\rightarrow Z

(1)

if the joint distribution can be written as:

P\left(X=x,Y=y,Z=z\right)=P\left(Z=z\mid Y=y\right)\cdot P\left(Y=y\mid X=x\right)\cdot P\left(X=x\right)

(2)

Or in a more compact form:

P\left(x,y,z\right)=P\left(z\mid y\right)\cdot P\left(y\mid x\right)\cdot P\left(x\right)

(3)

We can use Markov chains to model how a signal is corrupted when it passes through noisy channels. For example, if $X$ is a binary signal, it can be changed with a certain probability $p$ to produce $Y$, and $Y$ can in turn be corrupted to produce $Z$.
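The noisy-channel picture can be sketched numerically. The snippet below is a minimal illustration, not part of the original text: it assumes both stages are binary symmetric channels, and the helper names `bsc` and `markov_joint`, the uniform input distribution, and the flip probabilities 0.1 and 0.2 are all hypothetical choices made for the example. It builds the joint distribution from the factorization in (3) and computes the probability that $Z$ differs from $X$.

```python
from itertools import product


def bsc(p):
    """Transition table of a binary symmetric channel.

    Entry [a][b] is P(output = b | input = a); a bit is flipped with probability p.
    """
    return [[1 - p, p], [p, 1 - p]]


def markov_joint(p_x, p_y_given_x, p_z_given_y):
    """Joint P(x, y, z) = P(z|y) * P(y|x) * P(x) for binary X, Y, Z."""
    return {
        (x, y, z): p_z_given_y[y][z] * p_y_given_x[x][y] * p_x[x]
        for x, y, z in product((0, 1), repeat=3)
    }


# Assumed illustrative values: uniform input, flip probabilities 0.1 and 0.2.
p_x = [0.5, 0.5]
joint = markov_joint(p_x, bsc(0.1), bsc(0.2))

# A valid joint distribution sums to one.
total = sum(joint.values())

# Probability that the final output Z differs from the original signal X.
p_end_to_end_flip = sum(pr for (x, y, z), pr in joint.items() if x != z)

print(total, p_end_to_end_flip)
```

With these assumed values, $Z$ differs from $X$ exactly when one of the two stages flips the bit and the other does not, which has probability $0.1\cdot 0.8+0.9\cdot 0.2=0.26$.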

Consider the joint probability of $x$ and $z$ conditioned on $y$, $P\left(x,z\mid y\right)$. We can express this as:

P\left(x,z\mid y\right)={\frac {P\left(x,y,z\right)}{P\left(y\right)}}

(4)

If $X\rightarrow Y\rightarrow Z$, substituting the factorization (3) into the numerator gives:

P\left(x,z\mid y\right)={\frac {P\left(z\mid y\right)\cdot P\left(y\mid x\right)\cdot P\left(x\right)}{P\left(y\right)}}

(5)

Since $P\left(y,x\right)=P\left(y\mid x\right)\cdot P\left(x\right)=P\left(x\mid y\right)\cdot P\left(y\right)$, we can write:

P\left(x,z\mid y\right)={\frac {P\left(z\mid y\right)\cdot P\left(y,x\right)}{P\left(y\right)}}=P\left(z\mid y\right)\cdot P\left(x\mid y\right)

(6)

Thus, we can say that $X$ and $Z$ are conditionally independent given $Y$. If we think of $X$ as some past event and $Z$ as some future event, then the past and the future are independent once the present event $Y$ is known. Note that this property is a good definition of Markovity, as well as a useful tool for checking it.
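Conditional independence can be used as a numerical test for Markovity. The following is a minimal sketch under stated assumptions: the function name `is_markov_chain`, the binary alphabets, and all probability values are hypothetical and chosen only for illustration. It checks $P\left(x,z\mid y\right)=P\left(x\mid y\right)\cdot P\left(z\mid y\right)$ for every triple, first on a joint distribution built from the Markov factorization and then on a counterexample where $Z$ is the XOR of independent uniform bits $X$ and $Y$.

```python
from itertools import product


def is_markov_chain(joint, tol=1e-12):
    """Return True if P(x,z|y) = P(x|y) * P(z|y) for every y with P(y) > 0.

    `joint` maps binary triples (x, y, z) to probabilities.
    """
    for y in (0, 1):
        p_y = sum(joint[x, y, z] for x in (0, 1) for z in (0, 1))
        if p_y == 0:
            continue  # conditioning on a zero-probability event is undefined
        for x, z in product((0, 1), repeat=2):
            p_xz_given_y = joint[x, y, z] / p_y
            p_x_given_y = sum(joint[x, y, k] for k in (0, 1)) / p_y
            p_z_given_y = sum(joint[k, y, z] for k in (0, 1)) / p_y
            if abs(p_xz_given_y - p_x_given_y * p_z_given_y) > tol:
                return False
    return True


# Hypothetical tables: rows index the input symbol, columns the output symbol.
p_x = [0.3, 0.7]
p_y_given_x = [[0.9, 0.1], [0.2, 0.8]]
p_z_given_y = [[0.6, 0.4], [0.25, 0.75]]

# Joint built from the Markov factorization P(z|y) * P(y|x) * P(x).
chain_joint = {
    (x, y, z): p_z_given_y[y][z] * p_y_given_x[x][y] * p_x[x]
    for x, y, z in product((0, 1), repeat=3)
}

# Counterexample: X and Y are independent uniform bits and Z = X XOR Y,
# so knowing Y does not make X and Z independent.
xor_joint = {
    (x, y, z): 0.25 if z == (x ^ y) else 0.0
    for x, y, z in product((0, 1), repeat=3)
}

print(is_markov_chain(chain_joint), is_markov_chain(xor_joint))
```

The comparison uses a small tolerance rather than exact equality because the probabilities are floating-point products.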

We can rewrite the joint probability $P\left(x,y,z\right)$ as:

P\left(x,y,z\right)=P\left(z\mid y\right)\cdot P\left(y\mid x\right)\cdot P\left(x\right)={\frac {P\left(z,y\right)}{P\left(y\right)}}\cdot P\left(y,x\right)={\frac {P\left(z,y\right)}{P\left(y\right)}}\cdot P\left(x\mid y\right)\cdot P\left(y\right)=P\left(z,y\right)\cdot P\left(x\mid y\right)

(7)
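Identity (7) can be checked numerically by computing each factor on its right-hand side from the joint distribution alone. This is only a sketch: the transition tables and the variable names below are assumptions made for the example, not values from the text.

```python
from itertools import product

# Hypothetical binary chain X -> Y -> Z (values chosen only for illustration).
p_x = [0.4, 0.6]
p_y_given_x = [[0.7, 0.3], [0.1, 0.9]]
p_z_given_y = [[0.8, 0.2], [0.5, 0.5]]

bits = (0, 1)

# Left-hand side of (7): the joint built from the Markov factorization.
joint = {
    (x, y, z): p_z_given_y[y][z] * p_y_given_x[x][y] * p_x[x]
    for x, y, z in product(bits, repeat=3)
}

# Marginals needed for the right-hand side, all derived from the joint.
p_zy = {(z, y): sum(joint[x, y, z] for x in bits) for z, y in product(bits, repeat=2)}
p_xy = {(x, y): sum(joint[x, y, z] for z in bits) for x, y in product(bits, repeat=2)}
p_y = {y: sum(p_xy[x, y] for x in bits) for y in bits}

# Largest gap between P(x, y, z) and P(z, y) * P(x | y) over all triples.
max_gap = max(
    abs(joint[x, y, z] - p_zy[z, y] * (p_xy[x, y] / p_y[y]))
    for x, y, z in product(bits, repeat=3)
)

print(max_gap)
```

A gap at the level of floating-point rounding indicates that the two sides agree for this chain.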

@@ Line 30: / Line 30: @@
 We can rewrite the joint probability <math>P\left(x, y, z\right)</math> as:
-{{NumBlk|::|<math>P\left(x, y, z\right) = P\left(z\mid y\right)\cdot P\left(y\mid x\right) \cdot P\left(x\right)=\frac{P\left(z, y\right)}{P\left(y\right)}\cdot P\left(y, x\right)=\frac{P\left(z, y\right)}{P\left(y\right)}\cdot P\left(x\mid\right)\cdot P\left(y\right)</math>|{{EquationRef|7}}}}
+{{NumBlk|::|<math>P\left(x, y, z\right) = P\left(z\mid y\right)\cdot P\left(y\mid x\right) \cdot P\left(x\right)=\frac{P\left(z, y\right)}{P\left(y\right)}\cdot P\left(y, x\right)=\frac{P\left(z, y\right)}{P\left(y\right)}\cdot P\left(x\mid y\right)\cdot P\left(y\right)=P\left(z, y\right)\cdot P\left(x\mid y\right)</math>|{{EquationRef|7}}}}
 == The Data Processing Inequality ==
