The Data Processing Inequality
Revision as of 11:23, 23 October 2020
Markovity
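A Markov chain is a random process describing a sequence of possible events in which the probability of each event depends only on the outcome of the previous event. Thus, we say that <math>X</math>, <math>Y</math>, and <math>Z</math> form a Markov chain in this order, denoted as:
{{NumBlk|::|<math>X \rightarrow Y \rightarrow Z</math>|{{EquationRef|1}}}}
If we can write:
{{NumBlk|::|<math>P\left(Z=z\mid Y=y, X=x\right) = P\left(Z=z\mid Y=y\right)</math>|{{EquationRef|2}}}}
Or in a more compact form:
{{NumBlk|::|<math>P\left(z\mid y, x\right) = P\left(z\mid y\right)</math>|{{EquationRef|3}}}}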
We can use Markov chains to model how a signal is corrupted when it passes through noisy channels. For example, if <math>X</math> is a binary signal, it can be changed with a certain probability into <math>Y</math>, and <math>Y</math> can in turn be corrupted to produce <math>Z</math>.
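The cascade of noisy channels can be sketched in a few lines of Python. This is only an illustration: the binary symmetric channel model, the function names `bsc` and `simulate_chain`, and the flip probabilities 0.1 and 0.2 are assumptions chosen for the example, not values taken from the text.

```python
import random

def bsc(bit, flip_prob, rng):
    """Pass one bit through a binary symmetric channel (assumed noise model)."""
    return bit ^ 1 if rng.random() < flip_prob else bit

def simulate_chain(n, p_x1=0.5, flip_xy=0.1, flip_yz=0.2, seed=0):
    """Draw n samples (x, y, z) from the chain X -> Y -> Z."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        x = 1 if rng.random() < p_x1 else 0
        y = bsc(x, flip_xy, rng)  # Y is produced from X alone
        z = bsc(y, flip_yz, rng)  # Z is produced from Y alone, never from X
        samples.append((x, y, z))
    return samples

samples = simulate_chain(100_000)
frac_y_flipped = sum(x != y for x, y, _ in samples) / len(samples)
frac_z_flipped = sum(y != z for _, y, z in samples) / len(samples)
print(frac_y_flipped, frac_z_flipped)
```

The Markov structure shows up in the code itself: the line that produces `z` reads only `y`, so any influence of `x` on `z` has to pass through `y`.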
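Consider the joint probability <math>P\left(x, y, z\right)</math>. By the chain rule, we can express this as:
{{NumBlk|::|<math>P\left(x, y, z\right) = P\left(z\mid y, x\right)\cdot P\left(y\mid x\right)\cdot P\left(x\right)</math>|{{EquationRef|4}}}}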
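And if <math>X \rightarrow Y \rightarrow Z</math>, we get:
{{NumBlk|::|<math>P\left(x, y, z\right) = P\left(z\mid y\right)\cdot P\left(y\mid x\right)\cdot P\left(x\right)</math>|{{EquationRef|5}}}}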
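Since <math>P\left(x, z\mid y\right) = \frac{P\left(x, y, z\right)}{P\left(y\right)}</math>, we can write:
{{NumBlk|::|<math>P\left(x, z\mid y\right) = \frac{P\left(z\mid y\right)\cdot P\left(y\mid x\right)\cdot P\left(x\right)}{P\left(y\right)} = P\left(z\mid y\right)\cdot P\left(x\mid y\right)</math>|{{EquationRef|6}}}}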
Thus, we can say that <math>X</math> and <math>Z</math> are conditionally independent given <math>Y</math>. If we think of <math>X</math> as some past event and <math>Z</math> as some future event, then the past and future events are independent if we know the present event <math>Y</math>. Note that this property is a good definition of Markovity, as well as a useful tool for checking it.
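The conditional independence of the two ends of the chain given the middle variable can be checked numerically. The sketch below builds an exact joint distribution from a Markov factorization and compares the conditional joint of the end variables with the product of their individual conditionals. The prior `p_x`, the flip probabilities, and the helper names `bsc_table` and `marginal` are hypothetical choices made only for this example.

```python
from itertools import product

bits = (0, 1)

def bsc_table(flip):
    """Transition table of a binary symmetric channel, keyed (input, output)."""
    return {(a, b): (flip if a != b else 1.0 - flip) for a, b in product(bits, repeat=2)}

p_x = {0: 0.7, 1: 0.3}        # assumed prior on X
p_y_given_x = bsc_table(0.1)  # assumed channel X -> Y
p_z_given_y = bsc_table(0.2)  # assumed channel Y -> Z

# Joint distribution built from the Markov factorization P(z|y) P(y|x) P(x).
joint = {
    (x, y, z): p_x[x] * p_y_given_x[(x, y)] * p_z_given_y[(y, z)]
    for x, y, z in product(bits, repeat=3)
}

def marginal(joint, keep):
    """Sum the joint over every coordinate not listed in keep."""
    out = {}
    for xyz, p in joint.items():
        key = tuple(xyz[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

p_y = marginal(joint, (1,))
p_xy = marginal(joint, (0, 1))
p_yz = marginal(joint, (1, 2))

# Largest difference between P(x, z | y) and P(x | y) * P(z | y).
max_gap = max(
    abs(joint[(x, y, z)] / p_y[(y,)]
        - (p_xy[(x, y)] / p_y[(y,)]) * (p_yz[(y, z)] / p_y[(y,)]))
    for x, y, z in product(bits, repeat=3)
)
print(max_gap)
```

A gap at the level of floating-point rounding indicates that the factorization holds for this particular joint. Running the same check on a joint that is not built from a Markov factorization would generally give a clearly nonzero gap.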
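We can rewrite the joint probability <math>P\left(x, y, z\right)</math> as:
{{NumBlk|::|<math>\begin{align}
P\left(x, y, z\right) & = P\left(z\mid y\right)\cdot P\left(y\mid x\right) \cdot P\left(x\right)\\
& = \frac{P\left(z, y\right)}{P\left(y\right)}\cdot P\left(y, x\right)\\
& = \frac{P\left(z, y\right)}{P\left(y\right)}\cdot P\left(x\mid y\right)\cdot P\left(y\right)\\
& = P\left(z, y\right)\cdot P\left(x\mid y\right)\\
& = P\left(x\mid y\right) \cdot P\left(y\mid z\right) \cdot P\left(z\right)
\end{align}</math>|{{EquationRef|7}}}}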
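The last line of this rewriting has the same form as a Markov factorization read in the opposite direction, so a chain that is Markov in one order is also Markov in the reverse order. A minimal numerical sketch of that equality follows; the asymmetric transition tables are made-up numbers used only for illustration.

```python
from itertools import product

bits = (0, 1)

# Hypothetical, deliberately asymmetric distributions for illustration.
p_x = {0: 0.6, 1: 0.4}
p_y_given_x = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.3, (1, 1): 0.7}    # keyed (x, y)
p_z_given_y = {(0, 0): 0.8, (0, 1): 0.2, (1, 0): 0.25, (1, 1): 0.75}  # keyed (y, z)

# Forward factorization: P(z|y) P(y|x) P(x).
forward = {
    (x, y, z): p_x[x] * p_y_given_x[(x, y)] * p_z_given_y[(y, z)]
    for x, y, z in product(bits, repeat=3)
}

# Marginals needed for the reverse conditionals.
p_y = {y: sum(forward[(x, y, z)] for x in bits for z in bits) for y in bits}
p_z = {z: sum(forward[(x, y, z)] for x in bits for y in bits) for z in bits}
p_xy = {(x, y): sum(forward[(x, y, z)] for z in bits) for x in bits for y in bits}
p_yz = {(y, z): sum(forward[(x, y, z)] for x in bits) for y in bits for z in bits}

# Reverse factorization: P(x|y) P(y|z) P(z).
backward = {
    (x, y, z): (p_xy[(x, y)] / p_y[y]) * (p_yz[(y, z)] / p_z[z]) * p_z[z]
    for x, y, z in product(bits, repeat=3)
}

max_gap = max(abs(forward[k] - backward[k]) for k in forward)
print(max_gap)
```

The two dictionaries should agree up to floating-point rounding, matching the first and last lines of the derivation.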