Definitions
Entropy
- a measure of the uncertainty of a random variable
- equivalently, a measure of the amount of information required on average to describe the random variable
Relative Entropy
- a measure of the distance between two distributions
- a measure of the inefficiency of assuming that the distribution is $q$ when the true distribution is $p$
Mutual Information
- a measure of the amount of information that one random variable contains about another random variable
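For reference, the standard discrete formulas behind these three definitions, written in the same notation used in the rest of these notes (here $q$ is the assumed distribution and $p\left(x,y\right)$ the joint pmf; only the entropy forms are derived formally below, so treat the other two as a sketch of the usual definitions):
- $H\left(X\right)=-\sum_{x\in\mathcal{X}}p\left(x\right)\log_{2}p\left(x\right)$
- $D\left(p\,\|\,q\right)=\sum_{x\in\mathcal{X}}p\left(x\right)\log_{2}\frac{p\left(x\right)}{q\left(x\right)}$
- $I\left(X;Y\right)=\sum_{x\in\mathcal{X}}\sum_{y\in\mathcal{Y}}p\left(x,y\right)\log_{2}\frac{p\left(x,y\right)}{p\left(x\right)p\left(y\right)}$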
Entropy
Definitions:
- Shannon entropy
- a measure of the uncertainty of a random variable
- equivalently, a measure of the amount of information required on average to describe the random variable
Desired Properties[1]
- Uniform distributions have maximum uncertainty.
- Uncertainty is additive for independent events.
- Adding an outcome with zero probability has no effect.
- The measure of uncertainty is continuous in all its arguments.
- Uniform distributions with more outcomes have more uncertainty.
- Events have non-negative uncertainty.
- Events with a certain outcome have zero uncertainty.
- Flipping (reordering) the arguments has no effect, i.e. the measure is symmetric in the outcome probabilities.
Formulation
The entropy of a discrete random variable, $X$, is
- $H\left(X\right)=-\sum_{x\in\mathcal{X}}p\left(x\right)\log_{2}p\left(x\right)$ (1)
where $X$ has a probability mass function (pmf), $p\left(x\right)$, and an alphabet $\mathcal{X}$.
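A minimal Python sketch of Eq. (1); the function name `entropy` and the example distributions are illustrative and not part of the original notes:

```python
import math

def entropy(pmf, base=2):
    """Shannon entropy H(X) of a discrete pmf per Eq. (1); zero-probability outcomes contribute nothing."""
    return -sum(p * math.log(p, base) for p in pmf if p > 0)

# Illustrative pmfs (assumed for the example)
print(entropy([0.9, 0.1]))   # ~0.47 bits: a heavily biased coin carries little uncertainty
print(entropy([0.5, 0.5]))   # 1.0 bit: the uniform two-outcome case has maximum uncertainty
```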
Expected Value
For a discrete random variable, $X$, with probability mass function, $p\left(x\right)$, the expected value of $X$ is
- $E\left[X\right]=\sum_{x\in\mathcal{X}}x\cdot p\left(x\right)$ (2)
For a discrete random variable, $X$, with probability mass function, $p\left(x\right)$, the expected value of $g\left(X\right)$ is
- $E\left[g\left(X\right)\right]=\sum_{x\in\mathcal{X}}g\left(x\right)\cdot p\left(x\right)$ (3)
Consider the case where $g\left(X\right)=\log_{2}\left(\tfrac{1}{p\left(X\right)}\right)$. We get
- $E\left[\log_{2}\left(\tfrac{1}{p\left(x\right)}\right)\right]=\sum_{x\in{\mathcal{X}}}\log_{2}\left(\tfrac{1}{p\left(x\right)}\right)\cdot p\left(x\right)=-\sum_{x\in\mathcal{X}}p\left(x\right)\cdot\log_{2}p\left(x\right)=H\left(X\right)$ (4)
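A quick numerical check of Eq. (4) in Python, using an illustrative three-outcome pmf (the outcome labels and probabilities are assumptions for the example):

```python
import math

pmf = {"a": 0.5, "b": 0.25, "c": 0.25}   # illustrative pmf, not from the notes

# E[log2(1/p(X))] computed via Eq. (3) with g(x) = log2(1/p(x))
expected_surprisal = sum(p * math.log2(1 / p) for p in pmf.values())

# H(X) computed directly from Eq. (1)
entropy = -sum(p * math.log2(p) for p in pmf.values())

print(expected_surprisal, entropy)   # both 1.5 bits, as Eq. (4) states
```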
Lemma 1: Entropy is greater than or equal to zero
- $H\left(X\right)\geq 0$ (5)
Proof: Since $0\leq p\left(x\right)\leq 1$, then $\tfrac{1}{p\left(x\right)}\geq 1$, and subsequently, $\log_{2}\left(\tfrac{1}{p\left(x\right)}\right)\geq 0$. Thus from Eq. (4) we get $H\left(X\right)\geq 0$.
Lemma 2: Changing the logarithm base
- $H_{b}\left(X\right)=\left(\log_{b}a\right)H_{a}\left(X\right)$ (6)
Proof:
- Given that $\log_{b}p=\log_{b}a\cdot\log_{a}p$
- And since $H_{a}\left(X\right)=-\sum_{x\in\mathcal{X}}p\left(x\right)\log_{a}p\left(x\right)$
- We get $H_{b}\left(X\right)=-\sum_{x\in\mathcal{X}}p\left(x\right)\log_{b}p\left(x\right)=-\sum_{x\in\mathcal{X}}p\left(x\right)\log_{b}a\cdot\log_{a}p\left(x\right)=\left(\log_{b}a\right)H_{a}\left(X\right)$
Note that the entropy, $H\left(X\right)$, has units of bits for $\log_{2}$, or nats (natural units) for $\log_{e}$, or dits (decimal digits) for $\log_{10}$.
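A short Python sketch of Lemma 2 and the units note, assuming an illustrative two-outcome pmf:

```python
import math

pmf = [0.9, 0.1]   # illustrative distribution, not from the notes

def entropy(pmf, base):
    """Entropy of a pmf computed with the given logarithm base (Eq. (1) with log base b)."""
    return -sum(p * math.log(p, base) for p in pmf if p > 0)

h_bits = entropy(pmf, 2)          # bits
h_nats = entropy(pmf, math.e)     # nats
h_dits = entropy(pmf, 10)         # dits

# Lemma 2: H_b(X) = (log_b a) * H_a(X), e.g. converting nats back to bits
print(h_bits, math.log(math.e, 2) * h_nats)   # both ~0.469
```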
Joint Entropy
Definition:
- a measure of the uncertainty associated with a set of variables
The joint entropy, $H\left(X,Y\right)$, of a pair of discrete random variables, $\left(X,Y\right)$, with joint pmf, $p\left(x,y\right)$, is defined as
- $H\left(X,Y\right)=-\sum_{x\in\mathcal{X}}\sum_{y\in\mathcal{Y}}p\left(x,y\right)\log_{2}p\left(x,y\right)$ (7)
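A Python sketch of Eq. (7); the marginals below are illustrative, and X and Y are taken to be independent so the result can also be checked against the additivity property listed earlier, $H\left(X,Y\right)=H\left(X\right)+H\left(Y\right)$:

```python
import math

def joint_entropy(joint_pmf):
    """Joint entropy H(X, Y) per Eq. (7); joint_pmf maps (x, y) pairs to probabilities."""
    return -sum(p * math.log2(p) for p in joint_pmf.values() if p > 0)

# Illustrative marginal pmfs (assumed for the example)
p_x = {0: 0.5, 1: 0.5}
p_y = {"a": 0.9, "b": 0.1}

# For independent X and Y, p(x, y) = p(x) * p(y)
joint = {(x, y): p_x[x] * p_y[y] for x in p_x for y in p_y}

h_x = -sum(p * math.log2(p) for p in p_x.values())
h_y = -sum(p * math.log2(p) for p in p_y.values())

# Additivity for independent events: H(X, Y) = H(X) + H(Y)
print(joint_entropy(joint), h_x + h_y)   # both ~1.469 bits
```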
References