Channeling your inner capacity

Channel Basics

When Shannon developed his theory of information, it was in the context of improving communication. His goal was to determine if there was a way to maximize the transmission rate over some noisy channel. In this module, we will learn more about the channel and some formal terminology that we will use in the succeeding modules. The GIF below recalls the story of Bob and Alice. Bob wants to send a message to Alice across some channel. Let's simplify the scenario. Let's say Bob sends a love letter to Alice through some wireless (satellite-based) channel.

Figure 1: Simplified communication model

Figure 1 shows a simplified model of their communication. Bob is the source. Sometimes we use the term sender or transmitter for the source. The medium that the message goes through is the channel (in this case, the wireless channel). Alice is the receiver who, hopefully, gets the correct message at the other end of the channel. Let's take a look at each component formally.

The Source

The source is the component that produces messages or objects sent over a channel. We represent the source as a random variable $S$ which contains the outcomes $\{s_{1},s_{2},s_{3},...,s_{n}\}$ . Each outcome has its own probability, and together these form the probability distribution $\{P(s_{1}),P(s_{2}),P(s_{3}),...,P(s_{n})\}$ . The random variable $S$ is also called the source alphabet and the outcomes are the symbols of the source alphabet. A sequence of symbols forms the message that travels through the channel. Consider the following examples:

We can let the source alphabet $S$ be the English alphabet where the symbols are all the letters from a to z plus the space. Suppose we want to send the message "the quick brown fox jumps over the lazy dog". Every letter of the English alphabet has a probability associated with it. We have seen this in our programming exercise in module 2, where we extracted the probability distributions of N-block combinations for the English, German, French, and Tagalog languages.

We can let $S$ be the binary alphabet where the symbols are $\{1,0\}$ . The message could be streams of 1s and 0s that could mean something. For example, a message could be a sequence of events when a light is on or off. It could also be an indicator for sending decimal values in the form of binary digits. Finally, it could represent binary pixels of a fixed-size image.

In biology, we can let $S$ be the DNA bases whose symbols are $\{A,C,G,T\}$ . $A$ is adenine, $C$ is cytosine, $G$ is guanine, and $T$ is thymine. These symbols combine to form DNA sequences that are messages to instruct a cell to do certain types of protein synthesis.

Make sure to understand carefully what source alphabets, symbols, and messages mean.
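To make the three terms concrete, here is a minimal sketch of a source in Python. The alphabet uses the DNA bases from the example above, but the equal probabilities, the function name `emit_message`, and the choice to draw symbols independently of each other are assumptions made only for illustration.

```python
import random

# Hypothetical source alphabet: DNA bases with made-up, equal probabilities.
source_alphabet = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def emit_message(alphabet, length, rng):
    """Draw `length` symbols, each one independently, according to P(s_i)."""
    symbols = list(alphabet.keys())
    weights = list(alphabet.values())
    return "".join(rng.choices(symbols, weights=weights, k=length))


rng = random.Random(0)  # fixed seed so the run is repeatable
message = emit_message(source_alphabet, 12, rng)
print(message)  # a 12-symbol message built only from A, C, G, T
```

Here `source_alphabet` plays the role of $S$ , its keys are the symbols, and the returned string is one message.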

The Channel

The channel is the medium through which the source message travels until it gets to the receiver. In our Bob and Alice example, they used a wireless channel. The channel has a maximum capacity measured as the number of symbols that can be transmitted per second. We call this the channel capacity. We will discuss this later. Most of the time, we associate channels with noise. A noiseless channel is one where the information of the message gets to the receiver "perfectly". That means whatever the source sends gets to the receiver without any glitches or any manipulation of the message's symbols. A noisy channel flips existing symbols of a message or adds unnecessary (or new) symbols that are not originally part of the source alphabet. The noise disrupts the message, thereby affecting the information that the receiver retrieves. We model the chance of flipping symbols with conditional probabilities. For example, suppose we received a 1 at the receiver's end but, from a bird's-eye view, the source actually sent a 0. We can model this noise with the probability of receiving a 1 given that a 0 was sent (i.e., $P(r=1|s=0)=\epsilon$ ).

Let's take a few examples:

In the Bob and Alice story, the wireless channel can disrupt Bob's message. Suppose one of the messages Bob sends is "love can be as sweet as smelling the roses". Unfortunately, noise can either corrupt symbols or add unwanted symbols such that the message becomes "love cant bed as sweat as smelting the noses". This is a disaster if it gets to Alice. 😱

Another example is memory. Suppose you're on Mars and you need to record audio logs and videos on a solid-state drive (SSD). Memory can be corrupted by cosmic-ray bit flips. These bit flips occur when cosmic energy zaps some bits of memory. Say some data $1010$ gets flipped into $0010$ . The two representations mean totally different things.

The last example is social media. Good people (source) would like to spread factual news (message) that is worth broadcasting to ordinary citizens (receivers). Unfortunately, this does not stop evil citizens from creating fake news (noise). Since factual news and fake news mix together in social media, fake news disrupts the messages intended for ordinary citizens. Don't you hate it when this happens? 😩 Innocent people fall into this trap.

Of course, if the channels were noiseless, the examples above wouldn't have any issues with the received messages. We won't be dealing with the physics of channels in this course.
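The conditional-probability view of noise can be sketched in a few lines of Python. This is only an illustration under two assumptions that go beyond the text so far: both kinds of flip (0 to 1 and 1 to 0) happen with the same probability $\epsilon$ , and each bit is flipped independently of the others. The function name `noisy_binary_channel` is hypothetical.

```python
import random


def noisy_binary_channel(bits, epsilon, rng):
    """Flip each bit independently with probability epsilon (assumed symmetric)."""
    return [b ^ 1 if rng.random() < epsilon else b for b in bits]


rng = random.Random(42)  # fixed seed so the run is repeatable
sent = [1, 0, 1, 0]      # the data from the memory example

noiseless = noisy_binary_channel(sent, 0.0, rng)    # epsilon = 0: nothing flips
always_flip = noisy_binary_channel(sent, 1.0, rng)  # epsilon = 1: every bit flips
noisy = noisy_binary_channel(sent, 0.1, rng)        # a few bits may flip

print(sent, noiseless, always_flip, noisy)
```

Setting `epsilon` to 0 reproduces the noiseless channel, which is a quick sanity check on the model.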

The Receiver

The receiver, obviously, receives and accepts the message at the other end of the channel. Just like the source, the receiver is a random variable $R$ that has the outcomes $\{r_{1},r_{2},r_{3},...,r_{m}\}$ associated with the probability distribution $\{P(r_{1}),P(r_{2}),P(r_{3}),...,P(r_{m})\}$ . We can also call $R$ the receiver alphabet, with symbols $\{r_{1},r_{2},r_{3},...,r_{m}\}$ . Observe that we purposely set the maximum number to $m$ because the receiver may receive more symbols than the source alphabet contains (i.e., $n\leq m$ ) when the channel is noisy. A noiseless channel produces $R=S$ where all outcomes $\{r_{1},r_{2},r_{3},...,r_{n}\}=\{s_{1},s_{2},s_{3},...,s_{n}\}$ and the probability distributions $\{P(r_{1}),P(r_{2}),P(r_{3}),...,P(r_{n})\}=\{P(s_{1}),P(s_{2}),P(s_{3}),...,P(s_{n})\}$ . If the channel is noisy then $R\neq S$ and either the outcomes or the probability distributions are not equal. Take note that it's possible to have the exact same outcomes but different probability distributions. Let's take a look at some examples:

In the Bob and Alice example, Bob sent "love can be as sweet as smelling the roses" but Alice received "love cant bed as sweat as smelting the noses". This shows that some symbols flip and some symbols magically appear. Here, we can show that $n=m$ but the probability distribution of $R$ differs from that of $S$ . Check out the tabulated data below. We'll leave it up to you to work out how we got these numbers. It should be obvious.

Outcomes	$S$	$R$
a	0.071	0.091
b	0.024	0.023
c	0.024	0.023
d	0	0.023
e	0.167	0.136
g	0.024	0.023
h	0.024	0.023
i	0.024	0.023
l	0.071	0.045
m	0.024	0.023
n	0.048	0.068
o	0.048	0.045
r	0.024	0
s	0.143	0.136
t	0.048	0.091
v	0.024	0.023
w	0.024	0.023
space	0.190	0.182
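One way to obtain a table like this is to count how often each character occurs in each message and divide by the message length. The sketch below assumes that is how the numbers were produced; the function name `symbol_distribution` is ours, and the printed values should agree with the table up to rounding.

```python
from collections import Counter


def symbol_distribution(message):
    """Relative frequency of each character in the message."""
    counts = Counter(message)
    total = len(message)
    return {symbol: count / total for symbol, count in counts.items()}


sent = "love can be as sweet as smelling the roses"
received = "love cant bed as sweat as smelting the noses"

p_s = symbol_distribution(sent)
p_r = symbol_distribution(received)

# One row per symbol seen in either message; a missing symbol has probability 0.
for symbol in sorted(set(p_s) | set(p_r)):
    label = "space" if symbol == " " else symbol
    print(f"{label}\t{p_s.get(symbol, 0):.3f}\t{p_r.get(symbol, 0):.3f}")
```

Note that the two messages have different lengths, so even a symbol that is untouched by noise can end up with a slightly different probability at the receiver.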

For binary channels, it is possible that the receiver alphabet is the same as the source alphabet, $R=S$ , where $\{0,1\}$ are the symbols. Suppose the source sends a binary image where each pixel is either a 1 or a 0 with $P(s=1)=p$ and $P(s=0)=1-p$ . If the channel is noiseless then $P(r=1)=p$ and $P(r=0)=1-p$ and the noise conditional probabilities are $P(r=0|s=1)=P(r=1|s=0)=0$ . However, if the channel is noisy then the noise probabilities take effect: $P(r=0|s=1)=P(r=1|s=0)=\epsilon$ , resulting in $P(r=1)=\epsilon +p-2\epsilon p$ and $P(r=0)=1-\epsilon -p+2\epsilon p$ . We will discuss this later. Deriving these results is part of your theoretical exercise 😁.

Lastly, the race to select a champion to represent the Philippines will be held this May 2022. Suppose our good citizens (source) have done their part by participating in the elections. Once the ballots are cast, they move to the elections office (channel) for counting. Unfortunately, some evil politician cannot stand losing. The evil politician bribes (noise) the office to pump up their own vote count. When the counting finishes, the results turn out to be skewed from what the Filipino nation really chose (receiver). This can be an interesting topic to look into. Can we determine how much noise got into the system?

In summary, the three basic components are the source, channel, and receiver. The source is characterized by some random variable $S$ which is also the source alphabet containing the symbols $\{s_{1},s_{2},s_{3},...,s_{n}\}$ with a probability associated with each outcome. A sequence of symbols creates a message that travels along the channel. The channel is the medium that the message goes through, and noise can corrupt the message along the way. Some channels are noiseless, so the message is never corrupted, while other channels are noisy, so some symbols of the source message can be altered or some new symbols can be added to the source message. The receiver retrieves the message at the end of the channel. The receiver is characterized by some random variable $R$ which is also the receiver alphabet containing the symbols $\{r_{1},r_{2},r_{3},...,r_{m}\}$ associated with some probability distribution. It is possible for the receiver to receive more symbols than the source (i.e., $n\leq m$ ). If the channel is noiseless then the receiver will obtain the exact outcomes and distributions, $R=S$ ; otherwise, either the outcomes or the probability distributions won't be the same: $R\neq S$ .

Binary Symmetric Channels

We will focus on binary channels since almost all computer systems use binary representations. One of the most common binary channels is the Binary Symmetric Channel (BSC). Figure 2 shows how we draw BSC trees.

Figure 2: Binary symmetric channel (BSC) tree

The source alphabet of a BSC is $S=\{0,1\}$ with a probability distribution that consists of the probability of sending a 1, $P(s=1)=p$ , and the probability of sending a 0, $P(s=0)=1-p$ . The blue arrows model the channel probabilities. The probabilities of receiving the incorrect value when a signal is sent are $P(r=0|s=1)=P(r=1|s=0)=\epsilon$ . In other words, this is the probability of receiving a 1 when a 0 was sent, or of receiving a 0 when a 1 was sent. Of course, the probabilities of receiving the correct value when a signal is sent are $P(r=0|s=0)=P(r=1|s=1)=1-\epsilon$ . Finally, the probabilities of the received data are $P(r=1)=p+\epsilon -2\epsilon p$ and $P(r=0)=1-p-\epsilon +2\epsilon p$ . Let's tabulate these probabilities:

Component	Probability
$P(s=1)$	$p$
$P(s=0)$	$1-p$
$P(r=0|s=1)$	$\epsilon$
$P(r=1|s=0)$	$\epsilon$
$P(r=0|s=0)$	$1-\epsilon$
$P(r=1|s=1)$	$1-\epsilon$
$P(r=1)$	$p+\epsilon -2\epsilon p$
$P(r=0)$	$1-p-\epsilon +2\epsilon p$

Calculating $P(r=1)$ is easy. It is $P(r=1)=P(r=1|s=1)P(s=1)+P(r=1|s=0)P(s=0)$ from our probability review. You can do the same for $P(r=0)$ . We'll leave the derivation for you to practice 😊.
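If you want to check your derivation numerically, the sketch below evaluates the total-probability expression and compares it with the closed form from the table. The function names and the sample values of $p$ and $\epsilon$ are made up for illustration.

```python
def p_r1_total_probability(p, eps):
    """P(r=1) = P(r=1|s=1) P(s=1) + P(r=1|s=0) P(s=0) for a BSC."""
    return (1 - eps) * p + eps * (1 - p)


def p_r1_closed_form(p, eps):
    """Closed form from the table: p + eps - 2*eps*p."""
    return p + eps - 2 * eps * p


# Hypothetical (p, eps) pairs; any values in [0, 1] should agree.
for p, eps in [(0.5, 0.1), (0.2, 0.05), (0.9, 0.3)]:
    total = p_r1_total_probability(p, eps)
    closed = p_r1_closed_form(p, eps)
    assert abs(total - closed) < 1e-12
    print(f"p={p}, eps={eps}: P(r=1)={total:.4f}, P(r=0)={1 - total:.4f}")
```

A numerical match is not a proof, but a mismatch would tell you right away that a term was dropped or a sign was flipped.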

We are interested in calculating $H(S)$ , $H(R)$ , $H(R|S)$ , $I(R,S)$ , and $H(R,S)$ . For now, we'll be ignoring $H(S|R)$ because we are much more interested about the information that appears at the receiver.

Calculating source entropy $H(S)$

Since the input is a simple binary source, we use the Bernoulli entropy $H_{b}(p)=-p\log _{2}(p)-(1-p)\log _{2}(1-p)$ . Therefore:

$H(S)=H_{b}(p)=-p\log _{2}(p)-(1-p)\log _{2}(1-p)$ (1)

Sometimes we prefer to combine terms so that equation 1 is simplified into:

$H(S)=p\left[\log _{2}\left({\frac {1-p}{p}}\right)\right]-\log _{2}(1-p)$

This reduces the number of terms and provides a more elegant equation; however, equation 1 is much preferred if we are to program it. The simplified version has a problem when $p=0$ , which makes the fraction undefined (or infinite). Because of this, we'll stick with the form of equation 1 so that it's easier to program. Whenever a term would require $\log _{2}(0)$ , we can have our program return 0 for that term. Recall our "fixed" definition of information. One important, if obvious, observation is that the source entropy is completely independent of noise. It depends only on the probability distribution of our source alphabet.
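A minimal sketch of how the Bernoulli entropy $-p\log _{2}(p)-(1-p)\log _{2}(1-p)$ might be programmed is shown below. The function name `binary_entropy` and the input check are our own choices; the convention that a zero-probability term contributes 0 is the "fixed" definition mentioned above.

```python
import math


def binary_entropy(p):
    """Bernoulli entropy in bits, treating 0 * log2(0) as 0."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    total = 0.0
    for prob in (p, 1.0 - p):
        if prob > 0.0:  # skip the term that would need log2(0)
            total -= prob * math.log2(prob)
    return total


print(binary_entropy(0.0))   # a source that never sends a 1 carries no information
print(binary_entropy(0.5))   # a fair binary source carries 1 bit per symbol
print(binary_entropy(0.25))
```

Summing term by term keeps the special case in one place, so the same pattern extends to alphabets with more than two symbols.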

Calculating receiver entropy $H(R)$

If you think about it carefully, the receiver side is also a Bernoulli entropy. We know that $P(r=1)=p+\epsilon -2\epsilon p$ and $P(r=0)=1-P(r=1)=1-p-\epsilon +2\epsilon p$ . If we let $q=p+\epsilon -2\epsilon p$ then it's as simple as re-writing the Bernoulli entropy as:

$H(R)=H_{b}(q)=-q\log _{2}(q)-(1-q)\log _{2}(1-q)$ (2)

Again, it's simpler to use this equation because we can directly program this. It's interesting to see what happens when we vary $0\leq \epsilon \leq 1$ .
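As a rough sketch of that experiment, the code below computes $H(R)$ for a few values of $\epsilon$ . The source probability `p = 0.1` and the list of $\epsilon$ values are arbitrary choices for illustration, and the helper names are ours.

```python
import math


def binary_entropy(q):
    """Bernoulli entropy in bits, treating 0 * log2(0) as 0."""
    total = 0.0
    for prob in (q, 1.0 - q):
        if prob > 0.0:
            total -= prob * math.log2(prob)
    return total


def receiver_entropy(p, eps):
    """H(R) for a BSC with P(s=1) = p and flip probability eps."""
    q = p + eps - 2 * eps * p  # P(r=1)
    return binary_entropy(q)


p = 0.1  # hypothetical source probability P(s=1)
for eps in (0.0, 0.1, 0.25, 0.5, 0.75, 1.0):
    print(f"eps={eps:.2f}  H(R)={receiver_entropy(p, eps):.4f}")
```

Three cases are worth checking by hand. With $\epsilon =0$ we get $q=p$ , so $H(R)=H(S)$ . With $\epsilon =0.5$ we get $q=0.5$ for any $p$ , so $H(R)$ reaches its maximum of 1 bit. With $\epsilon =1$ every bit is flipped, so $q=1-p$ and $H(R)$ is again equal to $H(S)$ .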
