Arbitrarily varying channel

An arbitrarily varying channel (AVC) is a communication channel model used in coding theory, and was first introduced by Blackwell, Breiman, and Thomasian. This particular channel has unknown parameters that can change over time and these changes may not have a uniform pattern during the transmission of a codeword.

$n$ uses of this channel can be described using a stochastic matrix $W^n : X^n \times S^n \rightarrow Y^n$, where $X$ is the input alphabet, $Y$ is the output alphabet, and $W^n(y|x,s)$ is the probability, over a given set of states $S$, that the transmitted input $x = (x_1, \ldots, x_n)$ leads to the received output $y = (y_1, \ldots, y_n)$. The state $s_i$ in the set $S$ can vary arbitrarily at each time unit $i$. This channel was developed as an alternative to Shannon's binary symmetric channel (BSC), in which the entire nature of the channel is known, in order to model actual network channel situations more realistically.
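
To make the model concrete, the following minimal Python sketch simulates $n$ uses of a toy binary AVC. The specific transition matrices are illustrative assumptions (two binary symmetric channels whose crossover probability depends on the state), not part of the formal definition.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy AVC: binary input/output alphabets and two states.
# W[s][x][y] = W(y|x,s). In state 0 the channel behaves like a BSC
# with crossover 0.1; in state 1 the crossover jumps to 0.4.
W = np.array([
    [[0.9, 0.1], [0.1, 0.9]],   # state 0
    [[0.6, 0.4], [0.4, 0.6]],   # state 1
])

def transmit(x, s, rng):
    """Send codeword x through the AVC under state sequence s.

    The state s_i may change arbitrarily at each time unit i,
    which is exactly what makes the channel "arbitrarily varying".
    """
    return np.array([rng.choice(2, p=W[si][xi]) for xi, si in zip(x, s)])

n = 10
x = rng.integers(0, 2, size=n)   # transmitted input x = (x_1, ..., x_n)
s = rng.integers(0, 2, size=n)   # arbitrary state sequence s = (s_1, ..., s_n)
y = transmit(x, s, rng)
print("x =", x)
print("s =", s)
print("y =", y)
```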

Capacities and associated proofs

Capacity of deterministic AVCs

An AVC's capacity can vary depending on certain parameters.

$R$ is an achievable rate for a deterministic AVC code if it is larger than $0$, and if for every positive $\varepsilon$ and $\delta$, and for all sufficiently large $n$, length-$n$ block codes exist that satisfy the following equations: $\frac{1}{n}\log N > R - \delta$ and $\displaystyle\max_{s \in S^n} \bar{e}(s) \leq \varepsilon$, where $N$ is the number of codewords in the code and $\bar{e}(s)$ is the average probability of error for a state sequence $s$. The largest achievable rate $R$ represents the capacity of the AVC, denoted by $c$.
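
The definition of $\bar{e}(s)$ and the maximization over state sequences can be illustrated numerically. The sketch below estimates $\max_{s \in S^n} \bar{e}(s)$ by Monte Carlo for a two-codeword repetition code over the toy channel above; the code, decoder, and block length are all illustrative assumptions.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(1)

# Toy AVC from the earlier sketch: W[s][x][y] = W(y|x,s).
W = np.array([
    [[0.9, 0.1], [0.1, 0.9]],   # state 0
    [[0.6, 0.4], [0.4, 0.6]],   # state 1
])

n = 5
codebook = [np.zeros(n, dtype=int), np.ones(n, dtype=int)]   # N = 2 codewords

def decode(y):
    """Majority-vote decoder for the repetition code."""
    return int(y.sum() > n / 2)

def avg_error(s, trials=2000):
    """Monte Carlo estimate of e-bar(s): error probability averaged
    over equiprobable messages, for a fixed state sequence s."""
    errors = 0
    for _ in range(trials):
        m = rng.integers(0, 2)                       # uniform message
        x = codebook[m]
        y = np.array([rng.choice(2, p=W[si][xi]) for xi, si in zip(x, s)])
        errors += decode(y) != m
    return errors / trials

# Exhaustive maximum over all |S|^n = 2^5 state sequences.
worst = max(avg_error(np.array(s)) for s in product([0, 1], repeat=n))
print("max_s e-bar(s) ≈", worst)
```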

The only useful situations arise when the capacity of the AVC is greater than $0$, because then the channel can transmit a guaranteed amount of data $\leq c$ without errors. We therefore start with a theorem that shows when $c$ is positive in an AVC; the theorems discussed afterward narrow down the range of $c$ for different circumstances.

Before stating Theorem 1, a few definitions need to be addressed:

- An AVC is symmetric if $\displaystyle\sum_{s} W(y|x,s)\,U(s|x') = \sum_{s} W(y|x',s)\,U(s|x)$ for every $(x, x', y)$, where $x, x' \in X$, $y \in Y$, and $U(s|x)$ is a channel function $U : X \rightarrow S$.
- $X_r$, $S_r$, and $Y_r$ are random variables taking values in the sets $X$, $S$, and $Y$ respectively.
- $P_{X_r}(x)$ is the probability that the random variable $X_r$ equals $x$.
- $P_{S_r}(s)$ is the probability that the random variable $S_r$ equals $s$.
- $P_{X_r S_r Y_r}$ is the joint probability mass function (pmf) determined by $P_{X_r}(x)$, $P_{S_r}(s)$, and $W(y|x,s)$. It is defined formally as $P_{X_r S_r Y_r}(x,s,y) = P_{X_r}(x)\,P_{S_r}(s)\,W(y|x,s)$.
- $H(X_r)$ is the entropy of $X_r$.
- $H(X_r|Y_r)$ is the conditional entropy of $X_r$ given $Y_r$: the average uncertainty that remains about $X_r$ once the value of $Y_r$ is known.
- $I(X_r \land Y_r)$ is the mutual information of $X_r$ and $Y_r$, and is equal to $H(X_r) - H(X_r|Y_r)$.
- $\displaystyle I(P) = \min_{Y_r} I(X_r \land Y_r)$, where the minimum is over all random variables $Y_r$ such that $X_r$, $S_r$, and $Y_r$ are distributed in the form of $P_{X_r S_r Y_r}$.
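
Since the joint pmf $P_{X_r S_r Y_r}$ is fixed once $P_{X_r} = P$ and $P_{S_r}$ are chosen, minimizing over $Y_r$ amounts to minimizing the mutual information of the state-averaged channel over all state pmfs. The following sketch evaluates $I(P)$ this way for the toy two-state channel used earlier; the grid search over state pmfs is an illustrative assumption, not the method of the cited papers.

```python
import numpy as np

# Toy AVC: W[s][x][y] = W(y|x,s).
W = np.array([
    [[0.9, 0.1], [0.1, 0.9]],   # state 0
    [[0.6, 0.4], [0.4, 0.6]],   # state 1
])

def mutual_information(P_x, V):
    """I(X ∧ Y) in bits for input pmf P_x and channel matrix V[x][y]."""
    P_xy = P_x[:, None] * V                 # joint pmf of (X, Y)
    P_y = P_xy.sum(axis=0)
    mask = P_xy > 0                         # skip zero-probability terms
    indep = P_x[:, None] * P_y[None, :]
    return (P_xy[mask] * np.log2(P_xy[mask] / indep[mask])).sum()

def I_of_P(P_x, W, grid=1001):
    """I(P) = min over state pmfs q of I(X_r ∧ Y_r), where Y_r comes
    from the state-averaged channel sum_s q(s) W(y|x,s)."""
    best = np.inf
    for q0 in np.linspace(0, 1, grid):
        q = np.array([q0, 1 - q0])
        V = np.tensordot(q, W, axes=1)      # averaged channel, |X| x |Y|
        best = min(best, mutual_information(P_x, V))
    return best

P = np.array([0.5, 0.5])
print("I(P) ≈", I_of_P(P, W))
```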

Theorem 1: $c > 0$ if and only if the AVC is not symmetric. If $c > 0$, then $\displaystyle c = \max_P I(P)$.

Proof of first part (symmetry): If we can prove that $I(P)$ is positive whenever the AVC is not symmetric, and then prove that $c = \max_P I(P)$, we will be able to prove Theorem 1. Assume $I(P)$ were equal to $0$. From the definition of $I(P)$, this would make $X_r$ and $Y_r$ independent random variables for some $S_r$, because neither random variable's entropy would then rely on the other's value. Using the equation for $P_{X_r S_r Y_r}$ (and remembering that $P_{X_r} = P$), we get

$\displaystyle P_{Y_r}(y) = \sum_{x \in X}\sum_{s \in S} P(x)\,P_{S_r}(s)\,W(y|x,s)$

$\equiv$ (since $X_r$ and $Y_r$ are independent random variables, $W(y|x,s) = W'(y|s)$ for some $W'$)

$\displaystyle P_{Y_r}(y) = \sum_{x \in X}\sum_{s \in S} P(x)\,P_{S_r}(s)\,W'(y|s)$

$\equiv$ (because only $P(x)$ depends on $x$ now)

$\displaystyle P_{Y_r}(y) = \sum_{s \in S} P_{S_r}(s)\,W'(y|s)\left[\sum_{x \in X} P(x)\right]$

$\equiv$ (because $\displaystyle\sum_{x \in X} P(x) = 1$)

$\displaystyle P_{Y_r}(y) = \sum_{s \in S} P_{S_r}(s)\,W'(y|s)$

So now we have a probability distribution on $Y_r$ that is independent of $X_r$. The definition of a symmetric AVC can now be rewritten as

$\displaystyle \sum_{s} W'(y|s)\,P_{S_r}(s) = \sum_{s} W'(y|s)\,P_{S_r}(s);$

since $U(s|x)$ and $W(y|x,s)$ no longer depend on $x$, they have been replaced with functions of $s$ and $y$ only. Both sides are now equal to the $P_{Y_r}(y)$ calculated earlier, so the AVC is indeed symmetric when $I(P)$ is equal to $0$. Therefore, $I(P)$ can only be positive if the AVC is not symmetric.
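
Because Theorem 1 turns on whether some channel $U$ makes the AVC symmetric, the definition can be checked mechanically. The sketch below tests the symmetry condition for a given pair $(W, U)$; the example channel, whose output ignores the input entirely, is an illustrative assumption chosen so that the condition clearly holds.

```python
import numpy as np

def is_symmetrizing(W, U, tol=1e-9):
    """Test the symmetry condition for an AVC.

    W[s][x][y] = W(y|x,s) and U[x][s] = U(s|x).  The condition is
        sum_s W(y|x,s) U(s|x') == sum_s W(y|x',s) U(s|x)
    for every (x, x', y).
    """
    # T[x, xp, y] = sum_s W(y|x,s) U(s|xp)
    T = np.einsum('sxy,ps->xpy', W, U)
    return np.allclose(T, T.transpose(1, 0, 2), atol=tol)

# A channel whose output depends only on the state (not on x) is
# symmetric under any U that also ignores its input.
W_sym = np.array([
    [[0.8, 0.2], [0.8, 0.2]],   # state 0: output distribution ignores x
    [[0.3, 0.7], [0.3, 0.7]],   # state 1
])
U = np.full((2, 2), 0.5)        # U(s|x) = 1/2 for every x
print(is_symmetrizing(W_sym, U))   # True

# The toy channel from the earlier sketches fails the condition:
W_toy = np.array([
    [[0.9, 0.1], [0.1, 0.9]],
    [[0.6, 0.4], [0.4, 0.6]],
])
print(is_symmetrizing(W_toy, U))   # False
```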

Proof of second part (capacity): See the paper "The capacity of the arbitrarily varying channel revisited: positivity, constraints," referenced below, for the full proof.

Capacity of AVCs with input and state constraints

The next theorem will deal with the capacity for AVCs with input and/or state constraints. These constraints help to decrease the very large range of possibilities for transmission and error on an AVC, making it a bit easier to see how the AVC behaves.

Before moving on to Theorem 2, we need a few more definitions and a lemma.

For such AVCs, the following exist:

- An input constraint $\Gamma$ based on the equation $\displaystyle g(x) = \frac{1}{n}\sum_{i=1}^{n} g(x_i)$, where $x \in X^n$ and $x = (x_1, \ldots, x_n)$.

- A state constraint $\Lambda$ based on the equation $\displaystyle l(s) = \frac{1}{n}\sum_{i=1}^{n} l(s_i)$, where $s \in S^n$ and $s = (s_1, \ldots, s_n)$.

- $\displaystyle \Lambda_0(P) = \min \sum_{x,s} P(x)\,U(s|x)\,l(s)$, where the minimum is over all channels $U(s|x)$ satisfying the symmetry condition given before Theorem 1.

- $I(P, \Lambda)$, which is very similar to the $I(P)$ equation mentioned previously, $\displaystyle I(P,\Lambda) = \min_{Y_r} I(X_r \land Y_r)$, except that now any state $s$ or $S_r$ in the equation must obey the state restriction $l(s) \leq \Lambda$.

Assume $g(x)$ is a given non-negative-valued function on $X$, that $l(s)$ is a given non-negative-valued function on $S$, and that the minimum value of each is $0$. The literature on this subject never formally describes the exact definitions of either $g(x)$ or $l(s)$ for a single variable $x_i$; the usefulness of the input constraint $\Gamma$ and the state constraint $\Lambda$ rests on these equations.
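
Since the per-letter costs are left unspecified, the sketch below adopts one common assumption for binary alphabets: $g(x_i) = x_i$ and $l(s_i) = s_i$, so that $g(x)$ and $l(s)$ are simply the fraction of ones in each sequence. The threshold values $\Gamma$ and $\Lambda$ are likewise made up for illustration.

```python
import numpy as np

def seq_cost(v):
    """Per-sequence cost (1/n) * sum_i cost(v_i), with cost(v_i) = v_i."""
    return np.asarray(v).mean()

x = [0, 1, 1, 0, 1]          # candidate codeword
s = [1, 0, 0, 0, 1]          # candidate state sequence
Gamma, Lam = 0.6, 0.5        # assumed input and state constraints

print("g(x) =", seq_cost(x), "-> allowed:", seq_cost(x) <= Gamma)
print("l(s) =", seq_cost(s), "-> allowed:", seq_cost(s) <= Lam)
```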

$R$ is now limited to codewords of the form $x_1, \ldots, x_N$ that satisfy $g(x_i) \leq \Gamma$, and the state $s$ is limited to all states that satisfy $l(s) \leq \Lambda$. The largest achievable rate is still considered the capacity of the AVC, now denoted by $c(\Gamma, \Lambda)$.

Lemma 1: Any code where $\Lambda$ is greater than $\Lambda_0(P)$ cannot be considered a "good" code, because such codes have a maximum average probability of error greater than or equal to $\frac{N-1}{2N} - \frac{l_{\max}^2}{n(\Lambda - \Lambda_0(P))^2}$, where $l_{\max}$ is the maximum value of $l(s)$. This is not a good maximum average error probability because it is fairly large: $\frac{N-1}{2N}$ is close to $\frac{1}{2}$, and the subtracted term becomes very small as $n$ grows, since the $(\Lambda - \Lambda_0(P))$ value is squared and $\Lambda$ is set to be larger than $\Lambda_0(P)$. Therefore, it would be very unlikely to receive a codeword without error. This is why the $\Lambda_0(P)$ condition is present in Theorem 2.
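
A quick numeric check shows how weak such codes are: with made-up values $l_{\max} = 1$ and a gap $\Lambda - \Lambda_0(P) = 0.1$, the lower bound on the error probability approaches $\frac{1}{2}$ as the block length grows.

```python
# Lemma 1 lower bound with assumed values l_max = 1, gap = 0.1.
l_max, gap = 1.0, 0.1
N = 2 ** 16                 # any large code size: (N-1)/2N is close to 1/2
for n in (10**3, 10**4, 10**5):
    bound = (N - 1) / (2 * N) - l_max**2 / (n * gap**2)
    print(f"n = {n:6d}:  error >= {bound:.4f}")
```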

Theorem 2: Given a positive $\Lambda$ and arbitrarily small $\alpha > 0$, $\beta > 0$, $\delta > 0$, for any block length $n \geq n_0$ and for any type $P$ with the conditions $\Lambda_0(P) \geq \Lambda + \alpha$ and $\displaystyle\min_{x} P(x) \geq \beta$, and where $P_{X_r} = P$, there exists a code with codewords $x_1, \ldots, x_N$, each of type $P$, that satisfies the following equations: $\frac{1}{n}\log N > I(P,\Lambda) - \delta$ and $\displaystyle\max_{l(s) \leq \Lambda} \bar{e}(s) \leq \exp(-n\gamma)$, where the positive constants $n_0$ and $\gamma$ depend only on $\alpha$, $\beta$, $\delta$, and the given AVC.

Proof of Theorem 2: See the paper "The capacity of the arbitrarily varying channel revisited: positivity, constraints," referenced below, for the full proof.

Capacity of randomized AVCs

The next theorem concerns AVCs with randomized codes. For such AVCs the code is a random variable taking values in a family of length-$n$ block codes, and these codes are not allowed to depend on the actual value of the codeword. Because of their random nature, these codes have the same maximum and average error probability for any channel. Randomized codes also help to make certain properties of the AVC clearer.

Before we go on to Theorem 3, we need to define a couple of important terms:

- $\displaystyle W_\zeta(y|x) = \sum_{s} W(y|x,s)\,P_{S_r}(s)$
- $I(P, \zeta)$, which is very similar to the $I(P)$ equation mentioned previously, $\displaystyle I(P,\zeta) = \min_{Y_r} I(X_r \land Y_r)$, except that the pmf $P_{S_r}(s)$ is now added to the equation, making the minimum of $I(P,\zeta)$ based on a new form of $P_{X_r S_r Y_r}$ in which $W_\zeta(y|x)$ replaces $W(y|x,s)$.

Theorem 3: The capacity for randomized codes of the AVC is $\displaystyle c = \max_P I(P,\zeta)$.

Proof of Theorem 3: See the paper "The Capacities of Certain Channel Classes Under Random Coding," referenced below, for the full proof.
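
For the toy two-state channel used throughout, Theorem 3's quantity can be approximated by brute force: average the channel over each candidate state pmf $\zeta$ to get $W_\zeta$, take the worst-case mutual information, then maximize over input distributions $P$. The grid search is an illustrative shortcut, not the method of the cited paper.

```python
import numpy as np

# Toy AVC: W[s][x][y] = W(y|x,s).
W = np.array([
    [[0.9, 0.1], [0.1, 0.9]],   # state 0
    [[0.6, 0.4], [0.4, 0.6]],   # state 1
])

def mutual_information(P_x, V):
    """I(X ∧ Y) in bits for input pmf P_x and channel matrix V[x][y]."""
    P_xy = P_x[:, None] * V
    P_y = P_xy.sum(axis=0)
    mask = P_xy > 0
    indep = P_x[:, None] * P_y[None, :]
    return (P_xy[mask] * np.log2(P_xy[mask] / indep[mask])).sum()

def randomized_capacity(W, grid=201):
    """Approximate c = max_P I(P, zeta), where I(P, zeta) is the minimum
    over state pmfs of the mutual information of W_zeta(y|x)."""
    best = 0.0
    for p in np.linspace(0, 1, grid):
        P_x = np.array([p, 1 - p])
        worst = min(
            mutual_information(P_x,
                               np.tensordot(np.array([q, 1 - q]), W, axes=1))
            for q in np.linspace(0, 1, grid))
        best = max(best, worst)
    return best

print("randomized-code capacity ≈", randomized_capacity(W))
```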

References

- Blackwell, D.; Breiman, L.; Thomasian, A. J. (1960). "The Capacities of Certain Channel Classes Under Random Coding". The Annals of Mathematical Statistics. 31 (3): 558–567.
- Csiszár, I.; Narayan, P. (1988). "The capacity of the arbitrarily varying channel revisited: positivity, constraints". IEEE Transactions on Information Theory. 34 (2): 181–193.