Information algebra explained

The term "information algebra" refers to mathematical techniques of information processing. Classical information theory goes back to Claude Shannon. It is a theory of information transmission, concerned with communication and storage. It does not, however, account for the fact that information comes from different sources and therefore usually has to be combined. Nor does it address the need to extract from a piece of information those parts that are relevant to a specific question.

A mathematical phrasing of these operations leads to an algebra of information, describing basic modes of information processing. Such an algebra covers several formalisms of computer science that seem different on the surface: relational databases, multiple systems of formal logic, and numerical problems of linear algebra. It allows the development of generic procedures of information processing and thus a unification of basic methods of computer science, in particular of distributed information processing.

Information relates to precise questions, comes from different sources, must be aggregated, and can be focused on questions of interest. Starting from these considerations, information algebras are two-sorted algebras

(\Phi,D)

where

\Phi

is a semigroup, representing combination or aggregation of information, and

D

is a lattice of domains (related to questions) whose partial order reflects the granularity of the domain or the question, together with a mixed operation representing focusing or extraction of information.

Information and its operations

More precisely, in the two-sorted algebra

(\Phi,D)

, the following operations are defined:

Combination :

\otimes :\Phi \times \Phi \to \Phi,~(\phi,\psi)\mapsto\phi \otimes \psi

Focusing :

\Rightarrow :\Phi \times D \to \Phi,~(\phi,x)\mapsto\phi^{\Rightarrow x}

Additionally, in

D

the usual lattice operations (meet and join) are defined.

Axioms and definition

The axioms of the two-sorted algebra

(\Phi,D)

, in addition to the axioms of the lattice

D

, are as follows:

Semigroup :

\Phi

is a commutative semigroup under combination with a neutral element (representing vacuous information).

Distributivity of Focusing over Combination :

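(\phi^{\Rightarrow x} \otimes \psi)^{\Rightarrow x}=\phi^{\Rightarrow x} \otimes \psi^{\Rightarrow x}

To focus a piece of information that is already focused on

x

, combined with another piece of information, to the domain

x

, one may as well first focus the second piece of information on

x

and then combine.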

Transitivity of Focusing :

(\phi^{\Rightarrow x})^{\Rightarrow y}=\phi^{\Rightarrow x\wedge y}

To focus a piece of information on

x

and then on

y

, one may as well focus it on

x\wedge y

.

Idempotency :

\phi \otimes \phi^{\Rightarrow x}=\phi

A piece of information combined with a part of itself gives nothing new.

Support :

\forall\phi\in\Phi,~\exists x\in D

such that

\phi=\phi^{\Rightarrow x}

Each piece of information refers to at least one domain (question).

A two-sorted algebra

(\Phi,D)

satisfying these axioms is called an information algebra.
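The axioms can be checked mechanically on a small model. The following is a minimal sketch, not taken from the source: it assumes a toy model in which a piece of information is a set of admissible 0/1 assignments to two variables, combination is set intersection, and focusing on a domain x forgets everything said about variables outside x. The names `VARS`, `WORLDS`, `combine`, `focus` and `check_axioms` are illustrative only.

```python
from itertools import combinations, product

# Assumed toy model (not from the source): two binary variables.
VARS = ("p", "q")
WORLDS = frozenset(product((0, 1), repeat=len(VARS)))  # all assignments


def powerset(items):
    """All subsets of `items`, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


PHI = powerset(WORLDS)  # pieces of information: sets of admissible worlds
D = powerset(VARS)      # domains: sets of variables; the meet is intersection
VACUOUS = WORLDS        # neutral element: excludes nothing


def combine(phi, psi):
    """Combination: a world is admissible only if both pieces admit it."""
    return phi & psi


def focus(phi, x):
    """Focusing: forget everything phi says about variables outside x."""
    idx = [i for i, v in enumerate(VARS) if v in x]
    seen = {tuple(w[i] for i in idx) for w in phi}
    return frozenset(w for w in WORLDS if tuple(w[i] for i in idx) in seen)


def check_axioms():
    """Exhaustively test the axioms on the toy model."""
    for phi in PHI:
        assert combine(phi, VACUOUS) == phi              # neutral element
        assert any(focus(phi, x) == phi for x in D)      # support
        for x in D:
            assert combine(phi, focus(phi, x)) == phi    # idempotency
            for y in D:
                # transitivity of focusing
                assert focus(focus(phi, x), y) == focus(phi, x & y)
            for psi in PHI:
                # distributivity of focusing over combination
                lhs = focus(combine(focus(phi, x), psi), x)
                assert lhs == combine(focus(phi, x), focus(psi, x))
        for psi in PHI:
            assert combine(phi, psi) == combine(psi, phi)  # commutativity
            for chi in PHI:
                # associativity
                assert (combine(combine(phi, psi), chi)
                        == combine(phi, combine(psi, chi)))
    return True


print(check_axioms())
```

The model is kept to two variables so that the exhaustive check stays tiny (16 pieces of information, 4 domains). Adding variables to `VARS` works the same way but grows the search space quickly.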

Order of information

A partial order of information can be introduced by defining

\phi\leq\psi

if

\phi \otimes \psi=\psi

. This means that

\phi

is less informative than

\psi

if it adds no new information to

\psi

. The semigroup

\Phi

is a semilattice relative to this order, i.e.

\phi \otimes \psi=\phi\vee\psi

. Relative to any domain (question)

x\in D

a partial order can be introduced by defining

\phi\leq_x\psi

if

\phi^{\Rightarrow x}\leq\psi^{\Rightarrow x}

. It represents the order of information content of

\phi

and

\psi

relative to the domain (question)

x

.
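That the relation defined by combination is indeed a partial order can be derived from the axioms. The following short derivation is a sketch added for illustration; the symbol \chi denotes a third, arbitrary element of \Phi and does not appear in the source.

```latex
\begin{align*}
\text{Reflexivity:}\quad
  & \phi \otimes \phi = \phi \otimes \phi^{\Rightarrow x} = \phi
  && \text{(support gives } \phi = \phi^{\Rightarrow x} \text{ for some } x \text{, then idempotency)} \\
\text{Antisymmetry:}\quad
  & \phi \leq \psi,\ \psi \leq \phi \;\Longrightarrow\;
    \psi = \phi \otimes \psi = \psi \otimes \phi = \phi
  && \text{(commutativity)} \\
\text{Transitivity:}\quad
  & \phi \leq \psi,\ \psi \leq \chi \;\Longrightarrow\;
    \phi \otimes \chi = \phi \otimes (\psi \otimes \chi)
    = (\phi \otimes \psi) \otimes \chi = \psi \otimes \chi = \chi
  && \text{(associativity)}
\end{align*}
```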

Labeled information algebra

The pairs

(\phi,x)

, where

\phi\in\Phi

and

x\in D

such that

\phi^{\Rightarrow x}=\phi

, form a labeled information algebra. More precisely, in the two-sorted algebra

(\Phi,D)

, the following operations are defined:

Labeling :

d(\phi,x)=x

Combination :

(\phi,x) \otimes (\psi,y)=(\phi \otimes \psi,x\vee y)

Projection :

(\phi,x)^{\downarrow y}=(\phi^{\Rightarrow y},y)\quad\hbox{for }y\leq x

Models of information algebras

Here follows an incomplete list of instances of information algebras:

Relational algebra: The reduct of a relational algebra with natural join as combination and the usual projection is a labeled information algebra (see the worked-out example on relational algebra).
Constraint systems: Constraints form an information algebra.
Semiring valued algebras: C-semirings induce information algebras.
Logic: Many logic systems induce information algebras. Reducts of cylindric algebras or polyadic algebras are information algebras related to predicate logic.
Module algebras.
Linear systems: Systems of linear equations or linear inequalities induce information algebras.

Worked-out example: relational algebra

Let

\mathcal{A}

be a set of symbols, called attributes (or column names). For each

\alpha\in\mathcal{A}

let

U_\alpha

be a non-empty set, the set of all possible values of the attribute

\alpha

. For example, if

\mathcal{A}=\{\texttt{name},\texttt{age},\texttt{income}\}

, then

U_{\texttt{name}}

could be the set of strings, whereas

U_{\texttt{age}}

and

U_{\texttt{income}}

are both the set of non-negative integers.

Let

x\subseteq\mathcal{A}

. An

x

-tuple is a function

f

so that

\hbox{dom}(f)=x

and

f(\alpha)\in U_\alpha

for each

\alpha\in x

. The set of all

x

-tuples is denoted by

E_x

. For an

x

-tuple

f

and a subset

y\subseteq x

the restriction

f[y]

is defined to be the

y

-tuple

g

so that

g(\alpha)=f(\alpha)

for all

\alpha\in y

.

A relation

R

over

x

is a set of

x

-tuples, i.e. a subset of

E_x

. The set of attributes

x

is called the domain of

R

and denoted by

d(R)

. For

y\subseteq d(R)

the projection of

R

onto

y

is defined as follows:

\pi_y(R):=\{f[y]\mid f\in R\}.

The join of a relation

R

over

x

and a relation

S

over

y

is defined as follows:

R\bowtie S:=\{f\mid f\ (x\cup y)\hbox{-tuple},\ f[x]\in R,\ f[y]\in S\}.

As an example, let

R

and

S

be the following relations:

R= \begin{matrix} \texttt{name}&\texttt{age}\\ \texttt{A}&\texttt{34}\\ \texttt{B}&\texttt{47}\\ \end{matrix} \qquad S= \begin{matrix} \texttt{name}&\texttt{income}\\ \texttt{A}&\texttt{20'000}\\ \texttt{B}&\texttt{32'000}\\ \end{matrix}

Then the join of

R

and

S

is:

R\bowtie S= \begin{matrix} \texttt{name}&\texttt{age}&\texttt{income}\\ \texttt{A}&\texttt{34}&\texttt{20'000}\\ \texttt{B}&\texttt{47}&\texttt{32'000}\\ \end{matrix}
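The definitions of tuple, projection and join translate almost literally into code. The sketch below is one possible encoding, not the source's: a relation is assumed to be a pair (domain, set of tuples), each tuple a frozenset of (attribute, value) pairs, and the helper names `relation`, `project` and `join` are chosen for illustration. It rebuilds the example relations and computes their join.

```python
def relation(attrs, rows):
    """Build a relation as (domain, set of tuples).

    Each tuple is a frozenset of (attribute, value) pairs, i.e. a function
    from the domain to values. The domain is stored explicitly so that an
    empty relation still knows its attributes.
    """
    attrs = tuple(attrs)
    return frozenset(attrs), frozenset(frozenset(zip(attrs, r)) for r in rows)


def project(rel, y):
    """Projection: restrict every tuple of the relation to the attributes in y."""
    dom, rows = rel
    y = frozenset(y)
    if not y <= dom:
        raise ValueError("y must be a subset of the domain of the relation")
    return y, frozenset(frozenset(p for p in f if p[0] in y) for f in rows)


def join(r, s):
    """Natural join: merge all pairs of tuples that agree on shared attributes."""
    (x, rows_r), (y, rows_s) = r, s
    shared = x & y
    rows = set()
    for f in rows_r:
        for g in rows_s:
            df, dg = dict(f), dict(g)
            if all(df[a] == dg[a] for a in shared):
                # f | g is an (x ∪ y)-tuple whose restrictions lie in R and S
                rows.add(f | g)
    return x | y, frozenset(rows)


# The example relations from the text (incomes written as plain integers).
R = relation(("name", "age"), [("A", 34), ("B", 47)])
S = relation(("name", "income"), [("A", 20000), ("B", 32000)])
RS = join(R, S)

for row in sorted(sorted(f) for f in RS[1]):
    print(dict(row))
```

With this data, projecting the join back onto the attributes of R or of S returns the original relation, because every tuple finds a join partner. In general the projection of a join can be smaller than the original relation.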

A relational database with natural join

\bowtie

as combination and the usual projection

\pi

is an information algebra. The operations are well defined since

d(R\bowtie S)=d(R)\cup d(S)

and, if

x\subseteq d(R)

, then

d(\pi_x(R))=x

. It is easy to see that relational databases satisfy the axioms of a labeled information algebra:

semigroup :

(R_1\bowtie R_2)\bowtie R_3=R_1\bowtie(R_2\bowtie R_3)

and

R\bowtie S=S\bowtie R

transitivity : If

x\subseteq y\subseteq d(R)

, then

\pi_x(\pi_y(R))=\pi_x(R)

combination : If

d(R)=x

and

d(S)=y

, then

\pi_x(R\bowtie S)=R\bowtie\pi_{x\cap y}(S)

idempotency : If

x\subseteq d(R)

, then

R\bowtie\pi_x(R)=R

support : If

x=d(R)

, then

\pi_x(R)=R
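The axioms listed above can also be spot-checked numerically. The following sketch tests them on randomly generated small relations. The attribute names `a`, `b`, `c`, the binary value set and all function names are assumptions made for illustration, and a relation is encoded as a pair (domain, set of tuples) with tuples as frozensets of (attribute, value) pairs. Passing the checks is evidence on sampled cases, not a proof.

```python
import random
from itertools import product


def project(rel, y):
    """Projection of a relation (domain, rows) onto the attribute set y."""
    dom, rows = rel
    y = frozenset(y)
    assert y <= dom, "can only project onto a subset of the domain"
    return y, frozenset(frozenset(p for p in f if p[0] in y) for f in rows)


def join(r, s):
    """Natural join of two relations."""
    (x, rows_r), (y, rows_s) = r, s
    shared = x & y
    rows = frozenset(
        f | g
        for f in rows_r
        for g in rows_s
        if all(dict(f)[a] == dict(g)[a] for a in shared)
    )
    return x | y, rows


def random_relation(rng, attrs, values=(0, 1)):
    """A random relation over a random subset of the given attributes."""
    dom = sorted(a for a in attrs if rng.random() < 0.7)
    all_rows = [frozenset(zip(dom, vs))
                for vs in product(values, repeat=len(dom))]
    return frozenset(dom), frozenset(f for f in all_rows if rng.random() < 0.5)


def random_subset(rng, s):
    return frozenset(a for a in s if rng.random() < 0.5)


def check_relational_axioms(trials=300, seed=0):
    rng = random.Random(seed)
    attrs = ("a", "b", "c")  # hypothetical attribute names
    for _ in range(trials):
        R, S, T = (random_relation(rng, attrs) for _ in range(3))
        # semigroup: associativity and commutativity of the join
        assert join(join(R, S), T) == join(R, join(S, T))
        assert join(R, S) == join(S, R)
        # transitivity: x within y within d(R)
        y = random_subset(rng, R[0])
        x = random_subset(rng, y)
        assert project(project(R, y), x) == project(R, x)
        # combination: d(R) = x, d(S) = y
        x, y = R[0], S[0]
        assert project(join(R, S), x) == join(R, project(S, x & y))
        # idempotency
        x = random_subset(rng, R[0])
        assert join(R, project(R, x)) == R
        # support
        assert project(R, R[0]) == R
    return True


print(check_relational_axioms())
```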

Connections

Valuation algebras : Dropping the idempotency axiom leads to valuation algebras. These axioms were introduced to generalize local computation schemes from Bayesian networks to more general formalisms, including belief functions and possibility potentials.

Historical roots

The axioms for information algebras are derived from the axiom system proposed in (Shenoy and Shafer, 1990), see also (Shafer, 1991).

References

- Shenoy, P. P.; Shafer, G. (1990). "Axioms for probability and belief-function propagation". In Shachter, Ross D.; Levitt, Tod S.; Kanal, Laveen N.; Lemmer, John F. (eds.). Uncertainty in Artificial Intelligence 4. Machine Intelligence and Pattern Recognition. Vol. 9. Amsterdam: Elsevier. pp. 169–198. ISBN 978-0-444-88650-7. doi:10.1016/B978-0-444-88650-7.50019-6. hdl:1808/144.