Evolution in a nutshell

an alternative outline on evolution

and some consequences concerning valuations

by

Gregor Kjellström

 


4.1 The mean value of information

Definition 4.2. The mean value of information is defined as

     H = - Σ_j p_j log(p_j),    provided that Σ_j p_j = 1,

where j = 1, 2, ..., r. If all p_j are equal, then H = log(r).
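As a small illustration of Definition 4.2, the sum can be evaluated directly. This is only a sketch: the function name `mean_information`, the choice of the natural logarithm, and the example probabilities are assumptions made here, not part of the original text.

```python
import math

def mean_information(p):
    """H = -sum_j p_j log(p_j) with the natural log; zero terms contribute 0."""
    if abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(pj * math.log(pj) for pj in p if pj > 0)

r = 8
h_uniform = mean_information([1.0 / r] * r)             # all p_j equal
h_skewed = mean_information([0.5, 0.25, 0.125, 0.125])  # hypothetical unequal case

print(h_uniform, math.log(r))   # the two values agree up to rounding
print(h_skewed, math.log(4))    # the unequal case stays below log(4)
```

With equal probabilities the result reproduces log(r). The unequal four-event example gives a smaller value than log(4).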

To pass to the continuous case, we may replace p_j by f(x) dx, where f(x) is a continuous probability density function (p.d.f.) of a stochastic variable X. Then we get

     H(X) = - ∫ f(x) log[ f(x) dx ] dx.

A problem here is that the term log[ dx ] makes H ill-defined: its value depends on the infinitesimally small dx and grows without bound as dx tends to zero.
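The dependence on dx can be made visible numerically. The sketch below assumes a standard normal density as the test case and a midpoint discretization on [-10, 10]; both choices, and the helper names, are illustrative assumptions.

```python
import math

def gauss(x):
    """Standard normal p.d.f., used here only as a test density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def discretized_information(f, a, b, n):
    """-sum p_i log(p_i) with p_i = f(x_i) dx on n midpoint cells of [a, b]."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        p = f(a + (i + 0.5) * dx) * dx
        if p > 0.0:
            total -= p * math.log(p)
    return total, dx

results = []
for n in (100, 1000, 10000):
    h, dx = discretized_information(gauss, -10.0, 10.0, n)
    # h keeps growing as dx shrinks; h + log(dx) stays nearly constant
    results.append((dx, h, h + math.log(dx)))

for row in results:
    print(row)
```

The second column grows by about log(10) each time dx shrinks tenfold. The third column, with the log(dx) term removed, settles near 0.5 log(2πe), the known value of - ∫ f log f dx for the standard normal density.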

Definition 4.3. In order to avoid this problem, we prefer to define H in such a way that if f(x) is a uniform p.d.f. over some volume V in parameter space, then

     H = log(V)   (in analogy with Definition 4.2).

This is accomplished if

     H(X) = - ∫ f(x) log[ f(x) ] dx,

because for the uniform p.d.f. f(x) = 1/V on V we have

     H(X) = - ∫_V f(x) log[ f(x) ] dx = ∫_V ( log[V] / V ) dx = log(V).
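The uniform case can also be checked by numerical integration. In this sketch the "volume" is taken to be an interval [0, V] on the line, which is an assumption made for simplicity; the function names are hypothetical.

```python
import math

def information(pdf, a, b, n=1000):
    """Midpoint-rule estimate of the integral of -f(x) log f(x) over [a, b]."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        fx = pdf(a + (i + 0.5) * dx)
        if fx > 0.0:
            total -= fx * math.log(fx)
    return total * dx

def uniform_pdf(v):
    """Uniform density 1/v on the interval [0, v], zero elsewhere."""
    return lambda x: 1.0 / v if 0.0 <= x <= v else 0.0

h_values = {v: information(uniform_pdf(v), 0.0, v) for v in (0.5, 2.0, 10.0)}

for v, h in h_values.items():
    print(v, h, math.log(v))   # h matches log(v) up to rounding
```

Note that for V < 1 the value log(V) is negative, which this definition of H permits.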

Theorem 4.1. H is an additive measure, i.e. if X and Y are stochastic variables, then

1. H(X) increases with the number of equally probable events.

2. H(XY) = H(X) + H(Y)   if X and Y are independent stochastic variables.

3. H(XY) = H(X) + H(Y|X)   if Y depends on X.

The third statement makes it possible to handle cases where different parts of a system or a message depend on each other. For the definition of H(Y|X), see the proof of the third statement below. The proof requires the following definitions:

Definitions 4.4:

     f(x, y)        is the joint p.d.f. of the two parameter values x and y.

     f(y|x)         is the p.d.f. of the parameter y when x is given.

     f(x) = ∫ f(x, y) dy  is the p.d.f. of x regardless of the value of y.

We also have

     ∫∫ f(x, y) dx dy = ∫ f(y|x) dy = ∫ f(x) dx = 1   and

     f(x, y) = f(x) f(y|x).

Proof: The first statement follows immediately from Definitions 4.2 and 4.3.

For the second statement, let x and y be single, independent parameter values with p.d.f.s f(x) and g(y), and let

     H(X) = - ∫ f(x) log[ f(x) ] dx  and  H(Y) = - ∫ g(y) log[ g(y) ] dy

be their measures of mean information. Since ∫ f(x) dx = ∫ g(y) dy = 1, the sum of these measures is

     H(X) + H(Y) = - ∫ f(x) log[ f(x) ] dx - ∫ g(y) log[ g(y) ] dy

     = - ∫ g(y) dy ∫ f(x) log[ f(x) ] dx - ∫ f(x) dx ∫ g(y) log[ g(y) ] dy

     = - ∫∫ f(x) g(y) { log[ f(x) ] + log[ g(y) ] } dx dy

     = - ∫∫ f(x) g(y) log[ f(x) g(y) ] dx dy = H(XY),

where the last equality holds because f(x) g(y) is the joint p.d.f. of the independent variables. This proves the second statement.
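The additivity for independent variables can be checked numerically. The sketch below assumes two normal densities with standard deviations 1 and 2 as f and g, truncated at eight standard deviations, and a midpoint grid; these are illustrative choices, not taken from the text.

```python
import math

def normal_pdf(sigma):
    """Zero-mean normal p.d.f. with standard deviation sigma (test density)."""
    def pdf(t):
        return math.exp(-0.5 * (t / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    return pdf

def midpoints(a, b, n):
    d = (b - a) / n
    return [a + (i + 0.5) * d for i in range(n)], d

def information_1d(pdf, a, b, n=200):
    """Midpoint estimate of the integral of -f log f over [a, b]."""
    xs, d = midpoints(a, b, n)
    return -sum(pdf(x) * math.log(pdf(x)) for x in xs) * d

def information_2d(joint, ax, bx, ay, by, n=200):
    """Midpoint estimate of the double integral of -f(x,y) log f(x,y)."""
    xs, dx = midpoints(ax, bx, n)
    ys, dy = midpoints(ay, by, n)
    total = 0.0
    for x in xs:
        for y in ys:
            p = joint(x, y)
            if p > 0.0:
                total -= p * math.log(p)
    return total * dx * dy

f = normal_pdf(1.0)
g = normal_pdf(2.0)
h_x = information_1d(f, -8.0, 8.0)
h_y = information_1d(g, -16.0, 16.0)
# Independence: the joint density is the product f(x) g(y)
h_xy = information_2d(lambda x, y: f(x) * g(y), -8.0, 8.0, -16.0, 16.0)

print(h_x + h_y, h_xy)   # the two numbers agree to within the grid error
```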

     In order to prove the third statement, we define H(Y|X) as the mean over x of the information in f(y|x), and use f(y|x) = f(x, y) / f(x) together with ∫ f(x, y) dy = f(x):

     H(Y|X) = - ∫ f(x) { ∫ f(y|x) log[ f(y|x) ] dy } dx

     = - ∫∫ f(x, y) { log[ f(x, y) ] - log[ f(x) ] } dx dy

     = - ∫∫ f(x, y) log[ f(x, y) ] dx dy + ∫∫ f(x, y) log[ f(x) ] dx dy

     = - ∫∫ f(x, y) log[ f(x, y) ] dx dy + ∫ f(x) log[ f(x) ] dx

     = H(XY) - H(X),

which proves the theorem.
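The third statement can be illustrated with a discrete analogue, where sums replace the integrals. The joint probability table below is a hypothetical example chosen so that Y depends on X; the helper name `h` is likewise an assumption.

```python
import math

# Hypothetical joint probabilities p(x, y); rows index x, columns index y.
joint = [
    [0.30, 0.10],
    [0.05, 0.55],
]

def h(probs):
    """-sum p log(p) over a list of probabilities (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

p_x = [sum(row) for row in joint]            # marginal p(x) = sum_y p(x, y)
h_xy = h([p for row in joint for p in row])  # H(XY) from the joint table
h_x = h(p_x)                                 # H(X) from the marginal

# H(Y|X) = sum_x p(x) * H of the conditional distribution p(y|x) = p(x, y) / p(x)
h_y_given_x = sum(px * h([p / px for p in row]) for row, px in zip(joint, p_x))

print(h_xy, h_x + h_y_given_x)   # both sides of statement 3 agree up to rounding
```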