[time 246] Discussion with Mackey


Stephen P. King (stephenk1@home.com)
Sun, 18 Apr 1999 13:58:41 -0400


>Date: Wed, 23 Sep 1998 22:22:25 -0400
>To: Michael C. Mackey <mackey@cnd.mcgill.ca>
>From: Stephen Paul King <spking1@mindspring.com>
>Subject: "Mackey's mistake"
>
>Dear Prof. Mackey,
Return-Path: <mackey@mines.cnd.mcgill.ca>
Date: Thu, 24 Sep 1998 21:55:40 -0400 (EDT)
From: Michael Mackey <mackey@cnd.mcgill.ca>
To: Stephen Paul King <spking1@mindspring.com>
Subject: Re: "Mackey's mistake"

Dear Stephen King,

Thank you for your question. This is not the first time that I have had
this directed at me, the first being last May when it was pointed out
that Mr. Hillman had posted a comment about our (Lasota and my) work
that was certainly open to interpretation. Most people interpreted the
comment to mean that we had made obvious mistakes.

I will try to reply to your question about Hillman's comments as
clearly and simply as possible, explaining how I view the situation.

As I read the comments that Hillman sent you, there seem to be two areas
of disagreement:

1. Hillman objects that we have picked a definition of entropy for the
case where the state variable (x) is continuous, namely

H(f) = -\int f(x) log f(x) dx

that is incompatible with the discrete quantity

H(p) = -sum p_j log p_j

However, if you go to Chapter 9 of our Chaos, Fractals and Noise, we
NEVER claimed that this was the case. Nor, as he asserts, did we ever
claim that our conditional entropy was "an integral analog of ... ".
Simply not true. We simply DEFINED quantities (entropy and conditional
entropy) and then set out to prove properties that they satisfied.
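
As an aside, the incompatibility itself is easy to see numerically. The
sketch below (Python with NumPy; a standard Gaussian is used as the test
density, and none of this is taken from the book) bins a density f with
bin width dx and compares -sum p_j log p_j with -\int f log f dx: the two
differ by roughly log(1/dx), so neither definition is a naive limit of
the other.

import numpy as np

def discrete_vs_differential(dx):
    # bin a standard Gaussian density on [-10, 10] with bin width dx
    x = np.arange(-10.0, 10.0, dx)
    f = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
    p = f * dx
    p = p / p.sum()                           # remove the small truncation error
    H_discrete = -np.sum(p * np.log(p))       # -sum p_j log p_j
    h_integral = -np.sum(f * np.log(f) * dx)  # Riemann sum for -\int f log f dx
    return H_discrete, h_integral

for dx in (0.1, 0.01, 0.001):
    Hd, hc = discrete_vs_differential(dx)
    print(dx, Hd - hc, np.log(1 / dx))        # the gap tracks log(1/dx)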

2. Hillman's other objection is that we have picked the "incorrect"
definitions for entropy and conditional entropy. Incorrect by whose
definition?

Our view is the following (one can take any point of view one wishes,
but usually a view is picked with some justification). The concept of
entropy was developed in PHYSICS, not mathematics. We defined the
Boltzmann-Gibbs entropy in complete analogy with the definitions of
Boltzmann and Gibbs in their attempts to find dynamical analogs of
thermodynamic behaviour. Namely, if one consults:

>L. Boltzmann, "Lectures on Gas Theory", reprinted by Dover (1995):
there one finds a quantity H defined (page 50) in terms of the single
particle density, and this is then related by Boltzmann to the entropy
in the note on page 133 by "-H = entropy". Rewriting Boltzmann's
expressions, one obtains an expression that looks like our definition of
entropy (but one must be careful about the interpretation of "f"--see
below);

>J.W. Gibbs, "Elementary Principles in Statistical Mechanics",
reprinted by Dover (1960): there you will find extensive treatment of
what he calls the "index of probability" which, in modern terms, is the
log of the density function f, i.e. "index of probability = log f".
Gibbs on page 44 says "the average index of probability with its sign
reversed corresponds to entropy". Now the average of the index of
probability with its sign reversed is simply

-\int f(x) log f(x) dx

which is precisely the definition of entropy that is used in Lasota and
Mackey, and in my 1992 book "Time's Arrow" published by Springer.
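
A quick numerical check of Gibbs's phrase (my own sketch in Python with
NumPy, not anything from Gibbs or the book): the index of probability is
log f, its average under f is estimated by sampling, and minus that
average matches -\int f log f, which for a unit Gaussian is
(1/2) log(2 pi e), about 1.419.

import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)            # samples drawn from the unit Gaussian f
log_f = -x**2 / 2 - 0.5 * np.log(2 * np.pi)   # the "index of probability" log f(x)
print(-log_f.mean())                          # average index with its sign reversed
print(0.5 * np.log(2 * np.pi * np.e))         # exact value of -\int f log f dx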

>It is important to realize when reading Boltzmann and Gibbs that they
have two very different interpretations of what "f" means--and this is
the basis of the fundamental philosophical differences between the two
approaches. E.T. Jaynes has pointed out these differences very clearly
in an article in the American Journal of Physics in the early 1960's (I
think the reference is AJP (1963), vol 31, page 66 et seq., but I don't
have the paper here at home. It is entitled "Boltzmann versus Gibbs
entropy" or something like that--if you can't find it, write and I'll
get the exact reference). There he showed that the Gibbs definition of
entropy gives correct results for a gas of interacting particles,
whereas the Boltzmann result is in error because it neglects the
interaction energies between molecules. It is only in the case of
non-interacting particles that the two definitions of entropy give
identical results.

>Numerous other more recent derivative sources in the physics
literature can be consulted that use the same sign conventions for the
entropy. It is only when one strays into the mathematics literature that
things become altered.

3. The issue of the sign convention in the definition of the
conditional entropy follows directly from the convention adopted in the
definition of entropy. If you wish to have the two agree in obvious
cases, then the conditional entropy must also carry a negative sign.
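
One such obvious case (my own illustration, not taken from the book):
with the convention

H_c(f|g) = -\int f(x) log (f(x)/g(x)) dx

and g taken to be the uniform density on [0,1], the conditional entropy
reduces to H_c(f|g) = -\int f(x) log f(x) dx = H(f), and this agreement
holds only if both quantities carry the same leading minus sign.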

This, then, I hope explains what I feel are the differences between
Hillman's outlook and ours. If any point is unclear, please feel free to
write back and I will try to clarify it. Lasota and I have taken a point
of view derived from the origins of the entropy concept--namely the
attempt, started by Helmholtz, Jeans, Boltzmann, Clausius, Gibbs and
others, to find a satisfactory dynamical foundation for the laws of
thermodynamics.

I want to thank you for writing and asking me to respond to Hillman's
comments. I personally find it disturbing to have my work labeled as
incorrect or misleading in public forums (as was the case last May) and
in private ones (as in the reply he wrote to you) without being given
the chance to respond directly. If you are in agreement, I would like to
forward a copy of this note to Hillman.

On another issue, I was intrigued by your comment at the end of your
note of yesterday evening about your work with Kitada. If, and when, you
feel like sharing it, I would be interested to know what you are doing.

Kind regards,

Michael Mackey

On Wed, 23 Sep 1998, Stephen Paul King wrote:

> Dear Prof. Mackey,
>
> What do you make of this?
> ---
> Return-Path: <hillman@math.washington.edu>
> Date: Fri, 27 Mar 1998 11:25:49 -0800 (PST)
> From: Chris Hillman <hillman@math.washington.edu>
> To: Stephen Paul King <spking1@mindspring.com>
> Subject: Re: Mackey's mistake
>
>
> On Fri, 27 Mar 1998, Stephen Paul King wrote:
>
> > Dear Chris,
> >
> > Could you elaborate on Mackey and Lasota's mistake? I have read most of
> > his papers and have talked to Mackey a little about his ideas, and I would
> > like to know where he is going wrong. I really appreciate your interest in
> > the study of entropy and your thoughtful replies to my queries on the
> > newsgroups.
>
> I like to say that "all theorems are either wrong or misunderstood".
>
> Their error is one of -interpretation-. I am never surprised when a
> mathematician makes an error of interpretation (these are often minor) but
> this one is quite serious and very elementary, which makes it all the more
> astonishing. (I'd prefer to believe it's an error rather than a deliberate
> misrepresentation; possibly one of the authors chose to misrepresent
> something without realizing the gravity of their error of omission.)
>
> We're talking about section 9.2 of Chaos, Fractals and Noise. Definition
> 9.2.1 has the wrong sign. Lasota & Mackey define "conditional entropy"
> to be an integral analog of the quantity
>
> H(p|q) = sum p_j log (q_j/p_j)
>
> Let's talk about this discrete quantity for a moment.
>
> First, it should be called something else to avoid confusion with the
> conditional entropies of Shannon. For instance, Cover & Thomas, Elements
> of Information Theory, Wiley, 1991, call it "relative entropy" and define
> it correctly, as do all the other books I've seen. Other authors call it
> "cross entropy" or "divergence" or "discrimination" or "Kullback-Liebler
> entropy".
>
> Second, it is not obvious, but if defined with the correct sign,
>
> D(p||q) = sum p_j log (p_j/q_j) = -H(p|q)
>
> is non-negative and moreover -decreases- (i.e. gets closer to zero from
> above) under various operations. See Cover & Thomas.
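>
> A quick numerical check of this (a sketch of my own in Python with NumPy,
> not anything from Cover & Thomas): for random probability vectors p and q,
> D(p||q) = sum p_j log (p_j/q_j) comes out non-negative, while the quantity
> with L&M's sign, sum p_j log (q_j/p_j), is exactly its negative.
>
> import numpy as np
>
> rng = np.random.default_rng(1)
>
> def divergence(p, q):
>     return np.sum(p * np.log(p / q))      # D(p||q), the correctly signed version
>
> for _ in range(5):
>     p = rng.random(8); p /= p.sum()       # two random probability vectors
>     q = rng.random(8); q /= q.sum()
>     D = divergence(p, q)
>     H = np.sum(p * np.log(q / p))         # the quantity with L&M's sign
>     print(D >= 0, np.isclose(H, -D))      # prints "True True" every time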
>
> Third, D(p||q) was introduced by Kullback in a context where it is clear
> that it should be a positive quantity with the opposite sign from the
> quantity defined by L&M, and -every- author since then -but- L&M define it
> with that sign (I must have looked at hundreds of books and papers which
> discuss this very interesting quantity). You can easily check this; the
> book by Kullback on statistics and information theory has just been
> reprinted and I recently saw it in a Borders Books near Washington, DC.
> The best interpretation (and I know about five) of divergence is in terms of
> the theory of types, and is explained in Cover & Thomas. This
> interpretation is very clear and obviously "correct".
>
> Fourth, notice that, with the quantity written in the form I've given, if
> you goof and write q/p when you mean p/q, you change the sign.
>
> Now, Lasota & Mackey are talking about
>
> D(f||g) = int f log (f/g)
>
> (defined with the "wrong" sign). It turns out that
>
> H(f) = -int f log f
>
> is NOT analogous to
>
> H(p) = -sum p_j log p_j
>
> but the integral divergence IS the proper analog of the discrete
> divergence. This is a long, long story I don't have time to go into now,
> but some idea might be gained from looking at the book by Guiasu,
> Information Theory and its Applications.
>
> The point is, if you change the sign, you pass from a positive quantity
> which decreases towards zero under various operations, in particular
>
> D(Pf||Pg) <= D(f||g)            (*)
>
> where P is a Perron-Frobenius operator, to a negative quantity which
> increases towards zero.
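>
> A discrete stand-in for (*) (again a sketch of my own, Python with NumPy;
> a column-stochastic matrix M plays the role of the Perron-Frobenius
> operator P): the divergence of two distributions never increases when M
> is applied, and since the L&M quantity is -D it increases towards zero.
>
> import numpy as np
>
> rng = np.random.default_rng(2)
>
> def divergence(p, q):
>     return np.sum(p * np.log(p / q))
>
> n = 6
> M = rng.random((n, n))
> M /= M.sum(axis=0, keepdims=True)     # columns sum to 1: M maps densities to densities
>
> p = rng.random(n); p /= p.sum()
> q = rng.random(n); q /= q.sum()
> for _ in range(5):
>     print(divergence(p, q))           # non-negative, non-increasing, heading to zero
>     p, q = M @ p, M @ q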
>
> Now, Lasota and Mackey say that their quantity obeys the law
>
> H(Pf||Pg) >= H(f||g)
>
> True, but it is very seriously misleading to fail to point out
> that this is a -negative- quantity increasing towards zero; this leaves
> the impression this law "justifies" the Second Law. In fact it is
> something quite different. Their sign change obscures the correct
> interpretation of divergence according to the theory of types (which makes
> the law (*) quite intuitive).
>
> There's a lot more to this (I've thought hard, though not for several
> years, about the interpretation of divergence) but for the moment this
> will have to suffice, since I'm working hard to finish my thesis by June
> :-) Time permitting, if I get an academic job, I plan to write up a more
> complete critique and mail it to them.
>
> Chris Hillman
>
> ---
>
> I am having difficulties understanding what to make of these statements. I
> am working with Hitoshi Kitada on a model of time, and am trying to see if
> our model satisfies your "exactness" criterion. Thank you for your time. :)
>
> Kind regards,
>
> Stephen Paul King


