Physics/A/Shannon entropy


Simple information entropy

2^S = \Omega \iff S = \log_2\Omega

Table 1: More than two coins. S = entropy; Ω = the number of possible messages.

  # coins (S, entropy)    # messages (Ω, Omega)
  2                       4 = 2^2 = 2·2
  3                       8 = 2^3 = 2·2·2
  4                       16 = 2^4
  5                       32 = 2^5

Fig. 1: Entropy of 2 coins.
Figure 2: Elephant with 5 bits of entropy can give 3 bits to Bird and 2 bits to Rat.
Figure 3: Rat could display the same information using a single 4-sided die.

Figure 1 illustrates the fact that flipping two coins can yield four different outcomes (represented as four pairs of flipped coins). Using (H, T) to represent (heads, tails), we can list these four outcomes:[1]

    HH, HT, TH, and TT.

We shall use the symbol Ω (Omega) to denote the number of messages that can be sent by these two coins:

    Ω = 2^2 = 4

The popularity of this figure inspired me to seek other ways to use coins to understand what is known as Shannon's entropy. Table 1 extends Figure 1 to include the relationship between entropy S and the number of messages Ω (Omega) one might generate by displaying N coins as either "heads" or "tails". But neither the table, the figure, nor even the formula,

    S = \log_2\Omega,

captures the full complexity associated with Shannon's information entropy. A hint at this complexity can be seen in the following question:

If entropy is equivalent to the number of coins used to convey information, how should one deal with a "one-sided coin"?

Such a question must be viewed outside the scope of mathematics. The fact that one rarely hears "fire!" in a crowded theater does not remove that word from the lexicon of how the audience should behave in a theater. If coins have only one face (i.e. a base-1 "alphabet"), Shannon's entropy, H, was defined so that such coins contribute nothing to the entropy.

Define S as the simple entropy and note that it is the number of "coins" used. Shannon's entropy, H, is defined so that it also depends on the probability with which a coin is used. Connect probability to frequency.

H \le S = N
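For N fair two-sided coins the simple entropy equals the coin count. The following minimal Python sketch (not part of the original text; the variable names are mine) reproduces the entries of Table 1:

```python
from math import log2

# Simple entropy of N fair two-sided coins:
# the number of possible messages is Omega = 2**N,
# and the simple entropy is S = log2(Omega) = N bits.
for n_coins in (2, 3, 4, 5):
    omega = 2 ** n_coins      # number of distinct head/tail sequences
    s = log2(omega)           # simple entropy in bits
    print(f"{n_coins} coins: Omega = {omega:2d}, S = {s:.0f} bits")
```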

Additive property I

Figure 2 illustrates how Elephant communicates with the animal kingdom every morning by displaying 5 coins, using his trunk to indicate where the sequence begins. As shown in Figure 3, Elephant asks Crow to display the first 3 coins and Rat to display the last two coins. When acting on behalf of Elephant, Crow and Rat each point to the first coin in their message, and the animal kingdom understands that Crow is to be read first. In this case Crow and Rat are not sending independent signals, but are instead following Elephant's instructions. On the other hand, if Crow and Rat act independently, Crow controls 3 bits of entropy, while Rat controls 2 bits. The relationship between acting dependently and independently can be summarized as:

Elephant's total, or net, entropy S_Tot equals the sum of the entropies controlled by Crow and Rat, but the total number of messages Ω_Tot that Elephant can send equals the product of what Crow and Rat could send if they act independently:

    S_{\rm Tot} = S_{\rm Crow} + S_{\rm Rat} \qquad\qquad \Omega_{\rm Tot} = \Omega_{\rm Crow}\,\Omega_{\rm Rat}
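A short numerical sketch of this bookkeeping (illustrative only; the variable names are mine) shows the entropies adding while the message counts multiply:

```python
from math import log2

# Crow displays 3 coins, Rat displays 2 coins, as in Figure 2.
omega_crow, omega_rat = 2 ** 3, 2 ** 2          # messages each can send
s_crow, s_rat = log2(omega_crow), log2(omega_rat)

omega_tot = omega_crow * omega_rat              # message counts multiply
s_tot = s_crow + s_rat                          # entropies add

print(f"Omega_Tot = {omega_crow} x {omega_rat} = {omega_tot}")   # 32
print(f"S_Tot = {s_crow:.0f} + {s_rat:.0f} = {s_tot:.0f} bits")  # 5
assert s_tot == log2(omega_tot)
```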

Entropy is independent of the "alphabet" or "language" used

Neither our simple entropy S nor the Shannon entropy H distinguishes between the "languages" or "alphabets" one might use when sending messages. Rat could equivalently send the messages by displaying two coins.[2] Or the messages could be expressed using binary numbers,[3] or even as four Arabic numerals using the four-sided die shown in Figure 3.

First derivation of Shannon entropy

H = \sum_{\text{all possible messages}} p_j\,\log_2\!\left(p_j^{-1}\right) = -\sum_{\text{all possible messages}} p_j\,\log_2(p_j)
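This sum is easy to evaluate numerically. The sketch below is only an illustration (the helper name shannon_entropy is mine, not from the text); messages that are never sent contribute nothing:

```python
from math import log2

def shannon_entropy(probabilities):
    """H = sum over all possible messages of p * log2(1/p).
    Messages with p = 0 are skipped; they contribute nothing."""
    return sum(p * log2(1.0 / p) for p in probabilities if p > 0)

# Four equally likely messages (two fair coins): H = 2 bits.
print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))   # 2.0
# If one message is never sent, H falls below log2(4) = 2.
print(shannon_entropy([0.5, 0.25, 0.25, 0.0]))     # 1.5
```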

Expectation value reviewed

The expectation, or "average", value of the four integers {1, 2, 2, 5} is:

E(x) = \frac{1+2+2+5}{4} = 2.5 = \frac{1}{4}\cdot 1 + \frac{1}{2}\cdot 2 + \frac{1}{4}\cdot 5

The fractions 1/4, 1/2, and 1/4 refer to the probabilities of x being 1, 2, and 5, respectively. This permits us to write the expectation value as a sum involving all probabilities:

E(x) = 1\cdot p(x{=}1) + 2\cdot p(x{=}2) + 5\cdot p(x{=}5) = \sum_x p_x\,x
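A short numerical restatement of this example (a sketch; the dictionary of probabilities is just the tally above):

```python
# Expectation value of the four integers {1, 2, 2, 5}, computed two ways.
values = [1, 2, 2, 5]
direct = sum(values) / len(values)                        # 2.5

# Weighted by probabilities: p(1) = 1/4, p(2) = 1/2, p(5) = 1/4.
probabilities = {1: 0.25, 2: 0.5, 5: 0.25}
weighted = sum(x * p for x, p in probabilities.items())   # 2.5

print(direct, weighted)   # both equal 2.5
```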

Additive property revisited

H_T = -\sum_{k=1}^{N_T} p_k\log p_k, \quad\text{where } N_T = N_1 N_2 \text{ and } p_k = p_i\,p_j,

where the logarithm's base is two: \log p_k \equiv \log_2(p_k).

\begin{align}
H_T &= -\sum_{i=1}^{N_1}\sum_{j=1}^{N_2}(p_i p_j)\log(p_i p_j)\\
 &= -\sum_{i=1}^{N_1}\sum_{j=1}^{N_2} p_i p_j\,\underbrace{(\log p_i + \log p_j)}_{\log p_i p_j}\\
 &= -\sum_{i=1}^{N_1}\sum_{j=1}^{N_2}\left(p_i p_j\log p_i + p_i p_j\log p_j\right)\\
 &= -\left[\underbrace{\left(\sum_{j=1}^{N_2} p_j\right)}_{=1}\sum_{i=1}^{N_1} p_i\log p_i + \underbrace{\left(\sum_{i=1}^{N_1} p_i\right)}_{=1}\sum_{j=1}^{N_2} p_j\log p_j\right]\\
 &= -\left[\sum_{i=1}^{N_1} p_i\log p_i + \sum_{j=1}^{N_2} p_j\log p_j\right]\\
 &= H_1 + H_2
\end{align}

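The result can be spot-checked numerically. In the sketch below (the probabilities are invented purely for illustration), the entropy of the product distribution p_i p_j equals H_1 + H_2:

```python
from math import log2

def entropy(ps):
    """Shannon entropy H = -sum p*log2(p); terms with p = 0 are skipped."""
    return -sum(p * log2(p) for p in ps if p > 0)

# Two independent senders with arbitrary illustrative probabilities.
p_crow = [0.5, 0.25, 0.125, 0.125]   # N1 = 4 messages
p_rat = [1 / 3, 2 / 3]               # N2 = 2 messages

# Joint distribution p_k = p_i * p_j over all N1*N2 message pairs.
p_joint = [pi * pj for pi in p_crow for pj in p_rat]

print(entropy(p_joint))                   # H_T
print(entropy(p_crow) + entropy(p_rat))   # H_1 + H_2, the same value
```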

Wheel of fortune

Shannon's entropy for unfair coin

S_T = \log(n_C+n_R) = H_{CR} + \underbrace{\frac{n_C}{n_C+n_R}}_{p_C}\log n_C + \underbrace{\frac{n_R}{n_C+n_R}}_{p_R}\log n_R

H_{CR} = \log(n_C+n_R) - \underbrace{\frac{n_C}{n_C+n_R}}_{p_C}\log n_C - \underbrace{\frac{n_R}{n_C+n_R}}_{p_R}\log n_R = \log(n_C+n_R) - p_C\log n_C - p_R\log n_R

Use the fact that p_C + p_R = 1 to write this as:

\begin{align}
H_{CR} &= (p_C+p_R)\log(n_C+n_R) - p_C\log n_C - p_R\log n_R\\
 &= p_C\underbrace{\left[\log(n_C+n_R) - \log n_C\right]}_{\log(1/p_C)} + p_R\underbrace{\left[\log(n_C+n_R) - \log n_R\right]}_{\log(1/p_R)}\\
 &= p_C\log(1/p_C) + p_R\log(1/p_R)
\end{align}
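A quick numerical check of the last line (a sketch; the counts n_C and n_R are placeholders, chosen so that p_C = n_C/(n_C+n_R) as in the derivation above):

```python
from math import log2

# Illustrative counts of equally likely outcomes assigned to Crow and Rat.
n_c, n_r = 2, 6
p_c = n_c / (n_c + n_r)
p_r = n_r / (n_c + n_r)

# H_CR computed two ways: from the counts, and from the final formula.
h_from_counts = log2(n_c + n_r) - p_c * log2(n_c) - p_r * log2(n_r)
h_from_probs = p_c * log2(1 / p_c) + p_r * log2(1 / p_r)

print(h_from_counts, h_from_probs)   # both ~0.8113 bits
```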

Another path to Shannon entropy

Move

Moved to Draft:Information theory/Permutations and combinations: If unlabeled: Ω = 5! = 120 outcomes. If labeled: Ω = (2+3)!/(2!·3!) = 10 outcomes.


5 choose 3 versus permutations of an ordered set of five objects.

Don't move

Employ the fact that Ω = 2^S is often a very large number, since S is often a large number. A simplified version of Stirling's formula, \ln n! \approx n\ln n - n, becomes, for logarithms to base two:

\log_2(n!) \approx n\log_2 n - \frac{n}{\ln 2}

With n = n_C + n_R,

\begin{align}
\tilde S &= \log\!\left[\frac{(n_C+n_R)!}{n_C!\,n_R!}\right]\\
 &\approx (n_C+n_R)\log(n_C+n_R) - n_C\log n_C - n_R\log n_R\\
 &= n_C\left[\log(n_C+n_R) - \log n_C\right] + n_R\left[\log(n_C+n_R) - \log n_R\right]
\end{align}

\tilde S = n_C\log\!\left(\frac{n}{n_C}\right) + n_R\log\!\left(\frac{n}{n_R}\right) \qquad\text{if } n \gg 1
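The sketch below (function names are mine) compares the exact per-coin entropy, \log_2[(n_C+n_R)!/(n_C!\,n_R!)]/(n_C+n_R), with a Stirling estimate that keeps the \tfrac12\ln(2\pi n) correction term, for the case n2 = 2·n1 tabulated below. The exact values closely track the table's "Excel" column, and both approach H(1/3) ≈ 0.9183:

```python
from math import lgamma, log, log2, pi

def exact_per_coin(n1, n2):
    """Exact entropy per coin: log2[(n1+n2)!/(n1! n2!)] / (n1+n2),
    computed with lgamma so that huge factorials never overflow."""
    log2_binom = (lgamma(n1 + n2 + 1) - lgamma(n1 + 1) - lgamma(n2 + 1)) / log(2)
    return log2_binom / (n1 + n2)

def stirling_per_coin(n1, n2):
    """Same quantity, but with ln n! replaced by the Stirling estimate
    n*ln(n) - n + 0.5*ln(2*pi*n)."""
    def log2_fact(n):
        return (n * log(n) - n + 0.5 * log(2 * pi * n)) / log(2)
    return (log2_fact(n1 + n2) - log2_fact(n1) - log2_fact(n2)) / (n1 + n2)

# The case n2 = 2*n1; the per-coin entropy approaches
# H(1/3) = (1/3)log2(3) + (2/3)log2(3/2) ~ 0.9183 as n grows.
print("limit:", round(log2(3) / 3 + 2 * log2(1.5) / 3, 4))
for n1 in (1, 2, 4, 8, 16, 512):
    n2 = 2 * n1
    print(n1 + n2, round(exact_per_coin(n1, n2), 4),
          round(stirling_per_coin(n1, n2), 4))
```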

table

Information entropy for a collection of one-sided coins used to simulate random events.
Entropy per coin (n2 = 2n1):

  n1+n2     Excel    Stirling
  3         0.5283   0.5739
  6         0.6511   0.6628
  12        0.7459   0.7489
  24        0.8120   0.8127
  48        0.8549   0.8551
  96        0.8814   0.8815
  192       0.8973   0.8973
  384       0.9065   0.9065
  768       0.9117   0.9117
  1536               0.9147
  3072               0.9163
  6144               0.9172
  12288              0.9177
  36864              0.9181
  147456             0.9182
  (limit)            0.9183

gamma from wikipedia

w:special:permalink/1040737805#Versions_suitable_for_calculators

n! = \Gamma(n+1),

\sqrt{\pi}\left(\frac{x}{e}\right)^{x}\left(8x^3+4x^2+x+\frac{1}{100}\right)^{1/6} < \Gamma(1+x) < \sqrt{\pi}\left(\frac{x}{e}\right)^{x}\left(8x^3+4x^2+x+\frac{1}{30}\right)^{1/6}.
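These calculator-friendly bounds are easy to test numerically; a minimal sketch (variable names mine):

```python
from math import e, gamma, pi, sqrt

def gamma_bounds(x):
    """Bounds on Gamma(1+x): sqrt(pi)*(x/e)**x * (8x^3+4x^2+x+c)**(1/6),
    with c = 1/100 for the lower bound and c = 1/30 for the upper bound."""
    base = sqrt(pi) * (x / e) ** x
    poly = 8 * x ** 3 + 4 * x ** 2 + x
    return base * (poly + 1 / 100) ** (1 / 6), base * (poly + 1 / 30) ** (1 / 6)

for x in (1, 5, 10.5):
    low, high = gamma_bounds(x)
    print(f"x = {x}: {low:.4f} < Gamma(1+x) = {gamma(1 + x):.4f} < {high:.4f}")
```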

See also


  1. Figure 1 is currently being used by Wikipedias in ten other languages, and a brief visit to some of them serves to remind the reader that humans communicate using a variety of alphabets: الصفحة_الرئيسية (Arabic), Česká (Czech), Dansk (Danish), English, euskarazko (Basque), ویژه:آمار (Persian), 코리안 (Korean), ਪੰਜਾਬੀ (Punjabi), језик (Serbian), Українці (Ukrainian), 中文 (Chinese)
  2. ...with the understanding that order matters: HT and TH are different messages.
  3. 00, 01, 10, 11 (with perhaps the understanding that 0 means "tails" and 1 denotes "heads")

EVERYTHING BELOW THIS IS SCRAPBOOK

Images used in this draft

Template:EquationRef: I hope I can use this here. There is no other page that might need it..

Two coin examples

For the outcomes (TT, TH, HT, HH) with probabilities (p_1, p_2, p_3, p_4), H = p_1\log_2(1/p_1) + p_2\log_2(1/p_2) + p_3\log_2(1/p_3) + p_4\log_2(1/p_4):

\begin{align}
1 &= 0 + 0 + \tfrac12\log_2 2 + \tfrac12\log_2 2\\
1.918\ldots &= \tfrac13\tfrac12\log_2 6 + \tfrac13\tfrac12\log_2 6 + \tfrac23\tfrac12\log_2 3 + \tfrac23\tfrac12\log_2 3\\
1.836\ldots &= \tfrac13\tfrac13\log_2 9 + \tfrac13\tfrac23\log_2\tfrac92 + \tfrac23\tfrac13\log_2\tfrac92 + \tfrac23\tfrac23\log_2\tfrac94
\end{align}
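These three examples can be reproduced in a few lines (a sketch; the probability lists are read off the rows above, ordered as TT, TH, HT, HH):

```python
from math import log2

def entropy(ps):
    """H = sum p*log2(1/p), skipping outcomes that never occur."""
    return sum(p * log2(1 / p) for p in ps if p > 0)

# Probabilities of (TT, TH, HT, HH) for the three examples above.
examples = [
    [0, 0, 1 / 2, 1 / 2],          # H = 1
    [1 / 6, 1 / 6, 1 / 3, 1 / 3],  # H = 1.918...
    [1 / 9, 2 / 9, 2 / 9, 4 / 9],  # H = 1.836...
]
for ps in examples:
    print(round(entropy(ps), 3))
```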

additive property of shannon entropy

\Sigma H = (H_1 + H_2) = -\sum_{\alpha=1}^{N_1} p_\alpha\log p_\alpha - \sum_{\beta=1}^{N_2} p_\beta\log p_\beta.

In our example, the crow's entropy is H1, and N1=8. The eight outcomes for the crow's three coins have probabilities:

p_\alpha \in \{p_1, p_2, p_3, p_4, p_5, p_6, p_7, p_8\} for the outcomes (000, 001, 010, 011, 100, 101, 110, 111), respectively. Similarly, the set of Rat's four possible outcomes has probabilities p_\beta \in \{p_1, p_2, p_3, p_4\}.[1]

short

\begin{align}
H_T &= -\sum_{k=1}^{N_T} p_k\log p_k, \quad\text{where } N_T = N_1 N_2 \text{ and } p_k = p_\alpha\,p_\beta.\\
H_T &= -\sum_{i=1}^{N_1}\sum_{j=1}^{N_2}(p_i p_j)\log(p_i p_j)\\
 &= -\sum_{i=1}^{N_1}\sum_{j=1}^{N_2} p_i p_j\,(\log p_i + \log p_j)\\
 &= -\sum_{i=1}^{N_1}\sum_{j=1}^{N_2}\left(p_i p_j\log p_i + p_i p_j\log p_j\right)\\
 &= -\left[\sum_{j=1}^{N_2} p_j\sum_{i=1}^{N_1} p_i\log p_i + \sum_{i=1}^{N_1} p_i\sum_{j=1}^{N_2} p_j\log p_j\right]
\end{align}

Compression image

Students can explore image compression using Microsoft Paint. To create a sequence of such images, copy an image into Microsoft Paint and successively reduce its size by 50% N times. Then return each image back to its original size. In this example a coin from the jpeg Ephesos_620-600_BC.jpg was reduced a total of 6 times.

All images from the same randomized version of file:Image entropy with Inkscape First 01.svg


blurbs

I thought of calling it "information", but the word was overly used, so I decided to call it "uncertainty". [...] Von Neumann told me, "You should call it entropy, for two reasons. In the first place your uncertainty function has been used in statistical mechanics under that name, so it already has a name. In the second place, and more important, nobody knows what entropy really is, so in a debate you will always have the advantage."



Rules are needed to keep entropy "simple"

Crow's messages are {000, 001, 010, 011, 100, 101, 110, 111}, which correspond to the integers 0-7 in base two. The Crow could send more messages by choosing whether to include leading zeros, and the sole reason for forbidding such signals is to keep the system as simple as possible. For example, we "declare" the single-sided coin to be incapable of conveying information because log(1) = 0, and not because it is impossible to convey a message by not flipping a coin. Later, this convention that silence never conveys information will lead us to an alternative path to Shannon's formula for entropy.

H is Shannon entropy: H≤S

Shannon entropy H is a generalization of the simple formula, S = log₂ Ω, that can be used to adjust for situations where some messages are either never sent, or are less likely to be sent. It involves the probabilities of the various messages being sent or not sent, and the formula can be used regardless of the mechanism by which the messages are selected.

The vagueness associated with whether these probabilities are estimated by observation of many signals, or determined by some unknown means does not prevent Shannon's entropy from being useful. For example:

  • Gadsby is a 50,000-word novel that does not contain the letter "e". Even a casual analysis leads to the conclusion that the omission of the letter is deliberate.
  • Those engaged in espionage can infer that a signal whose "alphabet" appears to be a uniformly random sequence of letters is likely to be encrypted.[2]
  • In written English, the letters "q" and "k" can represent the same sound, which suggests that our text documents could be made smaller by removing the letter "q", for example, spelling "quick" as "quik". This is not much of a concern for text documents, but images and movies can be more easily stored or transmitted if compression is used.

The entropy of Rat's message does not depend on whether the binary system of two coins is used, or whether Rat displays the information using the four-sided die shown in Figure 3.

Math

Letting H = "heads" and T = "tails", these 4 outcomes can be described in a number of different ways. Four such "outcomes" can also be counted in other ways, for example:

{HH, HT, TH, TT} ↔ {00, 01, 10, 11} ↔ {one, two, three, four}

This is the simplest example of the relationship between the

REWRITE: Figure 1 is currently being used by Wikipedias in five different languages.[3] It illustrates the fact that the value of the Shannon entropy S is the number of coins flipped, i.e. S = N, if the coins are "fair". The simplest example of an "unfair" coin would be one with either two "heads" or two "tails". Any reasonable measure of information content would ignore that coin and yield an entropy of N − 1. The scope of this informal peek into entropy is limited:

Figure 2: The elephant with 5 coins can send 2^5 = 32 different messages, which is associated with 5 bits of information (if the coins are "fair"). If three coins are given to the crow and two coins to the rat, the crow can send 2^3 = 8 different messages, while the rat can send only 2^2 = 4 different messages. Even though the rat and crow can independently send only 8 + 4 = 12 different messages, together they possess the same number of bits of information, 2 + 3 = 5, as the elephant had. This illustrates the additive nature of entropy.

Assume all outcomes are equally probable

I find Crow the more illuminating, in part due to the confusion of 4 = 2×2 versus 4 = 2+2.
It is often easier to count not coins but outcomes. For Rat this is equivalent to a four-sided coin. The "alphabet" used to send information is not important. Coins have advantages and disadvantages. DELETE

Four outcomes 001, 002, 003, 004, with the first three equally probable and the fourth occurring with probability P:

p_1 = p_2 = p_3 = \frac{1-P}{3}

p_1 + p_2 + p_3 = 1 - P

Appendix

Reminders

QB

decisions

  1. REFERENCE Wikipedia Permalink section #Characterization for two rigorous paths to Shannon's formula

Characterization

  • What is the entropy S when one of the coins is "unfair", in that the two outcomes occur with unequal probabilities?[4]
  • How can this definition be extended to include entities with more than two outcomes?[5]
  • What does it mean to say t

Expected number of bits and entropy per symbol

w:Variable-length_code#Uniquely_decodable_codes helped me with w:Shannon's source coding theorem

Number of messages

This has a trivial resolution if we note that the exponential function, y = 2^x, and the logarithmic function, x = log₂(y), are mutually inverse functions:[6]

y = 2^x \text{ if and only if } x = \log_2(y) \quad (\text{provided } y > 0)

For example, the Ω = 6 messages associated with the throw of a six-sided die have an entropy of S = log₂(6) ≈ 2.585.
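A trivial check (illustrative only):

```python
from math import log2

# A fair six-sided die can produce Omega = 6 distinct messages,
# so its entropy is S = log2(6) bits; raising 2 to S recovers Omega.
omega = 6
s = log2(omega)
print(s, 2 ** s)   # ~2.585 bits, and 2**s gives back 6.0
```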

micro/macro state grand ensemble/Gibbs entropy


Since this essay is about the entropy of information,

large numbers

w:Observable_universe#Matter_content—number_of_atoms 10^80

  1. No attempt to explain. This introduction to the formula is not rigorous; its purpose is to familiarize readers with its properties. Slowly build: begin with information as the result of a random generator flipping N coins, ignoring surprisal. N evolves into Shannon entropy.
  2. The elephant shown at right uses 5 coins to send messages. Sometimes he gives three coins to the crow and two coins to the rat so they can also send messages. The number of possible messages available to each animal depends on how many coins are used.
  1. kilobytes is a small file
    1. https://en.wikipedia.org/wiki/Byte
    2. https://en.wikipedia.org/wiki/Kilobyte

Boltzmann or Gibbs?


  1. Here ∈ denotes that p is an element of a given set. The use of α and β permits us to establish that p_α = p_4 for Crow need not be numerically equal to p_β = p_4 for Rat.
  2. When hiding information, steps are sometimes taken to ensure that the signals are not too random. Criminals selling illegal drugs might try to make the messages seem innocent by pretending to be making deliveries of pizza and beverages.
  3. 4 wikis
  4. For example, what if one coin is weighted so that p = 1/4 and q = 1 − p = 3/4?
  5. For example, a pair of six-sided dice is used to generate outcomes.
  6. An easy way to remember this is that if you see y=2x, remember that the logarithm is the exponent.