Joint and conditional probability


Suppose that the outcome can be either event A or event B (but never both). If event X happens, their probabilities are 0.4 and 0.6 respectively. If instead the mutually exclusive event Y occurs, the probabilities of A and B are distributed evenly, .5 and .5. These data can be summarized in a Markov (stochastic) matrix:

$$\begin{array}{c|cc} & X & Y \\ \hline A & P(A|X) & P(A|Y) \\ B & P(B|X) & P(B|Y) \end{array} \;=\; \begin{array}{c|cc} & X & Y \\ \hline A & .4 & .5 \\ B & .6 & .5 \end{array}$$

Here, P(A|X) stands for the probability of event A given that X has occurred; A|X generally denotes the conditional probability of event A under condition X.

Note that each column sums to 1, since its entries represent mutually exclusive and exhaustive events.
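As a sketch, the conditional table above can be written as nested dictionaries (the variable names here are mine, not part of any standard API) and the column normalization checked directly:

```python
# Conditional probability table from the text: columns are the conditions
# X and Y, rows are the outcomes A and B; each cell is P(outcome | condition).
cond = {
    "X": {"A": 0.4, "B": 0.6},
    "Y": {"A": 0.5, "B": 0.5},
}

# Each column lists mutually exclusive, exhaustive outcomes, so it sums to 1
# (up to floating-point tolerance).
for condition, column in cond.items():
    assert abs(sum(column.values()) - 1.0) < 1e-9
```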

Now, suppose that X occurs with probability .8 and Y with probability .2. Multiplying the first column by .8 and the second by .2, the total probability breaks down into the joint distribution

$$1 = 1 \cdot .8 + 1 \cdot .2 = (.4 + .6) \cdot .8 + (.5 + .5) \cdot .2 = (.32 + .48) + (.1 + .1)$$

where the first parenthesis sums the event probabilities under X, and .1 + .1 are the probabilities under event Y.
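Numerically, this breakdown is just a column-wise scaling of the conditional table by the priors. A minimal sketch (variable names are assumptions, not from the text):

```python
# Conditional table P(outcome | condition) and the condition priors, as in the text.
cond = {"X": {"A": 0.4, "B": 0.6}, "Y": {"A": 0.5, "B": 0.5}}
prior = {"X": 0.8, "Y": 0.2}

# Joint distribution: P(outcome ∩ condition) = P(condition) * P(outcome | condition).
joint = {c: {a: prior[c] * p for a, p in column.items()}
         for c, column in cond.items()}

# The whole joint table still sums to 1.
total = sum(p for column in joint.values() for p in column.values())
```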

This can again be represented by a matrix:

$$\begin{array}{c|cc} & X,\,P(X) & Y,\,P(Y) \\ \hline A & P(A \cap X) & P(A \cap Y) \\ B & P(B \cap X) & P(B \cap Y) \end{array} \;=\; \begin{array}{c|cc} & X,\,P(X) & Y,\,P(Y) \\ \hline A & P(X \cap A) & P(Y \cap A) \\ B & P(X \cap B) & P(Y \cap B) \end{array} \;=\; \begin{array}{c|cc} & X,\,.8 & Y,\,.2 \\ \hline A & .32 & .1 \\ B & .48 & .1 \end{array}$$

Note that the columns now add up to .8 and .2 correspondingly, whereas the whole table adds up to .8 + .2 = 1. We have obtained a 2-dimensional probability distribution. Every cell holds the joint probability of a pair of events occurring, e.g. P(A∩X) = .32. The probability of the conjunction A∩X is less than the probability of either component, (A|X) or X, alone: every column summed to 1 in the conditional probability table, but in the joint distribution table a column sums only to P(X) ≤ 1.

This fact, that a column sums to the marginal probability of its condition, $P(A_1 \cap X_i) + P(A_2 \cap X_i) + \cdots = P(X_i)$, that is, the probability that a randomly drawn event ends up in column $i$, enables us to recover the conditional probabilities. We just need to divide every $P(A_j \cap X_i)$ in column $i$ by $P(X_i)$:

$$\begin{bmatrix} P(A|X) \\ P(B|X) \end{bmatrix} = \begin{bmatrix} P(A \cap X) \\ P(B \cap X) \end{bmatrix} \frac{1}{P(X)} = \begin{bmatrix} .32 \\ .48 \end{bmatrix} \frac{1}{.8} = \begin{bmatrix} .4 \\ .6 \end{bmatrix}$$
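Recovering the conditionals from the joint table is the same division in code. A sketch under the section's numbers (names are mine):

```python
# Joint distribution table from the text.
joint = {"X": {"A": 0.32, "B": 0.48}, "Y": {"A": 0.1, "B": 0.1}}

# Marginal P(X_i) is the sum of column i.
marginal = {c: sum(column.values()) for c, column in joint.items()}

# Divide each cell by its column's marginal to recover P(A_j | X_i).
cond = {c: {a: p / marginal[c] for a, p in column.items()}
        for c, column in joint.items()}
```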

The relationship

$$P(A \cap X) = P(X)\,P(A|X)$$

is the basis for the famous w:Bayes' theorem, $P(X)P(A|X) = P(A)P(X|A)$, because we can symmetrically condition the probabilities within the rows by the probabilities of observing the rows:

$$\begin{bmatrix} (.32+.1)/P(A) \\ (.48+.1)/P(B) \end{bmatrix} = \begin{bmatrix} (.32+.1)/.42 \\ (.48+.1)/.58 \end{bmatrix} = \begin{bmatrix} .76 + .24 \\ .83 + .17 \end{bmatrix} = \begin{bmatrix} P(X|A)+P(Y|A) \\ P(X|B)+P(Y|B) \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

That is, the conditional probability P(X|A) ≈ .76.
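The same number drops out of Bayes' theorem computed directly. A sketch with the section's values (variable names are assumptions):

```python
# Priors and conditionals from the text.
p_x, p_y = 0.8, 0.2
p_a_given_x, p_a_given_y = 0.4, 0.5

# Marginal P(A), the row sum of the joint table: .32 + .1 = .42.
p_a = p_x * p_a_given_x + p_y * p_a_given_y

# Bayes' theorem: P(X|A) = P(X) * P(A|X) / P(A).
p_x_given_a = p_x * p_a_given_x / p_a
```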