## HCOL 195 9/9/09

We looked at the probability tree from last time. I showed how the conditional probability formula P(A,B)=P(A|B)P(B) can be read off the tree. Switching the letters gives P(B,A)=P(B|A)P(A), and since P(A,B)=P(B,A), we get Bayes’ theorem in the form P(A|B)P(B)=P(B|A)P(A). I remarked that everything we are going to study derives from this.

I identified several pieces of Bayes’ theorem, written in the usual form: P(A|B)=P(B|A)P(A)/P(B). The term P(A) is the prior probability, often called just the prior. It tells us what we think about the state of nature A before we observe the data B. The term P(A|B) is called the posterior probability, or posterior. It tells us what we think about the state of nature A after we observe the data B. The term P(B|A) is the likelihood. It tells us how much the data B supports the hypothesis that A is the true state of nature. The divisor P(B) is the probability of observing the data B, sometimes called the marginal likelihood. The quantity P(A,B) is the joint probability of A and B.

I demonstrated a fourth way to display a calculation: the table (spreadsheet) method. This is very useful when there are many alternative states of nature. Column 1 of the table lists the states of nature. Column 2 holds the prior on each state of nature (these should add to 1, although there are circumstances when we can ignore this and just use relative numbers…I’ll talk about that some other time). Column 3 holds the likelihood of the data under each state of nature. Column 4 holds the corresponding joint probabilities (just multiply column 2 term by term by column 3 and put the products in column 4). At the bottom of column 4 we put the sum of that column; this is the marginal likelihood. Finally, column 5 gets the posterior probability of each state of nature: we divide each term in column 4 by the marginal likelihood at the bottom. By construction, column 5 ought to add up to 1, and you should use that as a check to make sure you haven’t made a mistake. This method is easily automated in an Excel spreadsheet.
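As a rough sketch of how the table method could be automated outside a spreadsheet, here is a short Python version. The function name `bayes_table`, the state labels, and the priors and likelihoods are all made up for illustration; they are not numbers from class.

```python
def bayes_table(states, priors, likelihoods):
    """Build the five-column table described above.

    Returns (rows, marginal), where each row is
    (state, prior, likelihood, joint, posterior).
    """
    # Column 4: joint = prior * likelihood, term by term.
    joints = [p * l for p, l in zip(priors, likelihoods)]
    # Bottom of column 4: the marginal likelihood.
    marginal = sum(joints)
    # Column 5: posterior = joint / marginal.
    posteriors = [j / marginal for j in joints]
    rows = list(zip(states, priors, likelihoods, joints, posteriors))
    return rows, marginal


# Hypothetical example: three states of nature with assumed numbers.
states = ["A1", "A2", "A3"]
priors = [0.5, 0.3, 0.2]        # column 2, adds to 1
likelihoods = [0.1, 0.4, 0.7]   # column 3

rows, marginal = bayes_table(states, priors, likelihoods)

print(f"{'state':<6}{'prior':>8}{'like':>8}{'joint':>8}{'post':>8}")
for state, prior, like, joint, post in rows:
    print(f"{state:<6}{prior:>8.3f}{like:>8.3f}{joint:>8.3f}{post:>8.3f}")
print(f"marginal likelihood = {marginal:.3f}")

# The check from the notes: column 5 should add up to 1.
posterior_total = sum(row[4] for row in rows)
print(f"sum of posteriors   = {posterior_total:.3f}")
```

The last two lines do the same sanity check recommended above: if the posteriors do not sum to 1 (up to rounding), something went wrong in the table.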

I described the “three envelopes” problem, where there are three envelopes, each containing an unknown and different amount of money. You are to devise a strategy for maximizing the probability of getting the largest amount. You can pick any envelope at random, open it, and either keep the money or discard it and pick another envelope. Again, you can look at the money in that second envelope and either keep it or discard it. If you discard it, you get the amount in the third envelope, whatever it is. (The rules here are similar to “The Dating Game” in Flip.)

We discussed strategy in the three envelopes problem (related to the dating game). The ideal strategy gives a probability of 1/2 of getting the envelope with the largest amount. It consists of looking at the contents of the first envelope, noting the amount, and discarding it. Then we look in the second envelope. If it contains more than the first, we keep that money and stop the game. Otherwise, we discard that amount and take whatever is in the third envelope.
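One way to check the 1/2 figure is to enumerate all six orders in which the three amounts can turn up. The sketch below assumes every order is equally likely and represents the amounts only by their ranks 1, 2, 3 (3 being the largest); the function names `play` and `win_probability` are just illustrative.

```python
from fractions import Fraction
from itertools import permutations


def play(order):
    """Apply the strategy to one ordering of the envelopes.

    `order` is (first, second, third), the amounts in the order opened.
    Returns the amount we end up keeping.
    """
    first, second, third = order
    # Always discard the first envelope, but remember its amount.
    if second > first:
        return second  # second beats first: keep it and stop
    return third       # otherwise we are stuck with the third


def win_probability():
    """Exact probability that the strategy ends with the largest amount."""
    orders = list(permutations([1, 2, 3]))  # 6 equally likely orders
    wins = sum(1 for order in orders if play(order) == 3)
    return Fraction(wins, len(orders))


print(win_probability())  # prints 1/2
```

The strategy wins in three of the six orders, (1,3,2), (2,1,3), and (2,3,1), which is where the 1/2 comes from. For comparison, simply keeping the first envelope wins only when the largest amount happens to come first, which is 1/3 of the time.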

Everyone agreed that if the first envelope had \$1M, they’d take it, regardless of the strategy. That’s so much money that everyone would be happy to walk away with it.