Archive for September, 2009

HCOL 195 9/28/09

September 29, 2009

I started out by remarking on two of the journals. In one, there’s a nice illustration of how an initial decision may make it difficult to change one’s mind, even given more data that casts doubt on the initial decision. This involves the old medical aphorism, “When you hear hoofbeats, think horses, not zebras.” This is a reflection of our Bayesian prior probabilities…at least over here, if you hear hoofbeats, it’s quite likely that it’s a horse that’s producing them, not a zebra (the situation might be reversed in parts of Africa). So the prior probability of “horse” is much bigger than that of “zebra”. In this case, the student, while a teenager, came down with stomach pains. The doctor was called, and diagnosed (over the phone) a stomach virus (which was going around). A very reasonable diagnosis, given that he hadn’t seen the patient. Horses, not zebras. But the pain didn’t go away, and after a second phone call and finally an office visit, the diagnosis was the same. (Several days had passed, and viruses being self-limiting, this should have been a clue that the initial diagnosis was wrong, but the physician apparently did not think “zebras” at this point.) A day or two later, the pain was still there, and the patient noticed that the stomach was particularly sensitive to being poked in the specific place that, the patient had just learned, was where the appendix was. Another call, but it being the weekend, the patient was whisked off to the ER, where a ruptured appendix was diagnosed and surgically removed. This was a very close call! The bottom line for decision-makers is this: don’t let an initial assessment cloud your future thinking as more data comes in.

The other comment was on another journal. There’s a common mistake in probability that we see from time to time when a person wins the lottery a second time. This usually gets reported in the press and on TV as a very low probability event (i.e., p*p if p is the probability of winning the lottery once). But this is a mistake. p*p is the probability that a particular person, chosen in advance, will win the lottery twice, if he buys only one ticket in each of two separate lotteries and never buys any other tickets. But that’s not what we have here. What we want is the probability that someone, sometime, who has already won the lottery, will win it again. And that probability is just p, for any particular lottery winner, if he only buys one ticket ever again.

But the lottery is held frequently. If it is held weekly, for example, in any year there will be 52 lottery winners, and the probability that any of them will win again is 52*p (approximately). The longer the lottery is held, and the more winners there are, the more likely it is that we will have winners who have won more than once.

And that’s not all! Every time a former winner enters, there is a chance that he or she will win again. If the winner enters every week, or buys multiple tickets, the probability that the former winner will win again is multiplied by the total number of tickets bought.

So, for all of these reasons, the probability that someone will win the lottery twice is much, much larger than these sensationalistic press reports suggest.
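
To put rough numbers on this, here is a little Python sketch; the win probability p and the playing habits are made-up illustrative values, not any real lottery’s odds:

# Chance that some past winner wins again, versus the sensationalist p*p.
p = 1e-7                  # single-ticket win probability (illustrative only)
winners = 52 * 10         # ten years of weekly drawings, one winner each
tickets_each = 52 * 10    # suppose each past winner keeps buying one ticket a week

p_one = 1 - (1 - p) ** tickets_each    # a given past winner wins again
p_any = 1 - (1 - p_one) ** winners     # at least one past winner wins again
print(p_one, p_any, p * p)             # both dwarf p*p = 1e-14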

We then discussed the fish capture-release-recapture problem, with 100 tagged fish being thrown into the lake, and after a time, we catch 10 tagged and 90 untagged fish. As with last time, this gives us a rough estimate of 1000 fish in the lake, since our sampling of 10 tagged out of 100 caught tells us that approximately 10% of the fish in the lake are tagged, and we know that 100 are tagged.

But from a Bayesian point of view we want to set up a spreadsheet as we did before. So we identify the states of nature as being the different numbers of fish that there could be. We know there are at least 190 fish. We put a uniform prior on each state of nature. For simplicity, since we know that when we divide the joint probability by the marginal probability, any multiplicative constant will drop out, we simply enter 1’s in the prior column. The likelihood is a little harder. Suppose we catch first the 10 tagged and then the 90 untagged fish (we’ll discuss the problem of order in a moment). The probability of catching the first tagged fish is 100/N, if N is the number in the lake. The probability of catching the second tagged fish is 99/(N-1), of the third, 98/(N-2), and so on to the last tagged fish caught, which is 91/(N-9). The probability of catching the first untagged fish is (N-100)/(N-10), of the second (N-101)/(N-11), and so on to the last of the 90 untagged fish, (N-189)/(N-99). The probability of catching them all in this order is the product of these probabilities.

But we probably caught the fish in a different order. No worries! If you write the whole probability down, you’ll see that all this means is that you’ll switch various numbers in the numerator around, but the fraction will remain the same. Or, if you don’t know the order, you’ll multiply by the appropriate binomial coefficient (the number of ways of picking 90 untagged and 10 tagged out of N total fish), but that factor will be the same for each probability so will cancel out when we divide.
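
For anyone who wants to check the arithmetic by machine, here is a minimal Python version of this spreadsheet (capping the states of nature at 10,000 is my choice, just to keep the loop short):

# Posterior for the number of fish N, given 100 tagged fish in the lake and a
# recapture of 10 tagged + 90 untagged.  Flat prior (all 1's); the likelihood
# is the product described above, in one fixed order (order only changes a
# constant factor, which cancels when we normalize).
def likelihood(N):
    L = 1.0
    for i in range(10):                 # the 10 tagged fish caught
        L *= (100 - i) / (N - i)
    for j in range(90):                 # the 90 untagged fish caught
        L *= (N - 100 - j) / (N - 10 - j)
    return L

states = range(190, 10001)              # states of nature (capped at 10,000)
joint = [1.0 * likelihood(N) for N in states]   # prior (1) times likelihood
marginal = sum(joint)
posterior = [j / marginal for j in joint]
print(max(zip(posterior, states)))      # the posterior peaks near N = 1000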

The calculation is shown here (photo of whiteboard):

Spreadsheet for Fish

We thought about what the graph of this would look like. One student suggested that it would tail off and get smaller and smaller as the number of fish increased. Another pointed out (in response to my question) that the maximum should be around or at 1000. But that means that it should increase from 190 up to 1000, as in the picture:


Graph of posterior probability for Fish

So we can now do the same things that we did for cure rates. By adding (or computing areas under the curve) we can estimate, for example, the probability that the number of fish is between 500 and 1500, or any similar pair of numbers, or we can calculate a range for which the probability of the number of fish being in that range is 0.95 (the familiar two-standard-deviation criterion). One student asked if this is the same as what we would get by calculating with the usual formula (mean plus or minus two standard deviations). That formula is exact only for normal distributions, not for a skewed distribution like this one, but it would still be approximately right.
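
Here is one way to extract such a range from any discrete posterior like this one; the quantile-based recipe below is my own sketch, not something we wrote in class:

# A central credible interval: find the quantiles that cut (1 - level)/2
# of the probability off each tail of the posterior.
def quantile(states, posterior, q):
    cum = 0.0
    for s, p in zip(states, posterior):
        cum += p
        if cum >= q:
            return s
    return states[-1]

def central_interval(states, posterior, level=0.95):
    tail = (1 - level) / 2
    return quantile(states, posterior, tail), quantile(states, posterior, 1 - tail)

# Toy example; any posterior computed as above can be plugged in instead:
states = [1, 2, 3, 4, 5]
posterior = [0.05, 0.20, 0.50, 0.20, 0.05]
print(central_interval(states, posterior, 0.90))   # -> (1, 4)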

I then changed the subject and asked: what is the value of a human life? Yes, it’s a horrible question, but it’s one that we are forced as a society to answer, since many policy questions depend on the answer, such as whether it is cost-effective to require everyone to have seat belts, or health insurance, and so on. A number of approaches could be used: how much life insurance someone with dependents should take out, how much one might expect to earn in a lifetime, or the economic loss to society if someone dies prematurely. For such government questions, it’s probably best to err on the high side; although the class voted for 1-2, maybe 3, million dollars, government agencies generally use numbers in the 4-7 million dollar range.

We discussed situations like the Terri Schiavo case, where a young woman was in a persistent vegetative state, and the family couldn’t agree on whether to withdraw artificial life support. The case went to Congress, which passed a special law, which in turn was declared unconstitutional. Eventually, the husband prevailed and she was taken off life support, and died a few weeks later. I urged everyone to consider legal instruments that will make it clear what you want to happen should you be in such a situation. They are: a Living Will, and a Durable Power of Attorney for Health Care. The first of these states your wishes; the second allows someone you trust and choose to make decisions about health care in case you are unable to do so.

It’s hard to make up random numbers!

September 27, 2009

This is a followup on our experiment with generating heads and tails, from the first assignment. The author of the linked blog post suspects that a firm specializing in political polls is actually making its numbers up, based on statistical anomalies in the numbers they report. He mentions our experiment as an example.

Interesting article

September 27, 2009

At the beginning of the course we took a “test” that contained questions, some of which were worded differently but asked essentially the same thing, and we noted that the way a question was worded often affected the way people responded to it. This is one of the discoveries of “behavioral economics.”

The New York Times today has an article that discusses how behavioral economics might be used to increase the number of organs that are donated for transplant in patients who need them.

HCOL 195 9/25/09

September 26, 2009

We agreed upon Wednesday, October 14, as the date of the first quiz.

Reading: Continue reading Flip, Chapters 17-19 and 21. Continue reading Calculated Risks, Chapters 8-10.

We went over the homework. I had hoped to get to the capture-release-recapture method for estimating the population of fish in a lake, but we didn’t have time. So, we’ll continue that discussion on Monday. Please continue thinking about this so that we can have a good discussion.

Every group did the Super Growth Stock problem just fine, although one group wasn’t sure they’d done it right (they thought the answer should have been much larger) and mentioned this in their answer. Let me remind everyone, if you aren’t sure, or if your group can’t agree on one answer, the right thing to do is to bring the problem to my attention by writing it down.

I remarked that in general, it’s best to use decimal fractions instead of built-up fractions; they are easier to work with (especially if you use a calculator).

On the Colon Cancer problem, the best way to explain things to a patient is to use the “natural frequencies” method, as it can be explained without tables or graphs. So out of 10,000 patients, 3 in 1000, or 30, have cancer. The remaining 9970 do not. Of the 30 that have cancer, the hemoccult test will detect half, or 15. Of the 9970 that do not, the hemoccult test will give 3%, or about 299, a false positive. Therefore, only 15 out of the 314 or so that test positive actually have cancer, which is about 5%.
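
For anyone who wants to check it, the same natural-frequencies bookkeeping in a few lines of Python:

# Natural frequencies for the colon cancer / hemoccult problem.
population = 10_000
cancer = population * 3 // 1000          # 3 in 1000 -> 30
no_cancer = population - cancer          # 9970
true_pos = cancer // 2                   # the test detects half -> 15
false_pos = round(no_cancer * 0.03)      # 3% false positives -> 299
ppv = true_pos / (true_pos + false_pos)  # fraction of positives with cancer
print(true_pos, false_pos, round(ppv, 3))   # 15 299 0.048 -> about 5%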

The Cookies problem has two parts; some groups didn’t do the first part, thinking that after the first four cookies have been eaten, the probability is 50-50 that the box in hand is the chocolate-chip-only (CC) box. That’s not right, because if you’ve sampled four cookies at random, and all of them are CC, then that’s pretty strong evidence that the box contains only CC’s. We therefore need to compute the posterior probability that we have each box, given that we’ve sampled (and eaten) four CC cookies. This is a problem of sampling without replacement, since the cookies don’t go back in the box, but are eaten. So the probability of picking successively four CC cookies out of the CC box is obviously 1, and out of the other box, (5/10)×(4/9)×(3/8)×(2/7), since these are the conditional probabilities of drawing a CC cookie from a box whose CC count and total count both shrink with each draw.
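
A quick Python check of that posterior, assuming (as the problem implies) a 50:50 prior on which box is in hand:

from fractions import Fraction

# Posterior that we hold the all-CC box after drawing (and eating) 4 CC cookies.
# Box A: 10 CC cookies.  Box B: 5 CC + 5 other (sampling without replacement).
prior_A = prior_B = Fraction(1, 2)
like_A = Fraction(1)                                       # 4 CC draws are certain
like_B = Fraction(5, 10) * Fraction(4, 9) * Fraction(3, 8) * Fraction(2, 7)
joint_A, joint_B = prior_A * like_A, prior_B * like_B
marginal = joint_A + joint_B
print(joint_A / marginal)   # 42/43, far above 1/2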

I photographed the two blackboards with the calculation:

Spreadsheet calculation for posterior probability

Calculation of the probability of a CC cookie

For the Biased Presidents calculation, it’s easiest to just list the eight possibilities and their probabilities:

P D N (penny, dime, nickel)
--------

HHH 0.4×0.7×0.5=0.14
HHT 0.4×0.7×0.5=0.14
HTH 0.4×0.3×0.5=0.06
HTT 0.4×0.3×0.5=0.06

THH 0.6×0.7×0.5=0.21
THT 0.6×0.7×0.5=0.21
TTH 0.6×0.3×0.5=0.09
TTT 0.6×0.3×0.5=0.09

We verify that these add up to 1, as they should. Then: the probability of getting three heads is just the first line, or 0.14. Lines 2, 3 and 5 added together give the probability of seeing exactly two heads; that’s 0.41. For the third question, we use the definition of conditional probability: P(A|B)=P(A,B)/P(B). So if A=”see three presidents” and B=”at least one president is showing” (note that A and B together is the same as A alone), we see that P(A,B) is just 0.14 from the first line, and P(B) is the sum of the first seven lines, or 0.91, so P(A|B)=0.14/0.91≈0.15. And if C=”Lincoln is showing” (i.e., the penny shows heads), then P(A,C)=0.14 as before, and P(C)=the sum of the first four lines=0.40, so P(A|C)=0.14/0.40=0.35.
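
A few lines of Python verify all of these sums by enumerating the eight outcomes:

from itertools import product

# Penny, dime, nickel with P(heads) = 0.4, 0.7, 0.5 (as in the table above).
p_heads = [0.4, 0.7, 0.5]
outcomes = {}
for flips in product('HT', repeat=3):
    prob = 1.0
    for p, f in zip(p_heads, flips):
        prob *= p if f == 'H' else 1 - p
    outcomes[''.join(flips)] = prob

p_three = outcomes['HHH']                                              # 0.14
p_exactly_two = sum(v for k, v in outcomes.items() if k.count('H') == 2)  # 0.41
p_at_least_one = 1 - outcomes['TTT']                                   # 0.91
p_lincoln = sum(v for k, v in outcomes.items() if k[0] == 'H')         # 0.40
print(p_three, p_exactly_two, p_three / p_at_least_one, p_three / p_lincoln)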

On Independence, the goal here was to start with what’s given and use the rules of probability to derive the “then” part of the statement. So, we can prove #1 by starting with Bayes’ theorem in the form P(A|B)P(B)=P(B|A)P(A), substituting P(A|B)=P(A), and cancelling the P(A) on both sides to get P(B)=P(B|A). For #2, use the definition P(A,B)=P(A|B)P(B) and again substitute. I leave #3 to you; the proof is equally short.

On More Independence, the table I gave you has two sets of marginals: summing across we get 0.4 and 0.6; summing down we get 0.3, 0.6 and 0.1. The easiest way to check independence is to see if each entry in the table is the product of the corresponding marginals. So, for example, the upper-lefthand entry is 0.12=0.4×0.3, and similarly throughout the table. Since each joint probability is the product of the two marginals, criterion #3 from the previous problem shows that this table is independent. For any set of marginals, there is one and only one independent table of joint probabilities, but there are infinitely many dependent tables. To construct one, we added and subtracted the same small amount at the four cells where two chosen rows and two chosen columns cross, which leaves the marginals unchanged.
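
Here is a sketch of that check in Python; the table entries are reconstructed from the marginals quoted above, so the layout may differ from the handout:

# Check a joint probability table for independence: every entry must equal
# the product of its row and column marginals.
table = [[0.12, 0.24, 0.04],
         [0.18, 0.36, 0.06]]

row_marg = [sum(row) for row in table]            # 0.4, 0.6
col_marg = [sum(col) for col in zip(*table)]      # 0.3, 0.6, 0.1
independent = all(
    abs(table[i][j] - row_marg[i] * col_marg[j]) < 1e-12
    for i in range(len(table)) for j in range(len(table[0]))
)
print(row_marg, col_marg, independent)            # True for this table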

HCOL 195 9/23/09

September 23, 2009

We started with some announcements. In particular, I mentioned the William Lowell Putnam exam, which will be held in December. If you are interested in taking this exam, please see the math department (16 Colchester Ave.) immediately. The secretary in the math office can tell you the person you need to talk to. The application forms are due in California in about two weeks, so don’t delay.

I passed out the next assignment and remarked that the first two problems are different, in that the first one has you eat the chocolates as you pick them out, whereas in the second, the machine is producing an essentially infinite supply of widgets. In the third problem, the procedure is basically the same as what we did on Monday in class, except that you will use an electronic spreadsheet so that you can divide the x-axis into 100 rather than 10 divisions. The fourth problem shows that the methods we are using are applicable to many fields, in this case, literary analysis to attribute authorship. I pointed out that this problem was inspired by the problem of attributing several (about 10) of the anonymously written Federalist Papers to their actual author (Hamilton or Madison).

I had redrawn the pictures from our last class (see blog below). We attempted to use them to estimate one standard deviation. Our estimate was that 2/3 of the probability was contained between 0.20 and 0.47. For your interest, I calculated this with a spreadsheet with divisions 1/10 of what we used in class. Our estimate was spot on. We also considered the question of what the probability is that the drug is more effective than the standard drug, which we said had a cure rate of 0.2. By adding up the probability in the intervals from 0.2 to 1.0, we found that that probability adds up to about 0.84.
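
For the curious, a sketch of that finer-grid calculation in Python (100 divisions, 3 cures and 7 non-cures, flat prior):

# Posterior for the cure rate r on a 100-point grid; data: 3 cures, 7 failures.
n = 100
grid = [(i + 0.5) / n for i in range(n)]          # midpoints 0.005, 0.015, ...
joint = [r**3 * (1 - r)**7 for r in grid]         # flat prior (1) times likelihood
marginal = sum(joint)
posterior = [j / marginal for j in joint]

# Probability that the new drug beats the standard cure rate of 0.2:
p_better = sum(p for r, p in zip(grid, posterior) if r > 0.2)
print(round(p_better, 2))                         # about 0.84, as found in class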

I finished the class by posing the problem of counting the fish in a lake. One student pointed out that when you want to count deer on land, you count all the deer in some sample areas, then extrapolate to the entire area under study. Theoretically this might work, although in a lake it might be difficult to count all the fish in some volume of water. So I introduced another approach, the “capture-release-recapture” method. We catch a sample of fish (say 100), tag them all, and return them to the lake (ideally we will sample the whole lake to prevent oversampling in an atypical region). We allow some time to pass, then go back and capture another sample (we said 100), and count the tagged fish (10). A rough estimate of the number of fish was correctly stated by a student to be about 1000 fish, based on the idea that we must have tagged about 10% of the fish if we had 10 tagged amongst the 100, so if there are 100 tagged fish in the lake there should be about 1000 fish in the lake.

We noted that there have to be at least 190 fish in the lake, since we tagged 100 and counted 90 untagged in the second catch. In setting up a spreadsheet calculation for a Bayesian calculation here, we decided that the states of nature (SON) should go from 191 to a million; actually I realized that it should start at 190. An uninformative prior would put the same on each SON, so we can just write 1 for each, since we know that we don’t have to make the prior add up to 1. We’ll look at this more on Friday. I asked you to think about how we should determine the likelihood, defined as the probability of obtaining the data we did, given each SON. The data are: We tagged 100 fish, and on the recapture phase caught 100 fish, 10 of which were tagged.

HCOL 195 9/21/09

September 22, 2009

This morning we went over something I’d missed in last week’s homework. I had neglected to answer the “three children, one king” question. There’s only one way (BBB) for the king to have two brothers, and 7 ways to get a king, so that’s 1/7. There are three ways to have one brother (BBG, BGB, GBB), so that’s 3/7. And the remaining three possibilities have no brothers for the king, so that’s also 3/7.

I then brought up the last Monty Hall problem, “Ignorant Monty”. Here, Monty doesn’t know where the prize is, so he may open a door with the prize behind it. But in this case you chose door #1, Monty randomly opened door #2, and you saw a goat. Does it matter if you switch or not? The class was divided, some thinking it would matter (improve your odds) and some not. So we did a spreadsheet calculation. Since we know how to do this, the key understanding is the likelihood: if the prize is behind door #1 (your chosen door), then whichever door Monty opens, you’ll see a goat. But there’s a 50% chance he’ll open door #2, so the probability of observing that data, given that the prize is behind door #1, is 1/2. Now, if the prize is behind door #2, then you’ll see the prize if he opens #2, so the probability that you’ll see a goat is 0. Finally, if the prize is behind door #3, you’ll certainly see a goat if he opens door #2, and he does this half the time, so the probability of observing the data we observed, given that the prize is behind door #3, is also 1/2. So the likelihood column is (1/2, 0, 1/2), and the posterior probabilities that the prize is behind each door are (1/2, 0, 1/2). So it doesn’t matter if you switch.
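
The same spreadsheet in a few lines of Python:

# Ignorant Monty: you pick door 1, Monty opens door 2 at random
# (from doors 2 and 3), and a goat appears.
states = [1, 2, 3]                 # door hiding the prize
prior = [1/3, 1/3, 1/3]
likelihood = [1/2, 0, 1/2]         # P(Monty opens #2 and shows a goat | state)
joint = [p * L for p, L in zip(prior, likelihood)]
marginal = sum(joint)
posterior = [j / marginal for j in joint]
print(posterior)                   # [0.5, 0.0, 0.5] -- switching doesn't help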

We then talked about a very common problem: Estimating a rate from observations (e.g., what proportion of voters will vote for a particular candidate, the cure rate of a new drug, etc.) After some discussion, we decided that the cure rate of a new drug would be represented by a number between 0 and 1, i.e., 0≤r≤1. There are infinitely many possibilities. But we can’t draw a spreadsheet with infinitely many possibilities, so we settled on 10. I pointed out that you can get better resolution with more points, so for example if you used 100 possibilities, you’d probably be able to get adequate accuracy with up to 10,000 observed subjects, the square of 100. I also pointed out that in modern Bayesian research, we usually use approximate methods to get our answers (I’ll give more information on this at a later time).

So we used the 10 points (0.05, 0.15, 0.25, …, 0.95) as the values of r at which we’d evaluate our spreadsheet. I listed them on the blackboard, and we assigned (for the moment) values of 1/10 all the way down for the prior. We then thought about the likelihood: the probability of observing m cures and n non-cures in a population of (m+n) patients. We decided on 3 cures and 7 not cured. We saw that because the observations are independent, the probability of observing a particular sequence of cured and not-cured patients, for each given rate, is r^3(1-r)^7. Each cured patient contributes a factor of r, and each patient not cured contributes a factor of (1-r).

I pointed out that if we multiply the prior column by any factor (say 10), that will multiply the joint column also by the same factor. Also, the marginal at the bottom of that column would get multiplied by the same factor, so that when you divide the joint column by the marginal, the factor will cancel out in the posterior column. So, we replaced the 1/10’s in the prior column by 1’s, which made the calculation of the joint equal to the likelihood. The resulting calculation is shown in the picture I took of the whiteboard before I erased it:
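
The same table in a few lines of Python, for checking against the photo:

# The 10-point spreadsheet from class: prior 1's, likelihood r^3 (1-r)^7.
rates = [0.05 + 0.1 * k for k in range(10)]        # 0.05, 0.15, ..., 0.95
joint = [r**3 * (1 - r)**7 for r in rates]         # prior (1) times likelihood
marginal = sum(joint)
for r, j in zip(rates, joint):
    print(f"r = {r:.2f}   joint = {j:.6f}   posterior = {j / marginal:.3f}")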

Cure Rate Calculation

I then drew a plot of this on the right side and “connected the dots” with a smooth curve, which in fact represents what we would have gotten had we used many more values of r in the spreadsheet. I then put boxes over each data point and pointed out that the area of the boxes is equivalent to integrating the curve approximately, i.e., it’s an approximation to the area under the curve. (I’ll have more to say on this tomorrow).

Statisticians win $1,000,000 from Netflix

September 21, 2009

Here is an interesting article on how a team of statisticians, computer scientists and others today were awarded a $1,000,000 check for developing a program to better predict users’ preferences and suggest movies to them.

And here’s a followup article.

HCOL 195 9/18/09

September 19, 2009

The first thing I did today was to correct an error in the spreadsheet for the “Mixture Monty” problem. The SON H,D1 has likelihood 1/2, because if the contestant picks door #1 then Monty From Hell has two doors that he can open to show a goat, equal probabilities for each. We completed the spreadsheet and found that with Mixture Monty, contrary to our expectations, it is advantageous to switch, because the posterior probability is 2/3 that the door you didn’t pick has the prize, and only 1/3 that the door you picked has the prize.

I also did a simplified version of the N doors problem, where you pick a door and Monty opens all but 1 of the remaining doors, revealing all goats (he can do this because he knows where the prize is). Suppose you pick Door #1 and Monty leaves door k closed. If the prize is behind Door #1, Monty can choose any one of the N-1 remaining doors to keep shut, since all of them have goats; so the probability that he keeps door k shut, given that the prize is behind your chosen door, is 1/(N-1). If the prize is behind some other door you didn’t choose, say Door #2 with k≠2, then he must keep Door #2 closed, so the probability that he leaves door k closed is 0, and likewise for every door other than door k itself. In other words, if you did not choose the right door at the outset, Monty is forced to keep the door that has the prize closed and open all the rest: the probability that he keeps door k closed, given that the prize is behind door k, is 1, and all the other likelihoods (except for the door you chose) are 0. So the problem is actually quite easy from this point on, as almost all the numbers we calculate are 0. We calculated the marginal probability as 1/(N-1), the posterior probability that our door is the right one as 1/N, and the posterior probability that door #k has the prize as (N-1)/N.
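
Here is the same calculation in Python, done with exact fractions:

from fractions import Fraction

# The simplified N-door problem: you pick door 1, Monty opens every other
# door except door k, showing all goats.
def n_door_posteriors(N):
    prior = Fraction(1, N)
    like_yours = Fraction(1, N - 1)   # prize behind door 1: door k spared by chance
    like_k = Fraction(1)              # prize behind door k: Monty must spare it
    marginal = prior * like_yours + prior * like_k   # all other likelihoods are 0
    return (prior * like_yours / marginal,           # your door
            prior * like_k / marginal)               # door k

print(n_door_posteriors(3))          # (1/3, 2/3): the classic Monty Hall answer
print(n_door_posteriors(1000000))    # (1/1000000, 999999/1000000)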

Then we looked at the problems. Most guesses were that the probability was 1/2 that the sibling is a boy. But the simulation showed that among the sets of tosses with at least one head, the other toss was a tail (so, a girl) about twice as often as it was a head. One group did not keep the two tosses in each set together and so did not get reasonable results. The intent was for you to write down something like: first pair of tosses (H,T), second pair (T,T), third (H,T), fourth (T,H), etc. There were several trees drawn. One of them had (K,Q) at the base of the probability tree. That’s not right, since you don’t know if there will even be a king or a queen until both coins are tossed. The random events are the coin tosses, so the tree should have at its base the outcome of the first coin toss (H1,T1) with probabilities P(H1)=1/2 and P(T1)=1/2. Then each of these branches has a pair of branches representing the second toss (H2,T2), with the conditional probabilities P(H2|H1)=P(T2|H1)=1/2, and similarly for the T1 branch. I remarked that since P(H2|H1)=P(H2), for example, the second toss is actually independent of the first. This is our first example of independence, which will be very important.

Finally I noted that the four outcomes are HH, HT, TH and TT (I displayed these in a table), and by counting, we see that there are twice as many cases that have at least one H and have a T, than cases where both are H.
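
Here is a quick Python version of the simulation, with far more tosses than we could do by hand:

import random

# Simulate pairs of tosses; among pairs with at least one head ("a king"),
# how often is the other toss a tail ("a girl")?
random.seed(1)
trials = 100_000
at_least_one_head = other_is_tail = 0
for _ in range(trials):
    pair = (random.choice('HT'), random.choice('HT'))
    if 'H' in pair:
        at_least_one_head += 1
        if 'T' in pair:
            other_is_tail += 1
print(other_is_tail / at_least_one_head)   # close to 2/3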

On the dice problem, the easiest way is to draw a table with 36 boxes, in a 6×6 arrangement. Label the top and left with the number of spots showing (1-6) and then put the sum of the number at the top and the number on the left in the corresponding box. I noted that the number of boxes containing a ‘7’ is greater than for any other number, which is why ‘7’ is an important point in craps. We can just circle and count. We see five cases where there is an ‘8’ and circle them. Since there are 36 possibilities, the probability of getting an ‘8’ is 5/36. We then circled the cases where one of the dice is a ‘5’. This is the row labeled ‘5’ and the column labeled ‘5’, a total of 11 cases. Of these, two have a sum of eight, so the probability of rolling an eight, given that one of the dice is ‘5’, is 2/11. On the other hand, if we look at the five cases where the total is eight, two of them have a ‘5’, so the probability of seeing a ‘5’, given that you have rolled eight, is 2/5.
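
Enumerating the 36 boxes in Python confirms these counts:

from itertools import product

# All 36 equally likely rolls of two dice.
rolls = list(product(range(1, 7), repeat=2))
eights = [r for r in rolls if sum(r) == 8]
fives = [r for r in rolls if 5 in r]

print(len(eights), len(fives))                              # 5 and 11
print(len([r for r in fives if sum(r) == 8]) / len(fives))  # 2/11
print(len([r for r in eights if 5 in r]) / len(eights))     # 2/5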

On the funny dice problem, all the probabilities are 2/3: P(B beats A)=2/3, P(C beats B)=2/3, P(D beats C)=2/3, P(A beats D)=2/3. There was a disagreement on this in one group, which wisely wrote down that fact and gave both methods of getting the result.

Finally, the answers to the last question are 1/8, 3/8, 1/7, 1/4.

HCOL 195 9/16/09

September 17, 2009

I showed everyone my two-headed and two-tailed coins and remarked that they might be a better way to present the RR, RB and BB card experiment.

I asked how people had solved the Monty Hall problem; two people drew trees. I then drew my probability tree, a “natural frequencies” tree (the portion that we needed), and displayed a spreadsheet calculation.

I then attempted, on the spur of the moment, to indicate the spreadsheet calculation for a problem where there are 1,000,000 doors and Monty opens all but two (again, he knows where the prize is, and does not open that door nor the door you chose), revealing 999,998 goats. Unfortunately, since I hadn’t thought of this before and my calculation was spur-of-the-moment, I wasn’t able to complete it. After class I looked at it again and it’s pretty simple; I’ll display it on Friday.

I asked about “Angelic Monty” and “Monty from Hell.” In Angelic Monty, if you pick the right door, Monty opens the door and shows you that you won. If you pick the wrong door, he opens a door with a goat and encourages you to switch (and if you do, you’ll get the prize). Everyone agreed it’s best to switch if you know that Angelic Monty is in charge. On the other hand, Monty from Hell will open the door you picked if you guessed wrong and show you a goat; but if you guessed right, he will open another door with a goat and offer you the chance to switch. In this case, everyone agreed that if you know you have Monty from Hell, you should not switch.

I then mentioned “Mixture Monty.” In Mixture Monty, backstage before the show Monty flips a fair coin. If it’s heads, he behaves like Angelic Monty. If it’s tails, he behaves like Monty from Hell. The question is, if you face Mixture Monty, should you switch, or doesn’t it make any difference? Everyone agreed that it’s 50:50 that the prize is behind your door, so it doesn’t make any difference. I started a spreadsheet with six states of nature, 3 doors for the prize if it’s Angelic Monty, and 3 if it’s Monty from Hell. We put a prior in, and wrote down the likelihood column. However, we were rushed for time, and there was a mistake in that column (and we didn’t finish the calculation). After class, a student found the error. We’ll pick up at that point on Friday.

HCOL 195 9/14/09

September 14, 2009

Several housekeeping items:

1) I received word from the bookstore today that the paperback edition of “Predictably Irrational” won’t come out until next year. Since I am unwilling to make you pay for hardback prices, we will just not read that book, and I will introduce topics from it in my lectures alone.

2) New reading. You should have read Chapters 1-5 of “Why Flip a Coin” by now, along with Part I of “Calculated Risks.” Please continue your reading by reading Chapters 6-9 of “Why Flip a Coin,” and Chapters 5-6 of “Calculated Risks.”

Summary of today’s class:

I proposed a problem with three cards. One has two red sides, one has a red and a blue side, one has two blue sides. You pick a card at random with one “up” side that is random. Suppose the “up” side that you see is red. What is the probability that the other side is red?

Everyone guessed that it would be 50%. But that turns out to be wrong. We drew a probability tree, and calculated that P(RR)=1/3, P(RB)=1/3, P(BB)=1/3. Also, P(see R|RR)=1 and P(see R|RB)=1/2. So by multiplying (conditional probability formula), we get P(see R, RR)=1/3, P(see R,RB)=1/6. (The other cases are when we counterfactually see a blue side, and you can always ignore cases that were not observed.) The total probability that you see red is the sum of these: P(see R)=P(see R,RR)+P(see R,RB)=1/3+1/6=1/2. That makes sense, because the problem is completely symmetrical between seeing red and seeing blue. But then the conditional probability formula says that P(RR|see R)=(1/3)/(1/2)=2/3, so the probability that the other side is red is 2/3.

But there was still a concern that there were two cards, one with red on the other side and the other with blue. So, it seems that the probability is 50% that the other side is red.

So I imagined a similar coin experiment, with a HH coin, a TT coin and an HT coin. If we chose a coin at random and tossed it, and did this 30 times, we’d expect to get the HH coin 10 times and see 10 heads; we’d expect to get the HT coin 10 times and see about 5 heads; and we’d see no heads in the 10 times we’d expect to get the TT coin. So of the roughly 15 times we see a head, 10 of them come from the HH coin, where the other side is also a head, and only 5 from the HT coin, where it is a tail. So in 2/3 of the cases where we see a head, the other side of the coin is a head.

We then did it with a spreadsheet-style calculation. Column 1 is the states of nature: RR, RB, BB. Column 2 has the prior probabilities of the three states (in this case the probability that we picked that card): 1/3, 1/3, 1/3. Column 3 has the likelihoods (the probability of observing R, given that the particular state of nature in Column 1 is the case): 1, 1/2, 0. By multiplying Column 2 by Column 3 we get the joint probabilities in Column 4, the probabilities of observing both R and a particular state of nature: 1/3, 1/6, 0. We then add the joint probabilities to get the marginal probability, the probability of observing R: 1/3+1/6=1/2. This gets divided into each line in Column 4 to get Column 5, the posterior probability of each state of nature, given that we observed R: 2/3, 1/3, 0.
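
The whole spreadsheet recipe fits in one small function; here it is in Python, applied to the card problem (a sketch of mine, since in class we did it by hand):

from fractions import Fraction

# Columns 2-5 of the spreadsheet: prior -> joint -> marginal -> posterior.
def spreadsheet(priors, likelihoods):
    joint = [p * L for p, L in zip(priors, likelihoods)]
    marginal = sum(joint)
    return [j / marginal for j in joint]

# The three-card problem: states RR, RB, BB; we observed a red face.
priors = [Fraction(1, 3)] * 3
likelihoods = [Fraction(1), Fraction(1, 2), Fraction(0)]
print(spreadsheet(priors, likelihoods))   # [2/3, 1/3, 0]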

I mentioned that all of these methods give the right answer, and in response to a question, on a test or on homework, everyone should explain in detail how they got their result, no matter what method is used. (Aside: If you do it two ways, e.g., Natural Frequencies as well as Spreadsheet, and get the same answer, you can be more confident that you’ve done it right).

I used a table format to put down a joint probability table for P(state of nature, color seen). I showed how we can sum down the table to get the probability that we observe a given color, regardless of the state of nature. Since we write these numbers in the bottom margin of the table, we call these “marginal probabilities.” Similarly, if we sum across the table, we get the probabilities of the states of nature, regardless of the color seen. These are also marginal probabilities.

I then described the “Monty Hall Problem,” based loosely on the former TV game, “Let’s Make a Deal.” In this game, the contestant is faced with three doors, behind which are various prizes. One of them is a great prize, the others not so great. (Traditionally, a car and two goats). You pick a door, and Monty may (in the real show) give you the prize, or may offer you a chance to switch, or may open a door containing a lesser prize and offer you a chance to switch.

The “Problem” has different rules from the show. In this problem, Monty knows where the prize is, and after you choose a door, always opens another door with a goat, and always offers you a chance to switch. The question is, is it advantageous to switch, if your aim is to get the great prize (“car”)?

Most in the class thought it would be advantageous to switch, but the reasoning seemed not quite right. Some thought that Monty’s opening the door changed the probability that the prize is behind the door you chose from 1/3 to 1/2, with 1/2 as the probability that the prize is behind the other door. But if that were the case, it wouldn’t matter whether you switch or not: the probability of getting the car would be 1/2 either way, so I was a bit puzzled why switching was being recommended. Perhaps the thought was that the probability of the prize being behind the other door rose to 1/2, which is more than the 1/3 for your own door. But the probabilities have to add up to 1, and 1/3+1/2 doesn’t.

So I went to the million door problem. There are a million doors, one of which has a car, and the rest booby prizes. You choose a door. Everyone agrees that the probability that you got the right door is 1 in a million. I then (knowing where the prize is) open 999,998 doors, none of which has the car. Since I always open a door that I know doesn’t have the car, does this change the probability that you initially chose the door with the car? Evidently not. So, as I open door after door, the probability that one of the other doors has the car also remains at 999,999/1,000,000. All I am doing is eliminating doors that don’t have the prize, but that doesn’t affect the probability that you guessed right (or wrong) initially. So after I open all the doors, there is still a chance of 1 in a million that the door you chose has the prize, and a chance of 999,999 in a million that the other door has the prize.

I left you with the problem of showing (in the original Monty Hall Problem) that the probability that the prize is behind the door you didn’t choose is 2/3 after Monty shows you the goat, using one of the methods we have discussed (natural frequencies, probability tree, spreadsheet calculation). We’ll discuss your solutions on Wednesday.