HCOL 195 9/25/09

We agreed upon Wednesday, October 14, as the date of the first quiz.

We went over the homework. I had hoped to get to the capture-release-recapture method for estimating the population of fish in a lake, but we didn’t have time. So, we’ll continue that discussion on Monday. Please continue thinking about this so that we can have a good discussion.

Every group did the Super Growth Stock problem just fine, although one group wasn’t sure they’d done it right (they thought the answer should have been much larger) and mentioned this in their answer. Let me remind everyone: if you aren’t sure, or if your group can’t agree on one answer, the right thing to do is to bring the problem to my attention by writing it down.

I remarked that in general, it’s best to use decimal fractions instead of built-up fractions; they are easier to work with (especially if you use a calculator).

On the Colon Cancer problem, the best way to explain things to a patient is to use the “natural frequencies” method, as it can be explained without tables or graphs. So out of 10,000 patients, 3 in 1000, or 30, have cancer. The remaining 9970 do not. Of those that have cancer, the hemoccult test will detect half, or 15. Of the 9970 that do not, the hemoccult test will give 3%, or 299, a false positive. Therefore, only 15 out of the 314 that test positive (15 true positives plus 299 false positives) have cancer, a little under 5%.
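The natural-frequencies bookkeeping can be checked with a few lines of Python. This is only a sketch using the figures quoted above (10,000 patients, 3 in 1000 prevalence, half of cancers detected, 3% false positives); the variable names are my own.

```python
# Natural-frequencies calculation for the Colon Cancer problem.
population = 10_000
prevalence = 3 / 1000         # 3 in 1000 patients have cancer
detection_rate = 0.5          # the test detects half of the cancers
false_positive_rate = 0.03    # 3% of cancer-free patients test positive

with_cancer = round(population * prevalence)                    # 30
without_cancer = population - with_cancer                       # 9970
true_positives = round(with_cancer * detection_rate)            # 15
false_positives = round(without_cancer * false_positive_rate)   # 299

all_positives = true_positives + false_positives
p_cancer_given_positive = true_positives / all_positives

print(f"{true_positives} of {all_positives} positives have cancer "
      f"({p_cancer_given_positive:.1%})")
```

Rounding to whole patients is deliberate: the point of the method is that you can talk about people rather than fractions.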

The Cookies problem has two parts; some groups didn’t do the first part, thinking that after the first four cookies have been eaten the probability is still 50-50 that the box in hand is the chocolate chip (CC) only box. That’s not right, because if you’ve sampled four cookies at random, and all of them are CC, then that’s pretty strong evidence that the box contains only CC’s. We therefore need to compute the posterior probability that we have each box, given that we’ve sampled (and eaten) four CC cookies. This is a problem of sampling without replacement, since the cookies don’t go back in the box, but are eaten. So the probability of successively picking four CC cookies out of the CC box is obviously 1, and out of the other box it is (5/10)×(4/9)×(3/8)×(2/7)=1/42, since these are the conditional probabilities of sampling a CC cookie from a box that contains successively fewer CC cookies and fewer cookies in total.
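Here is a short sketch of the first part of the calculation, using exact fractions. It assumes a 50-50 prior on which box is in hand, and that the other box starts with 5 CC cookies out of 10 (as the factors 5/10, 4/9, 3/8, 2/7 above imply). The variable names are mine.

```python
from fractions import Fraction

# Assumed prior: each box is equally likely to be the one in hand.
prior_cc_box = Fraction(1, 2)
prior_mixed_box = Fraction(1, 2)

# Likelihood of drawing four CC cookies in a row from the CC-only box.
likelihood_cc_box = Fraction(1)

# Likelihood from the mixed box, sampling without replacement:
# (5/10) x (4/9) x (3/8) x (2/7)
likelihood_mixed_box = Fraction(1)
cc_left, total_left = 5, 10
for _ in range(4):
    likelihood_mixed_box *= Fraction(cc_left, total_left)
    cc_left -= 1       # one fewer CC cookie in the box
    total_left -= 1    # one fewer cookie overall

# Bayes' theorem: posterior = prior x likelihood / total probability of the data.
evidence = (prior_cc_box * likelihood_cc_box
            + prior_mixed_box * likelihood_mixed_box)
posterior_cc_box = prior_cc_box * likelihood_cc_box / evidence

print(likelihood_mixed_box, posterior_cc_box, float(posterior_cc_box))
```

Under these assumptions the posterior probability of holding the CC-only box is 42/43, or about 0.98, which is a long way from 50-50.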

I photographed the two blackboards with the calculation:

[Photo: Calculation of the probability of a CC cookie]

For the Biased Presidents calculation, it’s easiest to just list the eight possibilities and their probabilities:

PDN  Probability
-------------------------

HHH 0.4×0.7×0.5=0.14
HHT 0.4×0.7×0.5=0.14
HTH 0.4×0.3×0.5=0.06
HTT 0.4×0.3×0.5=0.06

THH 0.6×0.7×0.5=0.21
THT 0.6×0.7×0.5=0.21
TTH 0.6×0.3×0.5=0.09
TTT 0.6×0.3×0.5=0.09

We verify that these add up to 1, as they should. Then: The probability of getting three heads is just the first line, or 0.14. Lines 2, 3 and 5 added together give the probability of seeing exactly two heads. That’s 0.41. For the third question, we use the definition of conditional probability: P(A|B)=P(A,B)/P(B). So if A=“see three presidents” and B=“at least one president is showing”, then the joint event “see three presidents and at least one president is showing” is the same as “see three presidents”. We see that P(A,B) is just 0.14 from the first line, and P(B) is the sum of the first seven lines, or 0.91, so P(A|B)=0.14/0.91=0.15. And if C=“Lincoln is showing”, then P(A,C)=0.14 as before, and P(C)=the sum of the first four lines=0.40, so P(A|C)=0.14/0.40=0.35.
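The same enumeration can be done mechanically. The sketch below takes the heads probabilities 0.4, 0.7 and 0.5 for the three coins from the table, with the coins labelled P, D, N as in its header and Lincoln on the first coin. “Heads” means the president is showing. The variable names are my own.

```python
from itertools import product

# Probability that each coin shows its president (heads), read off the table.
p_heads = {"P": 0.4, "D": 0.7, "N": 0.5}

# Build all eight outcomes, e.g. "HTH", with their probabilities.
outcomes = {}
for faces in product("HT", repeat=3):
    prob = 1.0
    for coin, face in zip("PDN", faces):
        prob *= p_heads[coin] if face == "H" else 1 - p_heads[coin]
    outcomes["".join(faces)] = prob

total = sum(outcomes.values())                      # should be 1
p_three = outcomes["HHH"]                           # three presidents
p_exactly_two = sum(p for o, p in outcomes.items() if o.count("H") == 2)
p_at_least_one = sum(p for o, p in outcomes.items() if "H" in o)
p_lincoln = sum(p for o, p in outcomes.items() if o[0] == "H")

# Conditional probabilities via P(A|B) = P(A,B) / P(B).
p_three_given_at_least_one = p_three / p_at_least_one
p_three_given_lincoln = p_three / p_lincoln

print(round(p_three, 2), round(p_exactly_two, 2))
print(round(p_three_given_at_least_one, 2), round(p_three_given_lincoln, 2))
```

Notice how much more informative “Lincoln is showing” is than “at least one president is showing”: the first rules out four of the eight lines, the second only one.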

On Independence, the goal here was to start with what’s given and use the rules of probability to derive the “then” part of the statement. So, we can prove #1 by starting with Bayes’ theorem in the form P(A|B)P(B)=P(B|A)P(A), substituting P(A|B)=P(A) into it, and cancelling the P(A) on both sides to get P(B)=P(B|A). For #2, use the definition P(A,B)=P(A|B)P(B) and again substitute. I leave it to you to do #3; the proof is equally short.
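Written out, the first two proofs look roughly like this. The statements of #1 and #2 are my reading of the steps described above (both take P(A|B)=P(A) as given), so check them against the problem sheet. The cancellation in #1 assumes P(A) is not zero.

```latex
% #1: given P(A|B) = P(A), show P(B|A) = P(B)
P(A \mid B)\,P(B) = P(B \mid A)\,P(A)
\;\Longrightarrow\;
P(A)\,P(B) = P(B \mid A)\,P(A)
\;\Longrightarrow\;
P(B) = P(B \mid A) \qquad (P(A) > 0)

% #2: given P(A|B) = P(A), substitute into the definition of the joint probability
P(A, B) = P(A \mid B)\,P(B) = P(A)\,P(B)
```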

On More Independence, the table I gave you has two sets of marginals: Summing across we get 0.4, 0.6. Summing down we get 0.3, 0.6 and 0.1. But now the easiest way to check independence is to see if each entry in the table is the product of the corresponding marginals. So, for example, the upper-lefthand entry is 0.12=0.4×0.3, and similarly throughout the table. Since each joint probability is the product of the two marginals, criterion #3 from the previous problem shows that the two variables in this table are independent. For any set of marginals, there is one and only one independent table of joint probabilities, but there are infinitely many dependent tables. To construct a dependent table, we added and subtracted 1 from two columns and two rows, leaving the marginals unchanged.
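The product-of-marginals check is easy to automate. In this sketch the independent table is rebuilt from the marginals quoted above (0.4, 0.6 across and 0.3, 0.6, 0.1 down). The shift `delta = 0.01` used to make a dependent table is a hypothetical value chosen for illustration, not the one used in class, and the function names are mine.

```python
row_marginals = [0.4, 0.6]
col_marginals = [0.3, 0.6, 0.1]

# The unique independent table: each entry is the product of its marginals.
independent = [[r * c for c in col_marginals] for r in row_marginals]

def marginals(table):
    """Return (row sums, column sums) of a table of joint probabilities."""
    rows = [sum(row) for row in table]
    cols = [sum(col) for col in zip(*table)]
    return rows, cols

def is_independent(table, tol=1e-9):
    """True if every entry equals the product of its row and column marginals."""
    rows, cols = marginals(table)
    return all(abs(table[i][j] - rows[i] * cols[j]) < tol
               for i in range(len(rows)) for j in range(len(cols)))

# Make a dependent table with the same marginals: add and subtract the same
# amount in a 2x2 block so every row sum and column sum stays the same.
delta = 0.01  # hypothetical shift, for illustration only
dependent = [row[:] for row in independent]
dependent[0][0] += delta
dependent[0][1] -= delta
dependent[1][0] -= delta
dependent[1][1] += delta

print(is_independent(independent), is_independent(dependent))
print(marginals(dependent))
```

Any small enough `delta` gives a different dependent table with the same marginals, which is one way to see that there are infinitely many of them.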