HCOL 196, February 7, 2011

We finished the drug cure rate spreadsheet from last time. We determined that the probability that an individual would be cured is r, whereas the probability that an individual would not be cured is (1-r). There is one term in the product for each individual in the trial, so the likelihood (for each SON r) is $r^3(1-r)^7$, since three were cured and seven were not. We calculated these in the spreadsheet (but stopped calculating for larger values of r, since the likelihood became very small and the resulting error is negligible). We then computed the joint and the marginal (though the marginal was a bit too big, as we learned later when one student pointed out that the posterior wasn’t adding to 1). A corrected spreadsheet from my computer is shown later.
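For anyone who wants to redo the spreadsheet columns outside Excel, here is a minimal Python sketch of the same calculation. Two things in it are assumptions rather than a record of what was on the whiteboard: the grid of r values (0.0, 0.1, ..., 1.0) and the flat prior that gives every grid value the same probability. The last line prints the sum of the posterior column, which should come out to 1.

```python
# Candidate cure rates r (assumed grid: 0.0, 0.1, ..., 1.0).
rates = [i / 10 for i in range(11)]

# Assumed prior: every candidate value of r is equally probable.
prior = [1 / len(rates)] * len(rates)

# Likelihood of 3 cured and 7 not cured, for each candidate r.
likelihood = [r**3 * (1 - r)**7 for r in rates]

# Joint = prior * likelihood; marginal = sum of the joint column.
joint = [p * l for p, l in zip(prior, likelihood)]
marginal = sum(joint)

# Posterior = joint / marginal.
posterior = [j / marginal for j in joint]

for r, post in zip(rates, posterior):
    print(f"r = {r:.1f}   posterior = {post:.4f}")
print("sum of posterior =", sum(posterior))
```

If the marginal is computed incorrectly, the posterior column will no longer add to 1, so the sum is a quick check on the whole table.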

Then we graphed the result, remembering that the vertical scale isn’t quite correct because the marginal we calculated wasn’t right.

The graph looks similar to a bell curve, but not exactly…the standard normal distribution goes all the way to infinity in both directions, whereas this one stops at 0 and 1. Also, the standard normal distribution is symmetrical…you can flip it left to right and it will be the same. The curve we graphed isn’t symmetrical. The curve is known as a beta distribution, and lots of things are known about it.

The peak of the curve is at about 0.3, which is the proportion of people in the trial that were cured. But the curve gives us much more, because if you look at the area under the curve, the proportion of the area between some point a on the left and another point b on the right is the probability that r is between a and b (the probability that a≤r≤b). So we get a lot more out of this than just the most likely value of r: we get information that tells us where r is most likely to be found.
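The area idea can be tried numerically. The sketch below is only an illustration: it assumes a flat prior (so the curve is proportional to the likelihood), approximates areas with many thin rectangles, and uses hypothetical function names and illustrative endpoints a = 0.2 and b = 0.5 that did not come from class.

```python
def unnormalized_posterior(r):
    """Likelihood r^3 (1-r)^7 times a flat prior (the flat prior is an assumption)."""
    return r**3 * (1 - r)**7


def area(a, b, n=10_000):
    """Approximate the area under the curve from a to b with n thin rectangles."""
    width = (b - a) / n
    heights = (unnormalized_posterior(a + (k + 0.5) * width) for k in range(n))
    return sum(heights) * width


def prob_between(a, b):
    """Probability that a <= r <= b: area from a to b divided by the total area."""
    return area(a, b) / area(0.0, 1.0)


# Illustrative endpoints only.
print("P(0.2 <= r <= 0.5) is about", prob_between(0.2, 0.5))
```

Dividing by the total area means the vertical scale of the curve does not matter for this calculation; only the shape does.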

I created a spreadsheet and made a screenshot of what the spreadsheet should have been. The marginal on our whiteboard is about 50% higher than the correct marginal. I got Excel to produce a corrected plot of the posterior. (No nice smooth line, though; you’ll have to imagine that!)

I also noted that the sum in the marginal can be rewritten by dividing and multiplying by the interval width, here 0.1, and then explicitly writing the interval width as $\Delta x$, at which point the sum looks like the sums that we use when defining integrals in the calculus. This makes the connection between these sums and integral calculus, which, as you will recall, defines the area under curves!
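To see the connection with integrals in action, here is a small Python sketch. It assumes a flat prior and places each r at the middle of its interval (both are assumptions made for the illustration), writes each term of the marginal as (prior probability / $\Delta x$) × likelihood × $\Delta x$, and then shrinks $\Delta x$. For comparison, the exact integral of $r^3(1-r)^7$ from 0 to 1 is $3!\,7!/11! = 1/1320$.

```python
def marginal_as_sum(n_bins):
    """Marginal written as a sum of (prior density) * likelihood * dx."""
    dx = 1.0 / n_bins                     # interval width (0.1 when n_bins = 10)
    total = 0.0
    for k in range(n_bins):
        r = (k + 0.5) * dx                # assumed: r at the middle of each interval
        prior_prob = 1.0 / n_bins         # assumed flat prior: equal probability per interval
        prior_density = prior_prob / dx   # divide by the interval width...
        total += prior_density * r**3 * (1 - r)**7 * dx   # ...and multiply it back in
    return total


exact = 1.0 / 1320.0                      # integral of r^3 (1-r)^7 from 0 to 1
for n_bins in (10, 100, 1000):
    print(n_bins, marginal_as_sum(n_bins), "exact:", exact)
```

As the intervals get narrower, the sum settles down to the value of the integral, which is the area under the likelihood curve.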

See you on Wednesday!