I wanted to expand on the polling example. The way we did it in class was frequentist. How would it be done in a Bayesian way? There are two quantities we don’t know: the number of heads thrown, and the proportion of people who did drugs. We could, for example, divide the proportion of people who did drugs into 10 or 100 discrete intervals, each represented by its center. In the drawing, I’ve done it with 100 intervals centered on 0.005, 0.015, …, 0.995. The proportion can be given a uniform prior. But we also don’t know the number of heads thrown, so that is also part of the SON. For a prior on that we can assume that the coin is fair, so the appropriate distribution is binomial. See the left side of the picture (sorry, a little of it got cut off).

The “choose” function is shown in the next picture. Notice that the only thing in the prior that actually varies with n is the “choose” function. The rest of the prior on n (for a fair coin, the factor (1/2)^100) is the same for every n and can be ignored, because we will be dividing by the marginal and any constants will cancel.
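To see the cancellation concretely, here is a minimal Python sketch (the variable names are mine, not from the class notes). It builds the full binomial prior for a fair coin and the "choose-only" version, and checks that they agree once each is normalized.

```python
from math import comb, isclose

N = 100  # number of coin flips, one per respondent

# Full binomial prior on n heads for a fair coin: C(100, n) * (1/2)**100.
full_prior = [comb(N, n) * 0.5 ** N for n in range(N + 1)]

# The same prior with the constant factor (1/2)**100 dropped.
choose_only = [comb(N, n) for n in range(N + 1)]

# After dividing each list by its own sum, the two are the same distribution.
total_full = sum(full_prior)
total_choose = sum(choose_only)
norm_full = [p / total_full for p in full_prior]
norm_choose = [c / total_choose for c in choose_only]

same = all(isclose(a, b) for a, b in zip(norm_full, norm_choose))
```

Here `same` comes out true, which is the sense in which the constant "will cancel" when we divide by the marginal.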

The likelihood will be zero if the number of heads is greater than 57, since we know that at least 43 tails were thrown (otherwise we could not have gotten 43 “no” responses). The “indicator” function I(n≤57) is 1 if the condition inside it is true for a particular value of n, and 0 otherwise. In particular, I(n≤57) is 1 for n=0, 1, …, 57, and 0 for n=58, 59, …, 100.

Unfortunately, the last part of the likelihood we wrote is wrong! I realized this when driving home…my apologies, I decided to discuss this at the last minute. The problem is that if there are n heads, then there are 100-n tails, and we only have a binomial distribution on those 100-n items, not on all 100. So, for example, if n=57 (57 heads), then there are only 100-n=43 tails and all of the tail responses must have been “no”. If n=56 then there are 44 tails, with one “yes” and 43 “no” responses, and so on. So the exponent on r should have been (57-n), not 57 on every line. Also, because the total number of tails changes from line to line, the “choose” function should have been included; it is not constant from line to line. It should be Choose(100-n, 43). So, I should have written the joint probability as follows:

$C^{100}_n I(n \le 57) C^{100-n}_{43} r^{57-n}(1-r)^{43}$
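As a quick check on this expression, here are the two cases worked out above, written in the same notation. For n=57 there are 43 tails, all answering “no”, so the second choose function is 1 and no power of r remains:

$C^{100}_{57} \, C^{43}_{43} \, r^{0}(1-r)^{43} = C^{100}_{57}\,(1-r)^{43}$

For n=56 there are 44 tails, one “yes” and 43 “no”, and there are 44 ways to pick which tail gave the “yes”:

$C^{100}_{56} \, C^{44}_{43} \, r^{1}(1-r)^{43} = 44\, C^{100}_{56}\, r\,(1-r)^{43}$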

Anyway, with this correction, the joint distribution is a two-dimensional table with one entry for each combination of r and n. Just fill in the (corrected) product of prior and likelihood. The marginal is then the sum of all the entries in the table, and the posterior is obtained by dividing each entry by that marginal. Since we are interested only in the distribution on r (we don’t care about n), we get it by adding up each column (one column per value of r) and putting the column sum at the bottom of the table.
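The table recipe above can be sketched in a few lines of Python. This is only an illustration under the setup in this post (100 respondents, 43 “no” responses, 100 grid points for r, uniform prior on r); the names `r_grid`, `joint`, `table`, and `post_r` are mine.

```python
from math import comb

# Grid of 100 interval centers for r: 0.005, 0.015, ..., 0.995.
r_grid = [(i + 0.5) / 100 for i in range(100)]

def joint(n, r):
    """Unnormalized prior times likelihood for n heads and proportion r."""
    if n > 57:  # indicator I(n <= 57): at least 43 tails are required
        return 0.0
    # C(100, n) from the prior on n; C(100 - n, 43) r^(57 - n) (1 - r)^43
    # from the binomial likelihood on the 100 - n tails.
    # The uniform prior on r is a constant and is dropped.
    return comb(100, n) * comb(100 - n, 43) * r ** (57 - n) * (1 - r) ** 43

# Two-dimensional table: one row per n (0..100), one column per r.
table = [[joint(n, r) for r in r_grid] for n in range(101)]

# Marginal: the sum of every entry in the table.
marginal = sum(sum(row) for row in table)

# Posterior: each entry divided by the marginal.
posterior = [[x / marginal for x in row] for row in table]

# Posterior on r alone: add up each column (sum over n).
post_r = [sum(posterior[n][j] for n in range(101)) for j in range(len(r_grid))]
```

The list `post_r` then sums to 1 and can be plotted against `r_grid`. As a sanity check, the two choose functions combine so that the column sums are proportional to (1+r)^57 (1-r)^43, which peaks near r = 0.14.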

I spent the rest of the class showing pictures of some Bayesian meetings, and I played a Bayesian song. All of the materials, and much more, can be found here. There are YouTube videos, the Bayesian Songbook (with some skits as well), pointers to pages with music, and so on. Enjoy!