## HCOL 195 11/11/09

Problem 1: Cheating

It seems as if we have substantial evidence of cheating.

The second problem was the taxi problem. The assumption is that the taxis are numbered consecutively from 1. We saw 7 taxis, and the largest number among them was 150. We know therefore that the likelihood of there being N taxis, if N<150, is zero. We do not (and according to Bayesian theory should not) build that into the prior, since the likelihood automatically takes care of it. For N≥150, each sighting has probability 1/N, so the likelihood of the seven independent sightings is 1/N⁷.

For a prior, we noted that we are more likely to be in a small city than a large one, because small cities are more numerous. We chose a prior on N of the form 1/N, but it might have been some other power of 1/N, for example, which would also decrease as N increases. The rest of the calculation is a routine application of our spreadsheet method, and is shown below:

Problem 2: Taxis

We noted that the posterior probability for N=151 is about 5% smaller than that for N=150, since with the 1/N prior the posterior is proportional to 1/N⁸. Roughly half of the posterior probability is for N≤165 or so, and about two-thirds is for N≤175. It’s a reasonable bet, from these data, that the number of taxis in the city is between 150 and 175, approximately.
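The spreadsheet calculation for the taxi problem can be sketched in a few lines of Python, which is a convenient way to check the cumulative probabilities quoted above. This is only a sketch under stated assumptions: the cutoff `N_MAX` for the table and the function names are choices made here for illustration, not something fixed in class.

```python
# Spreadsheet-style posterior for the taxi problem.
N_MIN = 150        # largest taxi number seen, so N cannot be smaller
TAXIS_SEEN = 7     # number of taxis observed
N_MAX = 2000       # ASSUMED cutoff for the table (the tail beyond it is negligible)


def taxi_posterior(n_min=N_MIN, k=TAXIS_SEEN, n_max=N_MAX):
    """Return {N: posterior probability} for N = n_min .. n_max."""
    ns = range(n_min, n_max + 1)
    # joint = prior * likelihood = (1/N) * (1/N^k); both are unnormalized
    joint = {n: (1.0 / n) * (1.0 / n**k) for n in ns}
    marginal = sum(joint.values())
    return {n: j / marginal for n, j in joint.items()}


posterior = taxi_posterior()


def prob_at_most(limit):
    """Posterior probability that the number of taxis is <= limit."""
    return sum(p for n, p in posterior.items() if n <= limit)


# Ratio of neighboring posteriors, and a few cumulative probabilities.
print("P(151)/P(150) =", round(posterior[151] / posterior[150], 4))
for limit in (160, 165, 175, 200):
    print("P(N <=", limit, ") =", round(prob_at_most(limit), 3))
```

Values of N below 150 never enter the table, which is the code's way of saying that their likelihood is zero.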

The third problem is the first part of the drug company decision problem. There are two unknown rates of cure for the two drugs, the old one (r) and the new one (s). We have to follow the practice of re-evaluating the cure rate for the old drug, even if we have lots of data on it, because we will be using a particular sample of patients and their profile may be different from the general population. This means that we’ll have to evaluate the likelihood on a 10×10 grid, with the different values of r corresponding to different rows, and the different values of s corresponding to the different columns. For simplicity we can take the prior to be the unnormalized prior with 1 in each grid location, which means that the joint probability will (except for the factor that we get from using an unnormalized prior) be equal to the likelihood, cell by cell. For the old drug the cure/no cure statistics were 25 and 25; for the new one, 30 and 20. This means that for cure rates r and s, the entry in the likelihood cell will be of the form r²⁵(1-r)²⁵ s³⁰(1-s)²⁰, as shown in the diagram for one particular cell.

Once we have the likelihoods (and the joints) calculated, we add them all up to get the marginal, and then divide each joint by the marginal to obtain the corresponding posterior, cell by cell. Adding up the posterior probabilities for those cells that satisfy s>r then gives us the probability that the new drug is better than the old, as shown in the board shot:

Problem 3: Drugs

The cells with s>r lie above the stair-stepped line. We could also have just added up the likelihoods above the stair-stepped line and divided the sum by the marginal. The answer would be the same, but the amount of work would have been less.
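The grid calculation for the drug problem can be sketched in Python as follows. The ten cure-rate values used for the rows and columns are an assumption made here (the midpoints 0.05, 0.15, ..., 0.95), since the grid values themselves are not given in these notes. The function names are likewise illustrative.

```python
# Grid calculation for P(new drug better than old drug).
# ASSUMPTION: cure rates sit at the midpoints 0.05, 0.15, ..., 0.95.
RATES = [0.05 + 0.1 * i for i in range(10)]


def likelihood(r, s):
    """Old drug: 25 cures, 25 failures. New drug: 30 cures, 20 failures."""
    return r**25 * (1 - r)**25 * s**30 * (1 - s)**20


def prob_new_better(rates=RATES):
    n = len(rates)
    # Flat unnormalized prior (1 in every cell), so joint = likelihood.
    # Rows are values of r, columns are values of s.
    joint = [[likelihood(r, s) for s in rates] for r in rates]
    marginal = sum(sum(row) for row in joint)

    # Long way: normalize every cell, then add the cells with s > r.
    # The rates are increasing, so s > r means column index > row index.
    posterior = [[cell / marginal for cell in row] for row in joint]
    long_way = sum(posterior[i][j] for i in range(n) for j in range(n) if j > i)

    # Shortcut: add the joints with s > r first, then divide once.
    shortcut = sum(joint[i][j] for i in range(n) for j in range(n) if j > i) / marginal
    return long_way, shortcut


long_way, shortcut = prob_new_better()
print("P(s > r), cell by cell:", round(long_way, 4))
print("P(s > r), shortcut:    ", round(shortcut, 4))
```

Both routes give the same number; the shortcut simply does one division instead of one hundred. Note that the diagonal cells, where s equals r, are not counted as "new drug better".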

I asked you to think about the decision problem that is the second half of this problem for Friday.