## Archive for the ‘HCOL 195’ Category

### More on Cancer Screening

December 20, 2009

The Times today had another article on cancer screening, mostly breast cancer but also some discussion of prostate cancer. It is pretty similar to what we discussed in class, except that they assumed that in the 40–50 year age group only 1 in 1000 women has cancer. This leads to a less than 1% chance that a woman in that age group has cancer, given a positive mammogram (as compared to somewhat under 10% in the larger group, including older women, that we discussed in class). At the same time, many women will have false positives that lead to unnecessary treatment, including chemotherapy, radiation, or even mastectomies.

### When Lowering the Odds of Cancer Isn’t Enough

December 15, 2009

Yet another interesting article, which discusses the risks and benefits of taking a drug as a prophylactic to prevent breast cancer in women at high risk. The comments are worth reading.

### Mammogram Math

December 13, 2009

The New York Times today ran an article on mammogram math. No surprises here for anyone in the class, but it is interesting to see how the things we discussed are now being explained in a major newspaper.

### Professor Risk

December 11, 2009

If you are still reading the blog, I became aware today of this clip, posted on YouTube (along with several related items), and this blog, which is on the general topic of uncertainty (probability) and risk…the same as the topic of the course. Note in particular this article from the blog, which discusses some of the things we talked about when we were talking about criminal trials.

The star of the video clip is Professor David Spiegelhalter, a well-known Bayesian statistician from Britain.

### Last week

December 5, 2009

Monday we will hear from Dr. Turner Osler, of the medical school. Please be sure to have read the handout I gave you last week.

Wednesday we will have a party.

Please be sure to turn in any outstanding work that you may owe me by Monday so that I can work out your tentative grades.

### New article on the breast cancer issue

November 23, 2009

This article details some of the reasoning that the recent panel used in its recommendations regarding breast cancer screening, which we’ve discussed in class. It is worthwhile reading.

### HCOL 195 11/20/09

November 20, 2009

I’ve put the link to the NPR discussion I mentioned in class here and below.

And here is an interesting article from the New York Times that describes the history of breast cancer ideas, going back to the 19th century. Some of the debate today is very old.

The distinguishing characteristic of Question 1 is that there are two independent cure rates; this means that the table of posterior probabilities must also be two-dimensional, that is, a square table. The posterior probability of each combination of cure rates r and s is the product of the posterior probability of r and that of s, and it is put into the appropriate square in the grid. Then the probability that the cure rates are equal is given by the sum along the main diagonal (red boxes in the picture), and the probability that B is better than A is the sum of all the numbers above the main diagonal, where s is greater than the corresponding r.
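The arithmetic of the square table can be sketched in a few lines of Python. The grid of cure-rate values and the two posterior columns below are made-up numbers for illustration, not the ones from the exam question:

```python
# Sketch of the two-dimensional table for Question 1 (hypothetical numbers).
# post_r and post_s are the separate posterior probabilities for the cure
# rates r and s over the same grid of values.

grid = [0.1, 0.3, 0.5, 0.7, 0.9]          # possible cure-rate values
post_r = [0.05, 0.15, 0.40, 0.30, 0.10]   # posterior for drug A's rate r
post_s = [0.02, 0.08, 0.30, 0.40, 0.20]   # posterior for drug B's rate s

# Joint posterior: the product, since the two cure rates are independent.
joint = [[pr * ps for ps in post_s] for pr in post_r]

# P(r = s): the sum along the main diagonal of the square table.
p_equal = sum(joint[i][i] for i in range(len(grid)))

# P(B better than A): the sum of all cells above the diagonal, where s > r.
p_b_better = sum(joint[i][j]
                 for i in range(len(grid))
                 for j in range(len(grid))
                 if grid[j] > grid[i])

print(p_equal, p_b_better)
```

Only the structure matters here; with the real posterior columns from the exam, the same diagonal and above-diagonal sums give the two probabilities asked for.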

In the general picture, we see that according to the probabilities, the expected number of survivors is 200 in each case, but in the risky case, it’s possible that everyone will be killed. The risk-averse general would choose the sure thing rather than risk everyone being killed. What that really means is that the value of the 600 soldiers is less than three times the value of the 200 soldiers. That’s illustrated in red in the diagram, where we put 550 rather than 600 to reflect this assessment of this general.

In this problem, the objective is to win, and so there are just two gains, one for winning and the other for losing. To make the arithmetic come out nicely, 30 is a good gain for winning and 0 for losing. There are only two possible decisions, to go for the “sure thing” of 200 soldiers, versus the “risky” choice of possibly 600 soldiers, but possibly none. For the sure thing choice, the only chance node is the one that tells us whether we win or lose. But for the risky choice, the first thing that happens is that we take the soldiers to the scene of the upcoming battle, so that’s the first chance node. Once we get to the scene of the battle (if we do), then the battle takes place, so there is a second chance node for each of the two possible outcomes from the trip. We see that the expected gain is greater if we use the “sure thing” branch.
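That tree can be sketched numerically. The gains (30 for winning, 0 for losing) are from the problem; the trip and battle probabilities below are assumptions chosen for illustration, so only the structure of the calculation, not the particular numbers, should be taken from this:

```python
# Hedged sketch of the general's decision tree. The 1/3 vs 2/3 trip
# probabilities and the win probabilities given troop strength are
# assumptions for illustration only.

GAIN_WIN, GAIN_LOSE = 30, 0

# Sure thing: 200 soldiers reach the battle for certain; one chance node.
p_win_200 = 0.5                      # assumed chance of winning with 200
eg_sure = p_win_200 * GAIN_WIN + (1 - p_win_200) * GAIN_LOSE

# Risky: first chance node is the trip (600 arrive, or none do),
# then a second chance node for the battle itself.
p_trip_ok = 1 / 3                    # assumed chance the trip succeeds
p_win_600 = 0.9                      # assumed chance of winning with 600
p_win_0 = 0.0                        # with no soldiers we cannot win
eg_risky = (p_trip_ok * (p_win_600 * GAIN_WIN + (1 - p_win_600) * GAIN_LOSE)
            + (1 - p_trip_ok) * (p_win_0 * GAIN_WIN + (1 - p_win_0) * GAIN_LOSE))

print(eg_sure, eg_risky)
```

With these assumed probabilities the "sure thing" branch has the larger expected gain, matching the conclusion above.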

Some background on this story can be found in the NPR story from All Things Considered, here, as I promised in class. This is a very interesting conversation; you can listen online, download an mp3 for your mp3 player, or read the transcript. I found the discussion very illuminating.

The New York Times today also had a very illuminating article about how the history of understanding cancer over the last 150+ years influences the discussion today.

The diagram is very simple; you have the loss of one extra life ($5 M according to our class discussion) versus the cost of ten mammograms per person ($3M) plus whatever the loss is for the false positives (200, or 10% of the number in the group in the statement of the problem; but I found out that the actual number is closer to 1000). Also for simplicity I have drawn the diagram for 1 extra life saved out of 2000 women tested; the error is 5%, smaller than the error of other numbers that go into the calculation, so this is justified. We see that (in the diagram again) we will be indifferent if the additional cost C of the false positives is such that $5M=$3M + 200C. That makes C=$10,000. With the more accurate number of 1000 false positives, C=$2,000. Any C larger than that corresponds to a decision not to test.
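The indifference calculation is one line of algebra; here is a tiny Python sketch using the dollar figures from the post:

```python
# Indifference point for the cost C of a false positive.
life_value = 5_000_000      # loss of one extra life, $5M
testing_cost = 3_000_000    # ten mammograms per person for the group, $3M

def indifference_C(n_false_positives):
    # Indifferent when $5M = $3M + n*C, i.e. C = ($5M - $3M) / n.
    return (life_value - testing_cost) / n_false_positives

print(indifference_C(200))    # with 200 false positives
print(indifference_C(1000))   # with the more accurate figure of 1000
```

Any C larger than the value returned corresponds to a decision not to test, as stated above.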

Comments:

The actual discussion in the media indicates that they were not doing a cost-benefit analysis like the one above, but rather trying to weigh the human cost of one extra life saved against the human costs of false positives: the anxiety, the pain of additional testing, the risk of additional testing (X-rays, for example, have a small association with cancers in and of themselves), the risk of unnecessary surgery that might leave a woman disfigured or cause her to lose a breast, and the risk of a woman dying as a result of unnecessary surgery. It’s not clear how they weighed these various costs.

Also, consider that money not spent for testing could be used for something else. Is the cost of testing in this age group the best way that we could spend the money? The amount of money available for health care is not unlimited, as Congress is finding out. It might be that more lives could be saved if the money were spent elsewhere.

Finally, as I mentioned in class, the 10% false positive rate is really for one mammogram. If you have ten mammograms in ten years, the probability is a lot closer to 50% that one of them will come up with a false positive. This explains the difference between the number I gave you in the question and the figure of 1000 false positives that the media is reporting. It also says that the probability of a false positive would be reduced if mammograms were only given every two years (five versus ten). This would reduce the negative outcomes from false positives. It would also cost less money, money that could be used in other, and possibly more productive ways. And (according to the discussions I’ve seen), the additional risk of a woman dying who could have been saved would be very small.
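Assuming, as a simplification, that each mammogram independently has a 10% false-positive rate, the cumulative chance of at least one false positive is easy to compute:

```python
# Chance of at least one false positive in n mammograms, assuming each has
# an independent 10% false-positive rate (an idealization; real screenings
# may be correlated).
def p_any_false_positive(n, rate=0.10):
    return 1 - (1 - rate) ** n

print(p_any_false_positive(10))  # ten annual mammograms over a decade
print(p_any_false_positive(5))   # mammograms every two years instead
```

Under this independence assumption the chance comes out around 65% for ten screenings and around 41% for five, which is why halving the screening frequency substantially reduces the number of false positives; the exact figures depend on how correlated the screenings really are.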

This one is pretty straightforward and is similar to many that we’ve done during the course. The probability of observing 7 items of the first kind, 2 of the second, and 1 of the third is proportional to p_1^{7}p_2^{2}p_3^{1}, where (p_1, p_2, p_3) are the probabilities of the three kinds under a given state of nature; that’s the likelihood. The prior is given in the problem, and the rest is routine: multiply likelihood and prior to get the joint, add the joints to get the marginal, divide each joint by the marginal to get the corresponding posterior. Here we have only two states of nature. I gave full credit if you simply explained the calculation in detail but did not compute the numbers.

This is similar to the problem we discussed on the study sheet, except that instead of H/T, there are three equally probable ways that the die can come up, so the probability of each of these ways is 1/3. The probability of hearing “yes”, given that p is the true proportion of people who engaged in the behavior being asked about, is (p+1)/3. Setting (p+1)/3=0.4 gives p=0.2.

### HCOL 195 11/16/09

November 17, 2009

On Monday the first thing I did was to flesh out a bit the calculation of the probability of saying “yes” and “no” in the polling example. It’s a straightforward application of probability theory:

We then talked about the problem of setting up an “expert system” that takes input (patients and doctors’ diagnoses; emails and the receiver’s opinion that each is spam), and after “learning” from a large number of examples can then make the diagnoses, or decide whether a new email is spam or not. We did this by considering the spam problem. Having the program look for a large number of words in each email, together with the recipient’s opinion that the message is or is not spam, allows us to estimate the conditional probability that a particular word (e.g., Viagra, hello) is in the message, given that the message is or is not spam. We do this by simply tallying up the number of occurrences and dividing appropriately. We can also estimate the prior probabilities of spam/not spam simply from the proportion of spam messages to the total.

However, the formulas I put on the board weren’t correct. I should have written that the conditional probability of a word, given that it is spam, is given by the number of times a word appears in a spam message, divided by the total number N of spam messages (I’m not sure what I said). That’s just

P(word|spam)=N(word,spam)/N(spam)=P(word,spam)/P(spam).

Here I’m just using the fact that

P(word,spam)=N(word,spam)/N(messages)

and

P(spam)=N(spam)/N(messages)

so that the number of messages, N(messages) cancels out.

We then used Bayes’ theorem to estimate the posterior probability of a message being spam, given the words it contains, approximating the likelihood of the words by the product of the approximate single-word probabilities that we computed in the data-gathering phase. This approximation pretends that the words occur independently, given that the message is or is not spam. Though it is an approximation, it turns out to be astonishingly good in practical applications. The result is a so-called naive Bayes classifier.
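A minimal sketch of such a classifier in Python, with a made-up four-message training set; the +1 smoothing in the counts is an addition of mine (it keeps a never-seen word from zeroing out the whole product) and was not part of the class discussion:

```python
# Naive Bayes spam sketch. Each training example is
# (set of words in the message, recipient says it is spam).
training = [
    ({"viagra", "deal", "click"}, True),
    ({"viagra", "free"}, True),
    ({"hello", "meeting", "notes"}, False),
    ({"hello", "lunch"}, False),
]

n_spam = sum(1 for _, is_spam in training if is_spam)
n_ham = len(training) - n_spam

def word_prob(word, spam):
    # P(word | spam) = N(word, spam) / N(spam), with +1 smoothing counts
    # so that unseen words do not force the product to zero.
    count = sum(1 for words, is_spam in training
                if is_spam == spam and word in words)
    total = n_spam if spam else n_ham
    return (count + 1) / (total + 2)

def p_spam(message_words):
    # Bayes' theorem with the naive independence approximation:
    # prior times the product of per-word probabilities, then normalize.
    ps, ph = n_spam / len(training), n_ham / len(training)
    for w in message_words:
        ps *= word_prob(w, True)
        ph *= word_prob(w, False)
    return ps / (ps + ph)

print(p_spam({"viagra", "click"}))   # spam-like words
print(p_spam({"hello", "meeting"}))  # ham-like words
```

Even with this tiny training set, messages full of spam-flagged words score well above 1/2 and ordinary messages well below it, which is the essence of how the real filters work.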

### I’ve commented on last Monday’s class

November 15, 2009

I’ve commented on last Monday’s class here.

### HCOL 195 11/13/09

November 14, 2009

The first problem we discussed is the second half of the drug company’s decision.

Basic decisions are to continue research toward marketing the drug, or to stop. Since the “sunk costs” of the research so far are the same regardless of what we do now, we can set the loss or gain of stopping at zero (remember, an arbitrary constant can always be added to the gain/loss, or it can be multiplied by an arbitrary positive scale). Also, since the goal is to make a gain, it’s probably best to frame this decision in terms of gain (utility) rather than loss. So, “do nothing” has a gain of zero.

If we were to decide to continue the research, from the data we already have there is a probability p that the drug is better than the old one, and (1-p) that it is not. That is a probability node (we could instead use a higher criterion, such as that the new drug is twice as good as the old one). To test the new drug will cost $30M (probably low, by the way). If we test it, the early results may not pan out: the cure rate may end up at some value q, which might even be less than the rate we saw in the early tests. That has to be folded into the costs of marketing and bringing the drug to market, and the possible rewards of pricing the drug so that the expected number of doses sold will (over the 20-year lifetime of the drug) handsomely reward our company and our stockholders. That is illustrated in the above chart in a very sketchy way.
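The shape of that calculation can be sketched as follows; only the $30M testing cost comes from the discussion, while the probability that the trial pans out and the market payoff are made-up numbers for illustration:

```python
# Sketch of the drug company's decision. All figures in $M; the probability
# and market gain below are illustration numbers, not real ones.
TEST_COST = 30          # cost of the clinical trial (from the discussion)
p_pan_out = 0.4         # assumed chance the trial confirms the drug is better
market_gain = 500       # assumed gain over the drug's 20-year lifetime

eg_stop = 0             # "do nothing": sunk costs ignored, gain set to zero
eg_continue = -TEST_COST + p_pan_out * market_gain

# Continue only if the expected gain beats doing nothing.
decision = "continue" if eg_continue > eg_stop else "stop"
print(eg_continue, decision)
```

The point of framing "stop" as zero gain is that only the comparison matters; shifting both branches by the sunk costs would not change the decision.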

Our next problem was to consider the claim of an astrologer that he has powers that allow him to predict the future with 85% accuracy. He makes 11 predictions, 4 of which are correct. What is the probability, given these data, that he can predict the future with 85% accuracy? That leads to our usual spreadsheet (where the division into 0.05, 0.15, …, 0.95 is for illustration and is adequate for the exam):

As usual, the likelihood is p^{4}(1-p)^{7} where p takes on the values that we put into our spreadsheet. We complete the spreadsheet in the usual way, and then to decide the probability that the astrologer has proven his case, given that he has the powers he claims, we add the probabilities for the states of nature ≥ 0.85 (that is, for this spreadsheet, 0.85 and 0.95).

But that doesn’t take into account our own experience. When we discussed this in class, people seemed to be skeptical about whether some people could actually make such predictions. To focus things, I asked whether people would think that I had the ability to predict things: if you were given a fair coin and tossed it yourself out of my view, and I predicted correctly whether it came up heads or tails, no one would be impressed. If I did it ten times in a row (about one chance in 1000), some would pay attention. And if I did it 100 times in a row (one chance in 2^100, about 10^30), most would think that something (maybe some sort of cheating) was going on. So that leads to a short spreadsheet like the one below, which puts a very small prior probability on the hypothesis that the astrologer really has the claimed powers.
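The spreadsheet itself is easy to reproduce in Python; this sketch uses the ten-value grid above with a flat prior:

```python
# Spreadsheet-style calculation for the astrologer: 4 correct out of 11.
grid = [0.05 + 0.1 * i for i in range(10)]   # 0.05, 0.15, ..., 0.95

def posterior(prior):
    # Multiply likelihood and prior, sum for the marginal, then normalize.
    likelihood = [p**4 * (1 - p)**7 for p in grid]
    joint = [pr * lk for pr, lk in zip(prior, likelihood)]
    marginal = sum(joint)
    return [j / marginal for j in joint]

# Flat prior over the ten states of nature:
flat = posterior([0.1] * 10)

# Probability his accuracy is at least 85% (the 0.85 and 0.95 rows):
p_powers_flat = sum(post for p, post in zip(grid, flat) if p >= 0.85)
print(p_powers_flat)
```

Even with the flat prior, the posterior probability on the states ≥ 0.85 comes out to a small fraction of a percent; a skeptical prior like the one just described shrinks it much further.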

We then considered the problem of polling people about sensitive issues that some might lie about, such as their personal drug use or their opinions on controversial subjects. Polls can be skewed when people lie.

So, for example, the poll is constructed so that the person being polled tosses a coin. If the coin comes up “heads”, then he is instructed always to answer one way (i.e., “yes, I used drugs last week.”) If it comes up “tails”, then he is instructed to tell the truth. The idea here is that the pollster has no idea whether the person being polled used drugs or not, so that protects the privacy of that person. But nonetheless, the pollster can back out the desired information about the group. Here’s what I wrote on the board:

The point is this: We need to know the probability that a person will say “yes,” given any probability p that the person has used drugs (or whatever the question is). That’s not hard to figure out.

P(“yes”|p)=P(“yes”,H|p)+P(“yes”,T|p)

=P(“yes”|H, p)P(H|p) + P(“yes”|T,p)P(T|p)

=1*(1/2)+p*(1/2)

=(1+p)/2

I didn’t do the calculation this carefully in class, but this is the result. And since “yes” and “no” are the only possibilities (“something happens”), their probabilities must add to 1, so the probability of “no”, given p, is 1-(1+p)/2=(1-p)/2.

So, if 57% of the respondents answered “yes,” then the naive calculation is that

(1+p)/2=0.57, or p=0.14, the proportion of respondents that used drugs (or whatever the question is).

But we can do better. The quantities we’ve computed above are the likelihood function, and we can put them into a spreadsheet:

The spreadsheet can have 10, 100, 1000, or however many rows we need. The more we take, the more accurate the calculation. Or, if we knew about calculus, we could do a fancy calculation that uses that skill. (This would not lead to better insight into this problem, only to greater accuracy.)
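Here is a sketch of that spreadsheet in Python, using a 101-row grid and representing the 57% figure as 57 “yes” answers out of 100 (the post does not give the sample size, so that is an assumption):

```python
# Spreadsheet for the randomized-response poll. The likelihood of y "yes"
# answers out of n, given p, uses P("yes"|p) = (1+p)/2 from the derivation.
y, n = 57, 100                          # assumed sample size for the 57%
grid = [i / 100 for i in range(101)]    # 101 rows; more rows, more accuracy

def p_yes(p):
    return (1 + p) / 2

likelihood = [p_yes(p)**y * (1 - p_yes(p))**(n - y) for p in grid]
prior = [1 / len(grid)] * len(grid)     # flat prior over the grid
joint = [pr * lk for pr, lk in zip(prior, likelihood)]
marginal = sum(joint)
posterior = [j / marginal for j in joint]

# Posterior mean of p, to compare with the naive estimate 2*0.57 - 1 = 0.14.
mean_p = sum(p * post for p, post in zip(grid, posterior))
print(mean_p)
```

The posterior mean lands near the naive point estimate of 0.14, but unlike that single number, the spreadsheet gives the full posterior distribution over p.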