## HCOL 195 10/5/09

The first thing I did today was to remark that the connection between the sums we do (for example, in the cure rate problem) and integration is that the numbers in the spreadsheet are actually the integrals of some function between the lower and upper bounds of each interval. So, if we put 0.05, 0.15, 0.25, … into the spreadsheet as the states of nature, then the value of the posterior at 0.05 is the integral from 0.0 to 0.1, the value at 0.15 is the integral from 0.1 to 0.2, etc.: $P(0.05)=f(0.05)\Delta x\approx\int_{0.0}^{0.1}{f(x)\,dx}$. Some people were taking the posterior and trying to form the integral by multiplying, e.g., P(0.05) by $\Delta x$, but you don’t need to do that; it already has that factor in it, as you can see. If you want to plot the curve for f(x), the area under which is supposed to be 1, you should divide P(x) by $\Delta x$: from the formula we see that $f(x)\approx P(x)/\Delta x$ when x=0.05, 0.15, 0.25, … Here are the two shots of the blackboard I put up when I discussed this:

The graph
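Here is a minimal sketch of the same point in Python rather than a spreadsheet. The data (3 cures in 10 patients) and the flat prior are made up for illustration and are not the numbers from the cure rate problem; the variable names are mine.

```python
# Midpoints of ten intervals of width 0.1 on [0, 1]: 0.05, 0.15, ..., 0.95
dx = 0.1
states = [dx * (i + 0.5) for i in range(10)]

# Hypothetical data (not from class): 3 cures in 10 patients, flat prior
cures, patients = 3, 10
prior = [1.0 / len(states)] * len(states)
likelihood = [x**cures * (1 - x) ** (patients - cures) for x in states]

# Spreadsheet steps: joint = prior * likelihood, then normalize
joint = [p * l for p, l in zip(prior, likelihood)]
total = sum(joint)
posterior = [j / total for j in joint]  # P(x): probability of each interval

# P(x) already contains the factor dx, so the P(x) values sum to 1 as they are.
# To plot the density curve, divide by dx: f(x) is approximately P(x) / dx.
density = [p / dx for p in posterior]

# The area under the density curve is sum of f(x) * dx, which is again 1.
area = sum(f * dx for f in density)

print("sum of P(x):", sum(posterior))
print("area under f(x):", area)
```

Both printed numbers should come out at 1 (up to rounding), which is the check that P(x) needs no extra factor of $\Delta x$.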

I mentioned that one student had written about the “Prosecutor’s Fallacy,” in which someone mistakes P(A|B) for P(B|A). This is commonly seen in criminal cases (hence the name), when a prosecutor will talk about the probability of a DNA match “at random” being very small, which is P(match|innocent), and then say that the probability of the accused being innocent in the light of the DNA match, which is P(innocent|match), is very small.

I also noted that some students had taken the tools we’ve been developing and used them to try to answer some questions that interested them, such as how many students from his or her home state are at UVM, given a sample of licence plates from the parking lot, or the probability that someone would be dressed in exactly the same way (color of clothes) that he or she chose to wear that day. I love it when you write journal articles of this sort.

We then returned to the oil well problem. I noted (as I did on the last blog) that it’s best to locate the final toll gate (the cost of drilling the well) to the right of the probability:

Where to locate the toll gate

When we do this, the rules for evaluating the decision tree are very simple. Start at the extreme right of each branch with the value of the gain or loss. Proceed to the left, item by item. When you encounter a toll gate, add that number to the current value to get the new current value. When you encounter a probability, multiply the current value by the probability to get the new current value. When you encounter a chance (round) node, add the current values of each branch together to get the current value of the node. When you encounter a decision (square) node, choose the current value running into that node that is greatest (if gains) or least (if losses) and assign it to that node, so as to get the outcome with the best expected result. Cut off the other branches going into the decision node. When the tree is completely evaluated, only one decision will survive, and that’s the one to be chosen.
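The rules above can be sketched as a short recursive function. This is only an illustration: the dictionary layout and the `evaluate` name are my own, and the probabilities and payoffs in the little tree are invented (chosen so that “just drill” comes out at about 2); they are not the actual oil well numbers. Costs are entered as negative toll gates, and everything is treated as a gain, so the decision node takes the maximum.

```python
def evaluate(node):
    """Roll a decision tree back from right to left and return its value."""
    kind = node["kind"]
    if kind == "leaf":
        # Extreme right of a branch: the gain or loss itself
        return node["value"]
    if kind == "toll":
        # Toll gate: add its amount (negative for a cost) to the current value
        return evaluate(node["child"]) + node["amount"]
    if kind == "chance":
        # Round node: multiply each branch by its probability, then add them up
        return sum(p * evaluate(child) for p, child in node["branches"])
    if kind == "decision":
        # Square node: keep the best branch (max for gains), cut off the rest
        values = {name: evaluate(child) for name, child in node["options"].items()}
        best = max(values, key=values.get)
        node["choice"] = best
        return values[best]
    raise ValueError(f"unknown node kind: {kind}")


def leaf(value):
    return {"kind": "leaf", "value": value}


def toll(amount, child):
    return {"kind": "toll", "amount": amount, "child": child}


# Invented numbers: 30% chance of oil worth 40, drilling costs 10.
# The drilling toll gate sits to the right of each probability.
just_drill = {
    "kind": "chance",
    "branches": [(0.3, toll(-10, leaf(40))), (0.7, toll(-10, leaf(0)))],
}
tree = {
    "kind": "decision",
    "options": {"do nothing": leaf(0), "just drill": just_drill},
}

value = evaluate(tree)
print(tree["choice"], value)
```

For a tree of losses you would swap `max` for `min` at the decision node.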

I then asked if there were other decisions that we could have entertained (even stupid ones) in the oil well problem, other than “do nothing,” “just drill,” and “test and drill if test is positive.” We thought of several others: “test and drill if negative,” “test and don’t drill,” and “test and drill regardless of the test results.” These all seemed pretty dumb, but we evaluated them anyway. “Test and drill regardless of the test results” obviously has a gain of (-2), since “just drill” had a gain of (+2) and we are now also paying for the test (-4) even though that money is wasted. Similarly, “test and don’t drill” costs us \$4M, so that node is obviously (-4). After some thought we figured out that if we set the toll gates to zero, then “test and drill if positive” (+8.2 without the toll gate) plus “test and drill if negative” (x) is equivalent to “test and drill regardless” (2 without the toll gate), since every test result is either positive or negative. That gives us the equation 8.2 + x = 2, or x=-6.2. Adding back the toll gate gives us -10.2 for this decision.
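Written out, with all values in millions of dollars, $T=-4$ for the test toll gate, and $x$ for the value of “test and drill if negative” with the toll gate removed (the symbols $T$ and $x$ are just labels for this summary):

$$
\begin{aligned}
8.2 + x &= 2 \quad\Rightarrow\quad x = -6.2\\
\text{test and drill if negative} &= x + T = -6.2 - 4 = -10.2\\
\text{test and drill if positive} &= 8.2 + T = 8.2 - 4 = 4.2\\
\text{test and drill regardless} &= 2 + T = 2 - 4 = -2\\
\text{test and don't drill} &= 0 + T = -4
\end{aligned}
$$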

(I just looked at the picture and realized that on the top branch I should have put the toll gate to the right of the figure 4.2, since the 4.2 already included the effect of the toll gate.):

Other possible decisions added to the tree

We then evaluated the actual branch from first principles, and got the same result:

Finally, we started looking at flu vaccinations. Should one get vaccinated or not? I had gotten from a newscast the previous evening that the flu kills about one person in 1000. The vaccine could be 80-90% effective, in terms of providing enough antibodies to prevent you from catching the disease. We tentatively put 90%. The remaining 10% might catch the disease, but would have some protection, so they would probably have a less severe case with a lower probability of dying. Not knowing anything, we guessed that 1 in 10,000 of those vaccinated who get sick might die. A student found that 4 in 10 unvaccinated people might get the flu. We applied this number also to those whose vaccination hadn’t given them sufficient protection (I’m not sure this is the right way to do it; we may revise it later). Here’s the tree as we left it at the end of class:

Partial tree for flu
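As a rough sketch of how the numbers so far multiply along the branches, here is the calculation in Python. It rests on assumptions that we have not settled: I am reading “1 in 1000” as the death rate among unvaccinated people who catch the flu, the 4-in-10 figure is applied to vaccinated people who lack protection (the step I said we may revise), and side effects are left out entirely. The variable names are mine.

```python
p_catch = 0.4            # chance an unprotected person catches the flu
p_die_unvacc = 1 / 1000  # ASSUMED: death rate among unvaccinated who catch it
p_protected = 0.9        # tentative vaccine effectiveness
p_die_vacc = 1 / 10_000  # class guess: death rate among vaccinated who get sick

# Multiply the probabilities along each branch that ends in death
risk_no_vaccine = p_catch * p_die_unvacc
risk_vaccine = (1 - p_protected) * p_catch * p_die_vacc

print("P(die | no vaccine):", risk_no_vaccine)
print("P(die | vaccine):   ", risk_vaccine)
```

Under these assumptions the risks are roughly 4 in 10,000 without the vaccine and 4 in 1,000,000 with it, before any side effects are taken into account.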

We still have to consider side effects. One is Guillain-Barré Syndrome (GBS), which affects about 1 in 1,000,000 vaccinated people; it may leave them with some degree of paralysis if not treated, and it causes death in about 1 in 20 cases. The other, more common, possibility is that a person allergic to eggs could suffer an allergic reaction. This is not a problem if treated immediately, but it could be fatal if untreated. No statistics are available on this, though most people allergic to eggs are probably aware of it and would avoid the shot.