Just a heads up: Because of Town Meeting Day on Tuesday, I expect that the graded tests will be available on Friday instead of Wednesday.

It would be useful for you to listen to this audio presentation, for Wednesday’s class.


Short one today, since we didn’t have class due to the slippery roads.

If anyone has a question about the quiz on Monday, please let me know by adding a comment to this post or by emailing me.

Bill

We first looked at the decision problem at the end of the handout. We decided that there are two actions that we could take: immediately hire someone to put the machine into the “good” state, or produce one part first. In the second case, if the part is “good” we continue to make more parts, hoping that the “good” part indicates that the machine is in the “good” state; if the part is bad, we hire the person to fix the machine.


I’ll mention that last year, the class thought of a third possibility: Never fix the machine, just produce parts. This one wasn’t as good as the best strategy, “produce one part and fix if it is bad.”

In response to a question, I mentioned that any decision problems on the quiz won’t be this involved. The reason is that the class is 50 minutes long, so I’ll design the test so that 10 minutes or less per question should be adequate. We spent over 30 minutes already on this problem by the time the question was asked!

I also pointed out that what I want to know is that you know how to answer each question. I won’t necessarily expect you to do a complete calculation, just indicate how the calculation goes. So, for example, if we had the problem of picking numbered balls out of a hat (and not replacing them), and had picked out #1,2,3, then a typical spreadsheet would look as below:

You should say something about how each of the numbers in the likelihood would be calculated, explain your choice of prior, and explain how the joint, the marginal, and the posterior distributions are calculated. That would be adequate. You don’t need to actually compute every number in the table if it would take too long. On the other hand, some calculations are so simple (for example, Monty Hall type problems, or problems with just two states of nature) that they can easily be calculated in a short time.

See you on Friday!

Here is a list of some Fermi problems for you to think about.

In class I mentioned that if you have a good idea how to do those Fermi problems, you should do fine on any question about them.

We looked at all but the last bullet point. Every bullet point has a homework or class discussion behind it (you can look earlier in the blog to find the discussions).

The questions on probability are summed up in the Independence and More Independence homework problems.

The Monty Hall problems were done in class, and the King and Brothers problem was homework. Imagine how I might alter these questions by, for example, changing the number of doors in Monty Hall, or the number of children in King and Brothers.

The cancer problem is exactly like the cancer problem we did in class as far as getting the posterior probabilities (1-2). Number 3, predicting the probability that someone with a positive test gets cancer, is like the last part of the cookies problem in Problem Set #3. That is, once you have the posterior probability that someone has the gene after the test is done, you then use the 0.2 and 0.0002 probabilities to predict the probability of getting cancer. We discussed this part in class. It is the “posterior predictive” aspect of the cookie problem.

The galaxy problem is just the Shakespeare vs. Marlowe problem in a different guise.

The plagiarism problem points out that if you have a 5 as the last calculated digit, you should round up or down randomly, or else you would build a bias into your table. That would not be good. But you can use this to prove plagiarism, because in the example there are about 100 numbers where you round randomly up or down, and the probability that someone would independently get the same rounding pattern by accident is 2^{-100}, or about 10^{-30}. This would be very convincing evidence in a copyright suit.

The first urn problem is basically the fish “catch and release” problem, done several times.

The second urn problem is basically the tank counting problem.

The beetle and ant problem is like the problem we did in estimating the cure rate of a disease.

I asked you to think about the decision problem over the next few days. We will discuss it on Wednesday.

General comments: I want to know on the quiz if you know how to solve these problems. The premium will be placed on your clear explanation of what needs to be done, e.g., explain the prior you used, explain how the likelihood is calculated, and what you do next. You do not necessarily have to compute numbers in most cases, especially if the calculation involves many lines of a spreadsheet. Just convincing me that you know exactly what needs to be done should be sufficient. Of course, if I ask you for a number, you do need to calculate it.

We finished the lottery decision tree. The analysis showed that the expected value of a ticket is $1.28, which after the cost of the ticket is $0.28. That seems like a positive return, until we recognize that if you want the money right away, you’ll only get about half, and furthermore, Uncle Sam and Peter Shumlin are going to want their share. That makes the expected take negative. Lotteries are not a good way to get rich, and not a good retirement plan.
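For anyone who wants to check the $1.28 figure on a computer, here is a minimal Python sketch. It assumes the numbers from the lottery example (a $280M jackpot, p=1/80M, N=200M tickets sold) and uses the approximation (Np)^{n}e^{-Np}/n! for the probability of n other winners. The variable and function names are my own choices for illustration, and taxes and the lump-sum discount are left out.

```python
import math

JACKPOT = 280_000_000          # dollars
TICKET_COST = 1.0              # dollars
P_WIN = 1 / 80_000_000         # probability that one ticket wins
TICKETS_SOLD = 200_000_000     # tickets held by other people
LAM = TICKETS_SOLD * P_WIN     # average number of other winners, Np = 2.5


def prob_other_winners(n):
    """Approximate probability of exactly n other winners: (Np)^n e^(-Np) / n!."""
    return LAM**n * math.exp(-LAM) / math.factorial(n)


def expected_ticket_value(max_others=100):
    """Expected gain from one ticket, before subtracting its cost.

    If we win and n others also win, our share is JACKPOT / (n + 1).
    The sum is cut off at max_others; the terms beyond that are negligible.
    """
    expected_share = sum(
        prob_other_winners(n) * JACKPOT / (n + 1) for n in range(max_others + 1)
    )
    return P_WIN * expected_share


value = expected_ticket_value()
net = value - TICKET_COST
print(f"expected value of a ticket: ${value:.3f}")
print(f"after the cost of the ticket: ${net:.3f}")
```

The result agrees with the figure from class to within about a cent.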

We had to calculate the probability of 2, 3, 4, … other winners. We know that the probability of an individual winning is 1/80M so the probability that no other people win is (1-p)^{N}=0.0821, where N=200M, the number of tickets. The probability that any particular individual wins and no others win is p(1-p)^{N-1}, but the N versus N-1 doesn’t make a significant difference, so it is 0.0821p. But there are N possibilities, so that makes the probability of one other winner 0.0821(Np)=0.205. That recapitulates the situation for n≤1, where n is the number of other winners.

Then we tackled the case n=2. One guess was that it should be Np^{n}(1-p)^{N-n}, but this is much too small. It’s the number for one other out of N plus one particular second person. We revised that to N^{n}p^{n}(1-p)^{N-n}, but this isn’t quite right either because it double-counts people. Another student suggested using the “choose” function, but not everyone was familiar with it. It arises when you expand a sum, like (a+b), to some power N. The calculation is shown on the chart below.

This turned out to be the right way to do it. The “choose” function is written choose(N,n) and it is calculated as

N(N-1)…(N-n+1)/n!

Choose(N,n) is the number of different ways that you can choose n objects (people here) out of a collection of N objects. So if p^{n}(1-p)^{N-n} is the probability of exactly n particular people winning, then choose(N,n)p^{n}(1-p)^{N-n} is the probability of any n winners.
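If you have not seen the “choose” function before, a small Python sketch may help. The function name `choose` below is just an illustration of the formula N(N-1)…(N-n+1)/n!; Python's standard library already provides the same thing as `math.comb`, which is used here only as a check.

```python
import math


def choose(N, n):
    """Number of ways to pick n objects out of N: N(N-1)...(N-n+1) / n!."""
    numerator = 1
    for k in range(n):
        numerator *= N - k          # N, then N-1, ..., down to N-n+1
    return numerator // math.factorial(n)


# A small case that can be checked by hand: 10 ways to pick 2 people out of 5.
print(choose(5, 2))

# With N = 200M tickets, choosing 2 gives about half of N^2,
# because each pair of people is counted only once.
N = 200_000_000
print(choose(N, 2), N**2 // 2)

# Agreement with the standard library.
print(choose(N, 3) == math.comb(N, 3))
```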

We can approximate this, since N minus a small integer is for all practical purposes equal to N, just as (1-p)^{N-n} is for all practical purposes equal to (1-p)^{N} for small n. That means choose(N,n) is approximately equal to N^{n}/n! for small n. So, the probability that we want is closely approximated by (Np)^{n}(1-p)^{N}/n!, as illustrated in the following whiteboard shot:

I noted that we can also approximate (1-p)^{N} by the same method that a calculator uses: take the log (natural logarithm), use the fact that log(1-p) is approximately -p if p is small, and then exponentiate the result. Then (1-p)^{N} is approximately e^{-Np}, as shown in the final whiteboard shot:

Putting the two approximations together, the probability of n other winners is approximately (Np)^{n}e^{-Np}/n!. This is the Poisson distribution with mean Np. The Poisson distribution is a very important one in statistics. It is used to model random events that happen at an average rate. In our example, the random event is another person winning a particular draw of the lottery, and n counts how many times that happens. But it can be used to model the number of telephone calls placed in a particular time interval (enabling us to size our cell phone tower, for example, to handle the expected traffic), or the number of patients that arrive per hour at a hospital, or the number of stars in an area of the sky, or the number of decays in a radioactive sample per second, and similar types of situations.
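To see how good the approximation is, here is a rough Python sketch that compares the exact formula choose(N,n)p^{n}(1-p)^{N-n} with the approximation (Np)^{n}e^{-Np}/n! for the lottery numbers (p=1/80M, N=200M). The function names are mine, and `math.log1p` is used only to keep (1-p)^{N-n} accurate for such a tiny p.

```python
import math

p = 1 / 80_000_000      # probability that one ticket wins
N = 200_000_000         # number of other tickets
lam = N * p             # average number of other winners, 2.5


def exact_prob(n):
    """choose(N, n) * p^n * (1-p)^(N-n), with the last factor computed via logs."""
    no_win = math.exp((N - n) * math.log1p(-p))
    return math.comb(N, n) * p**n * no_win


def poisson_prob(n):
    """The approximation (Np)^n * e^(-Np) / n!."""
    return lam**n * math.exp(-lam) / math.factorial(n)


for n in range(7):
    print(n, round(exact_prob(n), 6), round(poisson_prob(n), 6))
```

For these numbers the two columns agree to many decimal places, which is why the N versus N-n distinction can safely be ignored.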

Today we agreed to have the first quiz on Monday, February 28. I will give you a study list and we will go over it next week instead of a problem set. There will be no problem set this week. A Journal is due this Friday, but there will be no Journal due on Friday, Feb. 25.

We introduced the idea of a decision tree by expanding probability trees with a new kind of node, a decision node, indicated by a square instead of a circle. Unlike a probability node, where we have no control over which branch is realized, the choice of branch at a decision node is entirely up to us. Otherwise the decision tree looks very similar. We considered the problem of bringing an umbrella. If it rains and we have the umbrella, or if it doesn’t rain and we don’t have it, all is OK, and we can indicate this by putting a loss of 0 at the end of those branches. But if it rains and we don’t have the umbrella, we’ll get wet, which most people don’t like. We assigned a loss of 2 to that branch. If it doesn’t rain and we’ve been carrying the umbrella around, we might look like a dork, but that may not be as bad, and we assigned a loss of 1 to that branch.

Then we evaluated the tree. The rules are: Multiply the loss at any point by the probability of the branch it is on, and add the products to get the expected loss of the (probability) node that the branches are emanating from. Then, if there are additional probability nodes, do the same thing as you go back towards the root of the tree. If you have a decision node, on the other hand, you should choose the branch that has the least expected loss, and cut off the other branches. We decided that the probability of rain was 0.3 (get that from the web or TV). The evaluated tree is below; we chose not to bring our umbrella. Had the loss for getting wet been 3 instead of 2, we would have brought the umbrella.
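The evaluation rules can be sketched in a few lines of Python. This is only an illustration of the umbrella tree, using the losses (0, 1, 2) and the rain probability 0.3 from class; the names `expected_loss` and `best_action` are hypothetical.

```python
P_RAIN = 0.3

# Loss at the end of each branch: (action, weather) -> loss
LOSSES = {
    ("umbrella", "rain"): 0,
    ("umbrella", "dry"): 1,       # carried it around for nothing
    ("no umbrella", "rain"): 2,   # got wet
    ("no umbrella", "dry"): 0,
}


def expected_loss(action, losses=LOSSES, p_rain=P_RAIN):
    """Probability node: multiply each loss by its branch probability and add."""
    return p_rain * losses[(action, "rain")] + (1 - p_rain) * losses[(action, "dry")]


def best_action(losses=LOSSES, p_rain=P_RAIN):
    """Decision node: keep the branch with the least expected loss."""
    return min(("umbrella", "no umbrella"),
               key=lambda action: expected_loss(action, losses, p_rain))


print(expected_loss("umbrella"), expected_loss("no umbrella"))
print(best_action())

# Same tree, but getting wet now costs 3 instead of 2.
wetter = dict(LOSSES)
wetter[("no umbrella", "rain")] = 3
print(best_action(wetter))
```

With a loss of 2 for getting wet, the expected losses are 0.7 (umbrella) and 0.6 (no umbrella), so we leave it at home. With a loss of 3, the second number becomes 0.9 and the decision flips.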

Someone asked if we could do the same thing with gains instead of losses. Yes, you can. We call the gains “utilities”, and if we were using gains we would choose branches so as to maximize the expected utility.

You can’t mix loss and utilities. Pick one or the other, whichever is best suited to the problem.

Then we looked at the problem of the lottery. A particular example took place some years ago. The jackpot was $280M (‘M’ means “million”), a ticket cost $1, the probability of winning was 1/80M since there were 80M distinct tickets, and 200M tickets had been sold. Should we buy a ticket?

We started a tree. Again, a decision node sits at the root of the tree (buy a ticket or don’t buy it). If we don’t buy, the gain is 0 (it’s better to use gains than losses here). But we recognized that in the case of buying the ticket, there is a possibility of more than one winning ticket. In fact, we might expect 200M/80M winning tickets, or 2.5 on average. However, we might get lucky and have the only ticket, or our luck could be very bad and there might be 5 or 6 or even more winners. We’ll have to divide the jackpot with an unknown number of others if we win. So our tree has an additional probability node, for the possible other winners, labelled 0, 1, 2, …, and gains (utilities) of $280M, $280M/2, $280M/3, and so on.

So we need to compute the probability of no additional winners, one, two, and so forth. For no additional winners, with p=1/80M, N=200M, the probability of a particular ticket not winning is (1-p), and since the tickets are independent, the probability that they all lose is (1-p)^{N}. We calculated this number to be 0.0821 on a calculator.

Similarly, the probability that a particular ticket wins and all the others lose is p(1-p)^{N-1}. However, there are N tickets, and any of them could be a winner, so the probability of exactly one additional winner is Np(1-p)^{N-1}. For all practical purposes, (1-p)^{N}=(1-p)^{N-1}, because (1-p) is so close to 1 that one fewer factor (or even quite a number of fewer factors) won’t make a difference, at the level of accuracy that we are computing. So the probability of one additional winner is 2.5×0.0821=0.2052.
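Here is a small Python check of these two numbers, assuming p=1/80M and N=200M as above. The variable names are mine. I use `math.log1p` to compute (1-p)^{N} through logarithms, which is just a numerically careful way of doing what the calculator did.

```python
import math

p = 1 / 80_000_000     # probability that a particular ticket wins
N = 200_000_000        # number of other tickets sold

# Probability that all N other tickets lose: (1-p)^N
p_none = math.exp(N * math.log1p(-p))

# Probability of exactly one other winner: N p (1-p)^(N-1)
p_one = N * p * math.exp((N - 1) * math.log1p(-p))

# The shortcut used in class: replace (1-p)^(N-1) by (1-p)^N
p_one_shortcut = N * p * p_none

print(round(p_none, 4))
print(round(p_one, 4), round(p_one_shortcut, 4))
```

The exact and shortcut values for one other winner differ only far beyond the fourth decimal place.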

I asked you to think about the other probabilities we need: exactly 2 other winners, 3 other winners, and so forth, to discuss in class on Wednesday.

See you on Wednesday!

We continued the fish “catch and release” problem, allowing us to estimate the number of fish in a lake, when we caught 100, tagged them, let them swim around, caught another batch of 100 and noted that 10 of them were tagged. We noted that we must have seen 190 different fish (the 100 we tagged plus the 90 untagged fish in the second batch), so the experiment (not the prior) will, through the likelihood, guarantee that the posterior is 0 for N<190. Here’s the spreadsheet we constructed; we represented the nonzero terms in the likelihood with squiggles but didn’t actually calculate them (this is really where a real spreadsheet would be useful).

I also put 1 for each entry in the prior, and we learned that even though we are not making the prior add up to 1, it doesn’t matter here, since when we divide by the marginal distribution in calculating the posterior, the factor we multiply the prior by to get all 1’s cancels out.

We figured out that the likelihood would be calculated as in the following chart. Each tagged fish we recapture gives a factor equal to the number of tagged fish left in the lake (after the ones we have already recaptured were removed), divided by the total number of fish left in the lake (after removing all the fish already recaptured). Similarly, each untagged fish we recapture gives a factor equal to the number of untagged fish left in the lake, divided by the total number of fish left in the lake. Each fish captured changes the number of that kind of fish in the lake as well as the total number of fish. We first did this for N=190. We figured out how to do this for a general N. We saw that if N<190, there will be a 0 factor somewhere in the product, which yields a 0 likelihood. We finally figured out how to use the factorial function n! to express the likelihood somewhat more simply.

The graph of the posterior distribution looks something like this. It’s an asymmetric “bell-shaped curve”, with a maximum near 1000, 0 for N<190, and asymptotes to 0 on the high end (but never quite gets to 0 unless the prior is 0 for some maximum number of fish).
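Since we only drew squiggles for the likelihood in class, here is a minimal Python sketch of what a real spreadsheet would compute. It assumes 100 tagged fish and a second batch of 100 containing 10 tagged and 90 untagged fish, a prior of 1 for every N, and (my assumption, not something we fixed in class) a prior of 0 above 5000 fish. All names are hypothetical.

```python
TAGGED = 100              # fish tagged in the first catch
RECAUGHT_TAGGED = 10      # tagged fish in the second catch
RECAUGHT_UNTAGGED = 90    # untagged fish in the second catch
FISH_N_MAX = 5000         # assumption: prior is 0 above this many fish


def fish_likelihood(N):
    """Product of one factor per recaptured fish, removing each fish as it is caught."""
    if N < TAGGED:
        return 0.0
    tagged_left, untagged_left, total_left = TAGGED, N - TAGGED, N
    like = 1.0
    for _ in range(RECAUGHT_TAGGED):
        like *= tagged_left / total_left
        tagged_left -= 1
        total_left -= 1
    for _ in range(RECAUGHT_UNTAGGED):
        if untagged_left <= 0:      # the 0 factor that appears when N < 190
            return 0.0
        like *= untagged_left / total_left
        untagged_left -= 1
        total_left -= 1
    return like


def fish_posterior(prior_value=1.0):
    """Prior times likelihood, divided by the marginal (the sum of the joint)."""
    candidates = range(1, FISH_N_MAX + 1)
    joint = [prior_value * fish_likelihood(N) for N in candidates]
    marginal = sum(joint)
    return {N: j / marginal for N, j in zip(candidates, joint)}


post = fish_posterior()
mode = max(post, key=post.get)
print("posterior at N=189:", post[189])
print("most probable N:", mode)
```

Calling `fish_posterior(7.0)` instead of `fish_posterior(1.0)` gives the same posterior, which is the point about the prior not needing to add up to 1. The most probable N comes out at 999 or 1000; the two have equal likelihood, so which one is reported depends on rounding.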

We then turned to the problem of counting German tank production, as was done in World War II by the Allies, by looking at the serial numbers of tanks that were captured. The Germans numbered their tanks sequentially, which gave the Allies a way to estimate how many tanks they had produced. We did this with a simple example. We supposed that we had captured tanks #10, 5 and 11 and asked, how does this model predict the number of tanks produced? The spreadsheet looks somewhat similar to the fish calculation in that we see that if we have seen Tank #11, then there have to be at least 11 tanks and the likelihood will be 0 for N<11.

To calculate the likelihood, we see that if there are just 11 tanks, the probability of observing the first tank captured is 1/11, the second tank captured 1/10, and the third 1/9. The product of these is the likelihood if N=11. Similar considerations work for N=12, 13, and so on, except that we will start with a larger denominator, decreasing by 1 each time. We see that the likelihood decreases as N increases (for N=11, 12, 13, …). This means that the posterior is going to decrease as N gets larger and larger.

So the posterior distribution looks like this:

This is *not* like a bell-shaped curve! This is why taking averages and so on is not a good way to think about this problem.
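The tank spreadsheet is short enough to sketch in Python. This assumes the captured serial numbers 10, 5 and 11 from class, a prior of 1 for every N, and (as my own assumption) a prior of 0 above 200 tanks so that the table ends somewhere. The names are only for illustration.

```python
SERIALS = (10, 5, 11)     # serial numbers of the captured tanks
TANK_N_MAX = 200          # assumption: prior is 0 above this many tanks


def tank_likelihood(N, serials=SERIALS):
    """1/N for the first tank, 1/(N-1) for the second, 1/(N-2) for the third."""
    if N < max(serials):
        return 0.0            # we saw tank #11, so there are at least 11 tanks
    like = 1.0
    for k in range(len(serials)):
        like /= N - k
    return like


candidates = list(range(1, TANK_N_MAX + 1))
joint = [1.0 * tank_likelihood(N) for N in candidates]   # prior of 1 everywhere
marginal = sum(joint)
tank_posterior = [j / marginal for j in joint]

most_probable = candidates[tank_posterior.index(max(tank_posterior))]
print("most probable number of tanks:", most_probable)
for N in (10, 11, 12, 20, 50):
    print(N, round(tank_posterior[N - 1], 4))
```

The posterior is 0 below N=11, jumps to its maximum at N=11, and then only decreases, which is the shape described above.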

Have a nice weekend!

Please continue reading Flip, chapters 5-9, and Risks, chapters 5-7.

I got email from a student about journal topics. Here is the email, and my response.

I was just wondering if the weekly journal has to be something we discussed in class, or it can be some probability problems that I think are interesting. I just feel it is more interesting to write something that I am interested in.

It can be anything at all that has some connection to the class. A probability problem that you think is interesting is connected to the class since it involves probability. A decision problem that you think is interesting is connected to the class since it involves making decisions. Use your imagination!

Bill

Here are today’s whiteboard snaps (A through F).

More to come…