Today we talked about independence. In connection with the notes: obviously you shouldn’t be dividing by P(B) when P(B)=0. In practice this is less of a problem than it first appears. For example, suppose that B is very, very implausible. I mentioned Russell’s teapot: the statement B that there is a teapot orbiting the Sun out beyond the orbit of Neptune. This supposed teapot almost certainly does not exist, but you cannot say for sure that P(B)=0; maybe some aliens set a teapot going around the Sun 100 million years ago. We simply cannot say with certainty that the teapot isn’t there, only that the probability that it is there is very, very small. About the only time we can say for sure that the probability is zero is when the proposition is an absurdity, a logical contradiction, such as B=(A&(not-A)).

We had a discussion about medical tests that may not be as valuable as they first seem, such as PSA testing for prostate cancer. This was in connection with the charts on dependence not being equivalent to causality.
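The point is easier to see with numbers. Here is a minimal sketch in Python of the base-rate effect that can make a screening test less informative than it first seems; all the numbers (prevalence, sensitivity, false-positive rate) are illustrative assumptions, not real PSA statistics.

```python
# Sketch of why a positive test can be less informative than it seems.
# All numbers below are illustrative assumptions, not real PSA figures.
prevalence = 0.01        # P(disease)
sensitivity = 0.90       # P(positive | disease)
false_positive = 0.10    # P(positive | no disease)

# Total probability of a positive result
p_positive = sensitivity * prevalence + false_positive * (1 - prevalence)

# Bayes' theorem: P(disease | positive)
p_disease_given_positive = sensitivity * prevalence / p_positive
print(round(p_disease_given_positive, 3))  # under these assumptions, about 0.083
```

Even with a fairly accurate test, a low base rate means most positives are false positives.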

I noted that any method of proving independence or dependence is OK; personally, when faced with tables such as those on Charts 50 and 51, my preference is to compute the marginals and simply verify that the individual joint probabilities are, or are not, the products of the corresponding marginals.
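That check is mechanical enough to sketch in Python. The entries below are those of the first table worked through in the comments (0.18, 0.12, 0.42); the fourth entry, 0.28, is my assumption, chosen so the table sums to 1.

```python
# Check independence in a 2x2 joint table: every entry should equal the
# product of its row marginal and column marginal.
# First three entries come from the worked example; 0.28 is assumed so
# that the table sums to 1.
joint = [[0.18, 0.12],
         [0.42, 0.28]]

row_marginals = [sum(row) for row in joint]          # sum across
col_marginals = [sum(col) for col in zip(*joint)]    # sum down

independent = all(
    abs(joint[i][j] - row_marginals[i] * col_marginals[j]) < 1e-9
    for i in range(2) for j in range(2)
)
print(independent)  # True
```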

I made some remarks about frequentist estimators and showed a simple example in R.
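I don’t reproduce the R code here, but the flavor of that kind of demo can be sketched in Python (the true bias, sample size, number of repetitions, and seed are all arbitrary choices of mine): the sample mean as a frequentist estimator of a coin’s bias, evaluated by imagining the experiment repeated many times.

```python
# Sketch (in Python rather than R) of a frequentist-estimator demo:
# the sample mean as an estimator of a coin's bias, judged by its
# behavior over many repetitions of the experiment.
import random

random.seed(1)
true_p = 0.3
estimates = []
for _ in range(1000):                      # 1000 repeated experiments
    sample = [random.random() < true_p for _ in range(500)]  # 500 flips each
    estimates.append(sum(sample) / len(sample))

mean_estimate = sum(estimates) / len(estimates)
print(round(mean_estimate, 2))             # close to the true value 0.3
```

The frequentist evaluation is about the sampling behavior of the estimator, not about any single data set.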

Then we went to the next chart set. We talked about Bayes’ theorem and the Bayesian mantra: Posterior is proportional to prior times likelihood. I mentioned that the word “posterior” engenders lots of Bayesian humor, which comes out in places like the Cabaret that closes typical Bayesian meetings, or the skits and songs that have been written for those performances.

I noted that the likelihood is numerically proportional to the sampling distribution, but that whereas the sampling distribution is a probability that describes hypothetical data given some fixed hypothesis, and is thus a function of the hypothetical data, the likelihood is not a probability, and is a function of the hypotheses given some fixed observed data. The likelihood can be multiplied by a (non-zero) constant and still be a valid likelihood, as the constant will cancel out when we divide by the denominator in Bayes’ theorem. I noted that there are strategies that allow us in many cases to bypass the calculation of the denominator (which is known as the *marginal likelihood* or the *probability of the data*).
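The point about constants can be sketched in Python, using three hypothetical coin biases, a uniform prior, and made-up data (7 heads in 10 flips): scaling the likelihood by any nonzero constant leaves the posterior untouched, because the constant also appears in, and cancels with, the marginal likelihood in the denominator.

```python
# Sketch: a constant multiplying the likelihood cancels in Bayes' theorem.
# Hypotheses, prior, and data are all made-up illustrative choices.
from math import comb

hypotheses = [0.2, 0.5, 0.8]     # candidate coin biases
prior = [1/3, 1/3, 1/3]          # uniform prior
heads, n = 7, 10                 # observed data

def posterior(c):
    # Likelihood, up to the arbitrary nonzero constant c
    like = [c * comb(n, heads) * p**heads * (1 - p)**(n - heads)
            for p in hypotheses]
    unnorm = [pr * l for pr, l in zip(prior, like)]
    z = sum(unnorm)              # the marginal likelihood (denominator)
    return [u / z for u in unnorm]

print(posterior(1.0))
print(posterior(42.0))           # same posterior: the constant cancels
```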

September 8, 2012 at 10:33 pm |

I did the proof on slide 47 in the longest way possible. I was wondering if I could get a more concise one instead. The slide is asking to prove that

IF

P(A|B) != P(A)

THEN

P(B|A) != P(B)

Here is what I did:

P(A|B) = P(A&B) / P(B)

P(A&B) = P(A) * P(B|A)

P(A|B) = P(A) * P(B|A) / P(B)

Since

P(A|B) != P(A)

THEN

P(B|A) / P(B) != 1

Therefore, P(B|A) != P(B)

Q.E.D.

Does this look right to anyone?

Thanks!

Please note: != is the not-equal operator

September 8, 2012 at 11:24 pm |

I did the proof on slide 47 but I think it can be done better. The slide is asking to prove the following:

IF

P(A|B) != P(A)

THEN

P(B|A) != P(B)

By definition of conditional probability:

P(A & B) = P(B) . P(A|B)

P(A|B) = P(A & B) / P(B)

By Bayes’ theorem

P(A|B) = P(A) . P(B|A) / P(B)

SINCE P(A|B) != P(A)

THEN P(B|A) / P(B) != 1

THEREFORE, P(B|A) != P(B)

Q.E.D.

Does this look good?

September 8, 2012 at 11:36 pm |

Yes, that looks fine. As Cathy noted, one has to assume P(B)≠0, but I explained that in class and above. (The last step also divides by P(A), but P(A) cannot be zero here: if P(A)=0 then P(A&B)=0, so P(A|B)=0=P(A), contradicting the hypothesis P(A|B)≠P(B|A)’s premise that P(A|B)≠P(A).)
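Since the question asked for something more concise, here is a sketch of a shorter route by contraposition (assuming P(A), P(B) ≠ 0), written in LaTeX:

```latex
% Contrapositive sketch: show P(B|A) = P(B) implies P(A|B) = P(A).
\[
P(B \mid A) = P(B)
\;\Longrightarrow\;
P(A \mid B) = \frac{P(A \cap B)}{P(B)}
            = \frac{P(A)\,P(B \mid A)}{P(B)}
            = \frac{P(A)\,P(B)}{P(B)}
            = P(A).
\]
```

Taking the contrapositive gives the statement on the slide: P(A|B) != P(A) implies P(B|A) != P(B).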

September 9, 2012 at 12:20 am |

Another question:

I am unable to make sense of the tables on slides 51 and 52. I know how to show that two events are independent, but the tables don’t seem to have all the info. That can’t really be the case, though, because we’ve gone over them in class; it just went too quickly for me to follow.

Thanks,

September 9, 2012 at 1:35 am |

You have to do a little extra work. The tables don’t tell you directly how to decide, but if you compute the marginals (sum across, and sum down), and then multiply two marginals and check whether the product equals the corresponding entry in the table, you can decide whether the events are dependent or independent.

For example, the marginal for the first row of the first table on chart 50 is 0.18+0.12=0.30; the marginal for the first column is 0.18+0.42=0.60; the product of these two numbers is 0.3×0.6=0.18, which is the (1,1) entry in the table. The same thing happens for every entry in the main table.

The marginals are the same for the second table, so the product of the first-row and first-column marginals is still 0.18; but there the (1,1) entry isn’t 0.18, it is 0.20, so the events in that table are dependent.
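To verify that arithmetic, a quick sketch in Python. Only the 0.20 entry and the marginals (0.30/0.70 for the rows, 0.60/0.40 for the columns) are stated above; the other three entries are my assumption, chosen to be consistent with those marginals.

```python
# Check of the second (dependent) table. Only the 0.20 entry and the
# marginals are given in the discussion; 0.10, 0.40, 0.30 are assumed
# values consistent with row marginals 0.30/0.70 and columns 0.60/0.40.
joint = [[0.20, 0.10],
         [0.40, 0.30]]

row_marginals = [sum(row) for row in joint]
col_marginals = [sum(col) for col in zip(*joint)]

product_11 = row_marginals[0] * col_marginals[0]
print(round(product_11, 2))   # 0.18, but the (1,1) entry is 0.20 -> dependent
```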