STAT 330 October 18, 2012

We finished looking at fitting straight lines using centered variables. However, I showed you that although the centered variables are uncorrelated, they are not independent. We mentioned some generalizations of linear regression: the heteroskedastic case, where the variances of different observations are different; the case of error in x instead of y, which we solved by introducing latent variables and eliminating them explicitly; and the errors-in-variables case, where both x and y have errors, again solved by introducing latent variables, but this time sampling on the latent variables since we can’t use the trick of eliminating them explicitly. We ran some code to do this. I’ll be posting this later. I mentioned that you have to do something special to keep \sigma_x and \sigma_y from being unidentified (and thus having a posterior distribution that can’t be normalized). I gave the example of p(x,y)\propto exp(-(x-y)^2), for which the conditional distributions exist and are perfectly good (each is normal with variance 1/2), whereas there is no proper joint distribution, because the function cannot be normalized over the (x,y) plane.
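To make the normalization problem in p(x,y)\propto exp(-(x-y)^2) concrete, here is a minimal Python sketch (not the code we ran in class). It alternates draws from the two conditionals, x given y and y given x, each of which is a proper normal with variance 1/2. The function names, the number of chains, and the seed are my own illustrative choices. If a proper joint distribution existed, the spread of x across many independent chains would settle down; instead it should keep growing roughly in proportion to the number of sweeps, like a random walk.

```python
import math
import random


def gibbs_sweeps(n_sweeps, rng, x=0.0, y=0.0):
    """Run Gibbs sweeps for p(x, y) proportional to exp(-(x - y)^2).

    Returns the final (x, y) pair after n_sweeps full sweeps.
    """
    sd = math.sqrt(0.5)  # each conditional is normal with variance 1/2
    for _ in range(n_sweeps):
        x = rng.gauss(y, sd)  # x | y ~ N(y, 1/2)
        y = rng.gauss(x, sd)  # y | x ~ N(x, 1/2)
    return x, y


def spread_of_x(n_sweeps, n_chains=500, seed=0):
    """Sample variance of x across independent chains after n_sweeps sweeps."""
    rng = random.Random(seed)
    xs = [gibbs_sweeps(n_sweeps, rng)[0] for _ in range(n_chains)]
    mean = sum(xs) / n_chains
    return sum((v - mean) ** 2 for v in xs) / (n_chains - 1)


if __name__ == "__main__":
    for n in (100, 400, 1600):
        print(n, spread_of_x(n))
```

Every individual draw here is perfectly legitimate, yet the chain never converges to anything, which is the same symptom you would expect from unidentified parameters in a real model.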

We finished up by looking at a case where two variables are highly correlated and noting that sampling in the simplistic way we have been doing it doesn’t mix very well. For some reason there were several errors in the programs on the charts, and with help from you all we were able to fix the code and get it to run.
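The poor mixing can be illustrated with a small Python sketch. This is not the program from the charts; it assumes a standard bivariate normal target with correlation rho and a sampler that updates one coordinate at a time from its exact conditional (a Gibbs sampler), which may differ from the sampler we used in class. The function names and the seed are illustrative. For this target the x chain behaves like an autoregression with coefficient rho^2, so when rho is close to 1 successive samples are nearly identical and the chain creeps slowly along the ridge.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_samples, seed=0):
    """One-coordinate-at-a-time Gibbs sampler for a standard bivariate
    normal with correlation rho. Returns the chain of x values."""
    rng = random.Random(seed)
    sd = math.sqrt(1.0 - rho * rho)  # conditional standard deviation
    x, y = 0.0, 0.0
    xs = []
    for _ in range(n_samples):
        x = rng.gauss(rho * y, sd)  # x | y ~ N(rho * y, 1 - rho^2)
        y = rng.gauss(rho * x, sd)  # y | x ~ N(rho * x, 1 - rho^2)
        xs.append(x)
    return xs


def lag1_autocorrelation(values):
    """Sample lag-1 autocorrelation; values near 1 indicate slow mixing."""
    n = len(values)
    mean = sum(values) / n
    num = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in values)
    return num / den


if __name__ == "__main__":
    for rho in (0.0, 0.9, 0.99):
        chain = gibbs_bivariate_normal(rho, 20000)
        print(rho, lag1_autocorrelation(chain))
```

Comparing the lag-1 autocorrelation at rho = 0 with the value at rho = 0.99 gives a rough sense of how many more iterations the correlated case needs to produce the same amount of independent information.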
