Sunday, December 4, 2011

Statistics ---- Calculating the Bias of an Estimator?

Hi, I am in an intro to statistics course. We have been taught that the bias of an estimator is equal to its expected value minus mu (the Greek letter). I understand that the expected value is the mean, but mu is the mean as well. Doesn't that mean the bias will always be zero? Main question: what is the difference between mu and the expected value?|||Suppose you have some estimator for a value k. The bias is the expected value of the estimator minus the actual value. The prototypical example of a biased estimator is the sample standard deviation.

In other words, if E[estimator of k] = k, then that estimator is an unbiased estimator of k.
Suppose you have n iid random variables x1, ..., xn, each with mean mu and variance sigmasquared.

∑xn/n = xbar <--- estimator for mu

(Think of xbar as an x with a bar on top of it.)

E[xbar] = ∑E[xn]/n = (n*mu)/n = mu

E[xbar] - mu = 0, so the bias of the estimator xbar is 0.
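That xbar is unbiased can also be seen in a quick simulation. A minimal sketch using only Python's standard library; the die-roll population and all variable names are my own example, not from the original answer:

```python
import random

# Monte Carlo check that xbar is an unbiased estimator of mu.
# Population: faces of a fair die, so mu = 3.5.
random.seed(0)
population = [1, 2, 3, 4, 5, 6]
mu = sum(population) / len(population)  # 3.5

n = 5            # sample size
trials = 100_000
total = 0.0
for _ in range(trials):
    sample = [random.choice(population) for _ in range(n)]
    total += sum(sample) / n             # one realization of xbar

estimate_of_E_xbar = total / trials
print(mu, estimate_of_E_xbar)  # the two values should be close
```

Averaging many realizations of xbar approximates E[xbar], which lands on mu up to simulation noise.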





Now suppose you want to estimate the variance sigmasquared with the estimator ssquared. I will show that

ssquared = ∑(xn - xbar)^2/n

is a biased estimator of sigmasquared.





E[ssquared] = E[∑(xn - xbar)^2]/n

E[ssquared] = E[∑((xn - mu) - (xbar - mu))^2]/n

E[ssquared] = E[∑((xn - mu)^2 - 2(xn - mu)(xbar - mu) + (xbar - mu)^2)]/n

E[ssquared] = E[∑(xn - mu)^2 - 2(xbar - mu)∑(xn - mu) + n(xbar - mu)^2]/n

Since ∑(xn - mu) = n(xbar - mu), the middle term is -2n(xbar - mu)^2, so

E[ssquared] = E[∑(xn - mu)^2 - n(xbar - mu)^2]/n

E[ssquared] = E[∑(xn - mu)^2]/n - E[(xbar - mu)^2]

E[ssquared] = sigmasquared - sigmasquared/n, since E[(xbar - mu)^2] = Var(xbar) = sigmasquared/n

E[ssquared] = (n - 1)*sigmasquared/n
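The bias of ssquared can be checked numerically. A minimal Monte Carlo sketch using only Python's standard library; the die population and variable names are my own example:

```python
import random

# Monte Carlo check that ssquared = sum((x - xbar)^2)/n systematically
# underestimates sigmasquared, landing near (n-1)/n * sigmasquared.
# Population: faces of a fair die, so sigmasquared = 35/12.
random.seed(1)
population = [1, 2, 3, 4, 5, 6]
mu = sum(population) / len(population)
sigma_sq = sum((x - mu) ** 2 for x in population) / len(population)

n = 4
trials = 200_000
total = 0.0
for _ in range(trials):
    sample = [random.choice(population) for _ in range(n)]
    xbar = sum(sample) / n
    total += sum((x - xbar) ** 2 for x in sample) / n  # ssquared

print(total / trials)            # should sit near (n-1)/n * sigma_sq
print((n - 1) / n * sigma_sq)    # 0.75 * 35/12 = 2.1875
```

The average of ssquared over many samples settles noticeably below sigmasquared, which is exactly the bias.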








So this shows that E[ssquared] = (n - 1)*sigmasquared/n < sigmasquared, and therefore

ssquared is a biased estimator of sigmasquared.|||I am assuming that you are talking about the bias of an estimator for the mean. You are correct in thinking that mu is usually used to represent the expected value of the population or of a probability distribution. However, the expected value of the estimator may or may not be equal to mu. If it is, the bias is zero and the estimator is said to be unbiased. Otherwise, the estimator is said to be biased. For example, the sample mean (x-bar) is an unbiased estimator for mu, but the sample standard deviation (s) is a biased estimator for sigma. The bias of s is its expected value minus sigma.|||You're mixing up a general concept with a specific example.





The bias of an estimator is the difference between the expected value of the estimator and the true value of the parameter you're trying to estimate.

The population mean (mu) is only one possible parameter you might try to estimate, and the sample mean (sum of observations / number of observations) is only one way of estimating mu.

It's true that the sample mean is an unbiased estimator of the population mean, but what if we used a different estimator?

EXAMPLE

Suppose I decided to estimate the population mean by summing all but the largest observation and dividing by my sample size minus one. That would be a valid estimator.

Say the population is (1,2,3,4). The true mean is 2.5. Say I take samples of size three from the population. There are four possible samples, each of which occurs one-fourth of the time:

[(2,3,4),(1,3,4),(1,2,4),(1,2,3)]

If I apply my proposed estimator to these samples (try it yourself), I'd get:

[2.5, 2, 1.5, 1.5]

Each of those is an estimate of the mean generated by my estimator. If I take the expected value of these estimates and compare it to the true mean, I'll find the bias of my estimator (under these very specific circumstances). Doing so gives:

1.875

So it turns out my estimator is biased on the low side (the true value was 2.5).

Hopefully that helps. The quantity I take the expected value of is the set of all possible estimates my estimator could generate. The true mean is 2.5, while the expected value of my estimate is 1.875: a bias of -0.625.
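The worked example can be reproduced in a few lines of Python; the variable names are mine:

```python
from itertools import combinations

# Reproduce the example: population (1,2,3,4), every sample of size 3,
# estimator = (sum of all but the largest observation) / (sample size - 1).
population = (1, 2, 3, 4)
true_mean = sum(population) / len(population)  # 2.5

estimates = []
for sample in combinations(population, 3):
    trimmed = sorted(sample)[:-1]        # drop the largest observation
    estimates.append(sum(trimmed) / len(trimmed))

expected = sum(estimates) / len(estimates)
print(sorted(estimates))          # same multiset as [2.5, 2, 1.5, 1.5]
print(expected, expected - true_mean)  # 1.875 and a bias of -0.625
```

Because the four samples are equally likely, a plain average of the four estimates is exactly the expected value of the estimator.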
