Sunday, December 4, 2011

Estimator bias - definition and example?

bias(theta_hat) = E(theta_hat) - theta





I think theta_hat is a random variable based on some estimator and theta is the actual parameter value. But isn't the real theta unknown and we are trying to use an estimator to estimate it? If so then how can we find the bias using this definition at all?





In a binomial example where n = 8 and Y is the number of successes, find f_p_hat(x) and bias(p_hat). I don't know what f_p_hat(x) and the x mean.

Yes, you're correct: theta_hat is a random variable, and when you compute the expectation E(theta_hat) you use a probability distribution that depends on the unknown parameter theta. In general, bias(theta_hat) = E(theta_hat) - theta is therefore a function of theta. You don't need to know the true value of theta to find it; you work out the bias as a formula in theta. It can happen that this is a constant function (it doesn't vary with theta).





In your binomial example you didn't say which estimator for p you want to use, but I assume it is p_hat = Y/8.





f_p_hat(x) is the probability distribution (probability mass function) of the random variable p_hat, and x is just its argument: a possible value of p_hat. In this example p_hat can take the values 0/8, 1/8, 2/8, ..., 8/8. The probability that p_hat equals k/8 is the same as the probability that Y = k, that is, the probability of getting k successes in the 8 trials. This probability is





f_p_hat(k/8) = C(8,k)*p^k*(1-p)^(8-k) for k=0,1,...,8





and f_p_hat(x) = 0 for all other values of x.





(C(n,k) = n!/(k!*(n-k)!) is the binomial coefficient)
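To make f_p_hat concrete, here is a minimal Python sketch that tabulates it for one illustrative value of p. The function name p_hat_pmf and the choice p = 0.3 are assumptions for the example, not part of the original problem.

```python
from math import comb

def p_hat_pmf(p, n=8):
    """Return {k/n: P(p_hat = k/n)} for Y ~ Binomial(n, p) and p_hat = Y/n."""
    return {k / n: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

# p = 0.3 is an arbitrary illustrative value; the true p is unknown in practice.
pmf = p_hat_pmf(0.3)
for x, prob in pmf.items():
    print(f"f_p_hat({x:.3f}) = {prob:.5f}")

# A probability mass function must sum to 1 over its support.
print("total probability:", sum(pmf.values()))
```

The table has nine entries, one for each possible value k/8, and every other x has probability 0.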





Now that you have f_p_hat(x) you can compute E(p_hat)





E(p_hat) = sum f_p_hat(k/8)*k/8





The sum (and every sum below) runs over k = 0, 1, ..., 8.





So you get





E(p_hat) = sum C(8,k)*p^k*(1-p)^(8-k)*k/8
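As a sanity check, the sum can be evaluated exactly for a particular p and compared with p itself. The sketch below uses Python's Fraction type to avoid rounding error; the helper name expected_p_hat and the value p = 3/10 are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def expected_p_hat(p, n=8):
    """Evaluate E(p_hat) = sum over k of C(n,k) * p^k * (1-p)^(n-k) * k/n."""
    return sum(
        comb(n, k) * p**k * (1 - p)**(n - k) * Fraction(k, n)
        for k in range(n + 1)
    )

p = Fraction(3, 10)  # arbitrary illustrative value of the unknown parameter
e = expected_p_hat(p)
print("E(p_hat) =", e)
print("bias     =", e - p)
```

Trying a few different values of p this way suggests what the closed form of the sum should be, but it is not a proof for all p.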





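Now consider the function f(a,b) = sum C(8,k)*a^k*b^(8-k) = (a+b)^8 (the second equality is the binomial theorem), treat a and b as independent variables, and differentiate the sum term by term with respect to a to get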


df(a,b)/da = sum C(8,k)*k*a^(k-1)*b^(8-k)


and you see that





E(p_hat) = (p/8) * df(p,1-p)/da

because multiplying each term of the derivative by a/8 gives C(8,k)*a^k*b^(8-k)*k/8.

Here df(p,1-p)/da means that you differentiate first and then evaluate the derivative at the point (a,b) = (p, 1-p). To evaluate it we can use the other expression for f(a,b), namely (a+b)^8, and differentiate that to get





df(a,b)/da = 8*(a+b)^7





Substituting a = p, b = 1-p gives df(p,1-p)/da = 8*(p + 1 - p)^7 = 8
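The two expressions for df(a,b)/da can be compared with exact arithmetic. This is only a spot check at a couple of points, and the function names dfda_termwise and dfda_closed are made up for the sketch.

```python
from fractions import Fraction
from math import comb

def dfda_termwise(a, b, n=8):
    """Derivative of sum C(n,k) a^k b^(n-k) with respect to a, term by term."""
    # The k = 0 term is constant in a, so its derivative is 0 and it is skipped.
    return sum(comb(n, k) * k * a**(k - 1) * b**(n - k) for k in range(1, n + 1))

def dfda_closed(a, b, n=8):
    """Derivative of (a + b)^n with respect to a."""
    return n * (a + b)**(n - 1)

# Arbitrary point with a + b != 1, to show the identity is not special to (p, 1-p).
a, b = Fraction(2, 7), Fraction(5, 3)
print(dfda_termwise(a, b) == dfda_closed(a, b))

# At (a, b) = (p, 1 - p) the derivative no longer depends on p.
p = Fraction(3, 10)
print(dfda_termwise(p, 1 - p))
```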





So finally





E(p_hat) = p/8 * 8 = p





Thus bias(p_hat) = E(p_hat) - p = p - p = 0. The bias turned out not to depend on p at all, and moreover it is 0 for every p, which is exactly what it means for p_hat = Y/8 to be an unbiased estimator.
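A quick simulation gives an informal check of this result. The sketch below draws many samples of Y for a chosen p, averages p_hat = Y/8, and subtracts p. The function name simulate_bias, the value p = 0.3, the number of trials and the seed are all assumptions chosen for illustration.

```python
import random

def simulate_bias(p, n=8, trials=100_000, seed=0):
    """Estimate bias(p_hat) = E(p_hat) - p by averaging simulated values of Y/n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(trials):
        # Y counts the successes in n independent trials with success probability p.
        y = sum(rng.random() < p for _ in range(n))
        total += y / n
    return total / trials - p

print(simulate_bias(0.3))
```

The printed value should be close to 0 but not exactly 0, because a simulation only approximates the expectation. The exact answer comes from the calculation above.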
