In least squares estimation problems, sometimes one or more regressors specified in the model are not observable. One way to circumvent this issue is to estimate or generate regressors from observable data.[1] This generated regressor method is also applicable to unobserved instrumental variables. Under some regularity conditions, consistency and asymptotic normality of the least squares estimator are preserved, but the asymptotic variance has a different form in general.
Suppose the model of interest is the following:
y_i = g(x_{1i}, x_{2i}, \beta) + u_i

where g is a conditional mean function whose form is known up to the finite-dimensional parameter \beta. Here the regressor x_{2i} is not observable, but it has the known representation

x_{2i} = h(w_i, \gamma)

where w_i is observable, h is a known function, and \gamma is an unknown finite-dimensional parameter. If \gamma can be consistently estimated by some estimator \hat\gamma, then x_{2i} can be replaced by the generated regressor \hat{x}_{2i} = h(w_i, \hat\gamma), and \beta can be estimated by least squares in the model

y_i = g(x_{1i}, \hat{x}_{2i}, \beta) + u_i
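A small simulation can illustrate the two-step procedure. The sketch below assumes a linear h(w, \gamma) = \gamma w and an auxiliary observable v used to estimate \gamma in the first step; the variable names and data-generating process are illustrative assumptions, not from the original text:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Illustrative DGP: the unobserved regressor is x2 = h(w, gamma) = gamma * w,
# and gamma is estimable from an auxiliary observable v = x2 + noise.
gamma_true = 1.5
beta_true = np.array([2.0, -1.0, 0.5])   # intercept, coef on x1, coef on x2

w = rng.normal(size=n)
x1 = rng.normal(size=n)
x2 = gamma_true * w                      # unobserved regressor
v = x2 + rng.normal(scale=0.5, size=n)   # observable used in the first step
y = beta_true[0] + beta_true[1] * x1 + beta_true[2] * x2 + rng.normal(size=n)

# Step 1: estimate gamma by OLS of v on w (no intercept, for simplicity).
gamma_hat = (w @ v) / (w @ w)

# Generate the regressor: x2_hat = h(w, gamma_hat).
x2_hat = gamma_hat * w

# Step 2: least squares of y on (1, x1, x2_hat).
X = np.column_stack([np.ones(n), x1, x2_hat])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

print(gamma_hat, beta_hat)
```

With n = 5000 the second-step estimates are close to the true coefficients, consistent with the general result that the generated regressor preserves consistency (its asymptotic variance, however, generally differs from the observed-regressor case).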
This problem falls into the framework of two-step M-estimators, and thus consistency and asymptotic normality of the estimator can be verified using the general theory of two-step M-estimation.[4] As in general two-step M-estimation problems, the asymptotic variance of a generated-regressor estimator is usually different from that of the estimator with all regressors observed. Yet, in some special cases, the asymptotic variances of the two estimators are identical. To give one such example, consider the setting in which the regression function is linear in the parameters and the unobserved regressor is a scalar. Denoting the coefficient of the unobserved regressor by \delta, if \delta = 0 and E[\nabla_\gamma h(W, \gamma)\, U] = 0, then the first-step estimation of \gamma has no effect on the asymptotic variance, and the asymptotic variances of the two estimators coincide.
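One way to see why these two conditions suffice is the standard two-step M-estimator expansion; the following is a sketch for the linear specification above, with the symbols x(\gamma), s_i, A, F, and e introduced here for illustration (they are not from the original text):

```latex
% Stack the regressors as x_i(\gamma) = (x_{1i}',\, h(w_i,\gamma))' and write
% the second-step least squares score as
s_i(\theta;\gamma) = x_i(\gamma)\,\bigl(y_i - x_i(\gamma)'\theta\bigr),
\qquad \theta = (\beta',\delta)'.
% The usual two-step expansion, with A = E[x(\gamma)x(\gamma)'], gives
\sqrt{n}\,(\hat\theta-\theta)
  = A^{-1}\Bigl(n^{-1/2}\textstyle\sum_i s_i(\theta;\gamma)
      + F\,\sqrt{n}\,(\hat\gamma-\gamma)\Bigr) + o_p(1),
% where the correction matrix is
F = E\bigl[\nabla_{\gamma'} s_i(\theta;\gamma)\bigr]
  = E\bigl[e\,\nabla_\gamma h(W,\gamma)'\,U\bigr]
    - \delta\,E\bigl[x(\gamma)\,\nabla_\gamma h(W,\gamma)'\bigr],
% with e the unit vector selecting the h-component of x(\gamma).
```

If \delta = 0 and E[\nabla_\gamma h(W, \gamma)\, U] = 0, both terms of F vanish, so the first-step estimation error in \hat\gamma drops out of the expansion and the asymptotic variance is unaffected.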
With minor modifications to the model, the above formulation is also applicable to instrumental variable estimation. Suppose the model of interest is linear in the parameters, the error term is correlated with some of the regressors, and the model specifies instrumental variables that are not observable but have the representation

z_i = h(w_i, \gamma)

for a known function h, an observable w_i, and an unknown finite-dimensional parameter \gamma. If \gamma can be consistently estimated by \hat\gamma, the generated instruments \hat{z}_i = h(w_i, \hat\gamma) can be used in place of z_i. The resulting IV estimator then has the same asymptotic variance as the estimator based on the true instruments z_i, provided that E[\nabla_\gamma h(W, \gamma)\, U] = 0.[4]
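The generated-instrument case can be sketched in the same simulation style. Here the unobserved instrument is taken to be z = h(w, \gamma) = \gamma w, with \gamma estimated by a first-stage regression of the endogenous regressor on w; the DGP and names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Illustrative DGP: x is endogenous because eps and u share a common shock.
# The "true" instrument z = h(w, gamma) = gamma * w is unobserved; gamma is
# estimated in a first stage by regressing x on w.
gamma_true, beta_true = 1.0, 2.0
w = rng.normal(size=n)
common = rng.normal(size=n)                 # source of endogeneity
eps = common + rng.normal(scale=0.5, size=n)
u = common + rng.normal(scale=0.5, size=n)
x = gamma_true * w + eps
y = beta_true * x + u

# OLS is inconsistent because Cov(x, u) != 0.
beta_ols = (x @ y) / (x @ x)

# Step 1: estimate gamma, then generate the instrument z_hat = h(w, gamma_hat).
gamma_hat = (w @ x) / (w @ w)
z_hat = gamma_hat * w

# Step 2: IV estimator using the generated instrument.
beta_iv = (z_hat @ y) / (z_hat @ x)

print(beta_ols, beta_iv)
```

In this run the OLS estimate is biased upward while the generated-instrument IV estimate is close to the true coefficient. Note that with a linear h the scale \hat\gamma cancels from the IV formula, so using \hat{z}_i here is numerically equivalent to using w_i directly; a nonlinear h would not have this property.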