Unobserved heterogeneity in duration models

Heterogeneity in duration models can take different forms. On the one hand, unobserved heterogeneity can play a crucial role under different sampling schemes, such as stock or flow sampling.^[1] On the other hand, duration models have been extended to allow for different subpopulations, with a strong link to mixture models. Many of these models impose three assumptions: the heterogeneity is independent of the observed covariates, its distribution depends on a finite number of parameters only, and it enters the hazard function multiplicatively.^[2]

One can define the conditional hazard as the hazard function conditional on both the observed covariates and the unobserved heterogeneity.^[3] In the general case, the cumulative distribution function of the duration t_i* associated with the conditional hazard is F(t | x_i, v_i; θ). Under the first assumption above (independence of the heterogeneity and the observed covariates), the unobserved component can be integrated out to obtain the cumulative distribution conditional on the observed covariates only, i.e.

G(t | x_i; θ, ρ) = ∫ F(t | x_i, v; θ) h(v; ρ) dv ^[4]

where the additional parameter ρ parameterizes the density h of the unobserved component v. The estimation methods available for stock or flow sampling data can then be applied to estimate the parameters θ and ρ.
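The integration step can be checked numerically. The following is a minimal sketch, not taken from the cited sources: it assumes a Weibull-type conditional distribution F(t | x, v; θ) = 1 − exp(−v exp(x'β) t^α) and a gamma density h(v; ρ) with mean 1 and variance σ², for which the integral has a known closed form. All function names and parameter values are illustrative.

```python
import math


def gamma_pdf(v, shape, scale):
    """Gamma density h(v; rho), with rho = (shape, scale)."""
    if v <= 0.0:
        return 0.0
    log_pdf = ((shape - 1.0) * math.log(v) - v / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def conditional_cdf(t, xb, v, alpha):
    """F(t | x, v; theta) implied by the hazard v * exp(xb) * alpha * t**(alpha - 1)."""
    return 1.0 - math.exp(-v * math.exp(xb) * t ** alpha)


def mixed_cdf(t, xb, alpha, sigma2, upper=60.0, n=20000):
    """G(t | x; theta, rho): integrate F * h over v with the midpoint rule.

    Assumption: gamma heterogeneity with mean 1 and variance sigma2,
    truncated at `upper`, where the density is negligible.
    """
    shape, scale = 1.0 / sigma2, sigma2
    step = upper / n
    total = 0.0
    for k in range(n):
        v = (k + 0.5) * step
        total += conditional_cdf(t, xb, v, alpha) * gamma_pdf(v, shape, scale)
    return total * step


def mixed_cdf_closed_form(t, xb, alpha, sigma2):
    """Closed form of the same integral under the gamma assumption."""
    return 1.0 - (1.0 + sigma2 * math.exp(xb) * t ** alpha) ** (-1.0 / sigma2)
```

Calling `mixed_cdf(1.5, 0.2, 1.3, 0.5)` and `mixed_cdf_closed_form(1.5, 0.2, 1.3, 0.5)` should give values that agree to several decimal places. The gamma density is chosen here only because the closed form makes the check easy; other densities h(v; ρ) would require the numerical route.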

A specific example is described by Lancaster. Assume that the conditional hazard is given by

λ(t; x_i, v_i) = v_i exp(x_i'β) α t^(α-1)

where x_i is a vector of observed characteristics, v_i is the unobserved heterogeneity, and a normalization (often E[v_i] = 1) needs to be imposed. It then follows that the average hazard is exp(x_i'β) α t^(α-1). More generally, it can be shown that as long as the hazard function is proportional, i.e. of the form λ(t; x_i, v_i) = v_i κ(x_i) λ₀(t), one can identify both the covariate function κ(·) and the baseline hazard λ₀(·).^[6]
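The step from the conditional hazard to the average hazard in the Weibull-type example can be written out as follows. This is a short worked derivation that uses only the independence of v_i from x_i and the normalization E[v_i] = 1, both stated above:

```latex
\begin{aligned}
\mathrm{E}\left[\lambda(t; x_i, v_i) \mid x_i\right]
  &= \mathrm{E}\left[v_i \mid x_i\right] \exp(x_i'\beta)\,\alpha t^{\alpha-1} \\
  &= \mathrm{E}\left[v_i\right] \exp(x_i'\beta)\,\alpha t^{\alpha-1}
     && \text{(independence of } v_i \text{ and } x_i\text{)} \\
  &= \exp(x_i'\beta)\,\alpha t^{\alpha-1}
     && \text{(normalization } \mathrm{E}[v_i] = 1\text{)}.
\end{aligned}
```

The expectation here is taken over the full distribution of v_i, not over the individuals still at risk at time t.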

More recent work provides nonparametric approaches to estimating the baseline hazard and the distribution of the unobserved heterogeneity under fairly weak assumptions.^[7] In grouped data, the strict exogeneity assumptions for time-varying covariates are hard to relax. Parametric forms can be imposed on the distribution of the unobserved heterogeneity,^[8] although semiparametric methods that do not specify such a parametric form are also available.^[9]
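One way to avoid a parametric form for the heterogeneity is to approximate its distribution by a small number of mass points, in the spirit of the semiparametric approach cited above. The following is a minimal sketch of the resulting log-likelihood, assuming the Weibull-type hazard from the example, complete (uncensored) flow-sampled durations, and hypothetical mass points chosen by the user. The function names are illustrative, and this is not the estimator of any specific paper.

```python
import math


def weibull_log_density(t, xb, v, alpha):
    """log f(t | x, v) for the hazard v * exp(xb) * alpha * t**(alpha - 1)."""
    scale = v * math.exp(xb)
    log_hazard = math.log(scale) + math.log(alpha) + (alpha - 1.0) * math.log(t)
    cumulative_hazard = scale * t ** alpha
    # Density = hazard * survivor function, so the logs add up.
    return log_hazard - cumulative_hazard


def mixture_log_likelihood(durations, xbs, alpha, points, probs):
    """Sum over i of log( sum_k p_k * f(t_i | x_i, v_k) ).

    `points` are the mass points v_k and `probs` their probabilities p_k;
    both are hypothetical inputs that an optimizer would normally choose.
    """
    if abs(sum(probs) - 1.0) > 1e-12:
        raise ValueError("probabilities must sum to one")
    total = 0.0
    for t, xb in zip(durations, xbs):
        density = sum(p * math.exp(weibull_log_density(t, xb, v, alpha))
                      for v, p in zip(points, probs))
        total += math.log(density)
    return total


# Illustrative call: two mass points with mean 1 (0.5 * 0.5 + 0.5 * 1.5).
example_value = mixture_log_likelihood(
    durations=[0.5, 1.2, 2.0],
    xbs=[0.1, -0.3, 0.0],
    alpha=1.3,
    points=[0.5, 1.5],
    probs=[0.5, 0.5],
)
```

In practice the mass points, their probabilities, α and β would be chosen jointly to maximize this function, and censored spells would contribute the survivor function instead of the density. Both steps are omitted here to keep the sketch short.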

Notes and References

[1] Salant, S. W. (1977): Search Theory and Duration Data: A Theory of Sorts. The Quarterly Journal of Economics, 91(1), pp. 39-57.
[2] Wooldridge, J. (2002): Econometric Analysis of Cross Section and Panel Data. MIT Press, Cambridge, Mass.
[3] Lancaster, T. (1990): The Econometric Analysis of Transition Data. Cambridge University Press, Cambridge.
[4] Wooldridge, J. (2002): Econometric Analysis of Cross Section and Panel Data. MIT Press, Cambridge, Mass.
[6] Lancaster, T. (1990): The Econometric Analysis of Transition Data. Cambridge University Press, Cambridge.
[7] Horowitz, J. L. (1999): Semiparametric and Nonparametric Estimation of Quantal Response Models. Handbook of Statistics, Vol. 11, ed. by G. S. Maddala, C. R. Rao, and H. D. Vinod. North Holland, Amsterdam.
[8] McCall, B. P. (1994): Testing the Proportional Hazards Assumption in the Presence of Unmeasured Heterogeneity. Journal of Applied Econometrics, 9, pp. 321-334.
[9] Heckman, J. J. and B. Singer (1984): A Method for Minimizing the Impact of Distributional Assumptions in Econometric Models for Duration Data. Econometrica, 52, pp. 271-320.