Issues of heterogeneity in duration models can take on different forms. On the one hand, unobserved heterogeneity can play a crucial role when it comes to different sampling methods, such as stock or flow sampling.[1] On the other hand, duration models have also been extended to allow for different subpopulations, with a strong link to mixture models. Many of these models impose three assumptions: the heterogeneity is independent of the observed covariates, its distribution depends on only a finite number of parameters, and it enters the hazard function multiplicatively.[2]
One can define the conditional hazard as the hazard function conditional on the observed covariates and the unobserved heterogeneity.[3] In the general case, the cumulative distribution function of $t_i^*$ associated with the conditional hazard is given by $F(t \mid x_i, v_i; \theta)$. Under the first assumption above, the unobserved component can be integrated out, which yields the cumulative distribution function conditional on the observed covariates only, i.e.
$$G(t \mid x_i; \theta, \rho) = \int F(t \mid x_i, \nu; \theta)\, h(\nu; \rho)\, d\nu$$ [4]
where the additional parameter ρ parameterizes the density h of the unobserved component ν. The relevant parameters can then be estimated with the methods available for stock or flow sampling data.
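The integration step can be illustrated numerically. The following is a minimal sketch, not a definitive implementation. It assumes, purely for illustration, a Weibull conditional hazard and gamma-distributed heterogeneity with shape and rate both equal to ρ (so that the mean is 1). The function names, the midpoint integration rule, and the parameter values are hypothetical choices made for this example. Under these assumptions the integral has the closed form 1 − (1 + exp(x'β) t^α / ρ)^(−ρ), which serves as a check on the numerical result.

```python
import math


def conditional_cdf(t, xb, nu, alpha):
    """F(t | x, nu; theta) for the hazard nu * exp(x'b) * alpha * t**(alpha - 1)."""
    return 1.0 - math.exp(-nu * math.exp(xb) * t ** alpha)


def gamma_density(nu, rho):
    """h(nu; rho): gamma density with shape rho and rate rho, so E[nu] = 1."""
    log_h = (rho * math.log(rho) + (rho - 1.0) * math.log(nu)
             - rho * nu - math.lgamma(rho))
    return math.exp(log_h)


def mixture_cdf_numeric(t, xb, alpha, rho, upper=60.0, steps=60000):
    """Approximate G(t | x) = integral of F * h over nu with a midpoint rule.

    The integral is truncated at `upper`; this is an assumption that is
    harmless only when the gamma tail beyond `upper` is negligible.
    """
    width = upper / steps
    total = 0.0
    for k in range(steps):
        nu = (k + 0.5) * width  # midpoint of the k-th subinterval
        total += conditional_cdf(t, xb, nu, alpha) * gamma_density(nu, rho)
    return total * width


def mixture_cdf_closed_form(t, xb, alpha, rho):
    """Closed form of the same integral under the gamma assumption."""
    return 1.0 - (1.0 + math.exp(xb) * t ** alpha / rho) ** (-rho)


if __name__ == "__main__":
    # Illustrative values only: t = 1.5, x'b = 0.3, alpha = 1.2, rho = 2.
    print(mixture_cdf_numeric(1.5, 0.3, 1.2, 2.0))
    print(mixture_cdf_closed_form(1.5, 0.3, 1.2, 2.0))
```

The two printed values should agree closely. For other heterogeneity distributions a closed form may not exist, in which case only the numerical route remains available.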
A specific example is described by Lancaster. Assume that the conditional hazard is given by
$$\lambda(t; x_i, v_i) = v_i \exp(x_i'\beta)\, \alpha t^{\alpha - 1}$$ [5]
where $x_i$ is a vector of observed characteristics, $v_i$ is the unobserved heterogeneity component, and a normalization (often $E[v_i] = 1$) needs to be imposed. It then follows that the average hazard is given by $\exp(x_i'\beta)\, \alpha t^{\alpha - 1}$. More generally, it can be shown that as long as the hazard function has the proportional form $\lambda(t; x_i, v_i) = v_i\, \kappa(x_i)\, \lambda_0(t)$, one can identify both the covariate function $\kappa(\cdot)$ and the baseline hazard function $\lambda_0(\cdot)$.[6]
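The step from the conditional hazard to the average hazard can be written out as follows. This is a sketch that relies only on the independence of the heterogeneity from the covariates and on the normalization of its mean to one. The expectation is taken over the unconditional distribution of the heterogeneity term.

```latex
\begin{aligned}
E\left[\lambda(t; x_i, v_i) \mid x_i\right]
  &= E\left[v_i \mid x_i\right] \exp(x_i'\beta)\, \alpha t^{\alpha - 1} \\
  &= E\left[v_i\right] \exp(x_i'\beta)\, \alpha t^{\alpha - 1}
     && \text{(independence of } v_i \text{ and } x_i\text{)} \\
  &= \exp(x_i'\beta)\, \alpha t^{\alpha - 1}
     && \text{(normalization } E[v_i] = 1\text{)}
\end{aligned}
```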
Recent examples provide nonparametric approaches to estimating the baseline hazard and the distribution of the unobserved heterogeneity under fairly weak assumptions.[7] In grouped data, the strict exogeneity assumptions for time-varying covariates are hard to relax. Parametric forms can be imposed for the distribution of the unobserved heterogeneity,[8] although semiparametric methods that do not specify such parametric forms are also available.[9]