The de-sparsified lasso is a method for constructing confidence intervals and statistical tests for single or low-dimensional components of a large parameter vector in high-dimensional models.[1]
Consider the high-dimensional linear model

$Y = X\beta^0 + \epsilon$

with $n \times p$ design matrix $X =: [X_1, \ldots, X_p]$ (the $X_j$ being the $n \times 1$ columns of $X$), noise $\epsilon \sim \mathcal{N}_n(0, \sigma_\epsilon^2 I)$ independent of $X$, and unknown $p \times 1$ regression vector $\beta^0$.
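As a minimal illustration of this setup, the model can be simulated numerically; the dimensions, sparsity pattern, and noise level below are arbitrary choices for the sketch, not values from the source.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 200             # high-dimensional regime: p > n (illustrative values)
sigma_eps = 1.0             # noise standard deviation (assumed)

X = rng.standard_normal((n, p))           # n x p design matrix
beta0 = np.zeros(p)
beta0[:3] = [2.0, -1.5, 1.0]              # sparse true coefficient vector beta^0
eps = sigma_eps * rng.standard_normal(n)  # epsilon ~ N_n(0, sigma_eps^2 I)
Y = X @ beta0 + eps                       # Y = X beta^0 + epsilon
```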
The usual method to estimate the parameter is the Lasso:
$\hat{\beta}^n(\lambda) = \underset{\beta \in \mathbb{R}^p}{\operatorname{argmin}} \left( \frac{1}{2n} \left\| Y - X\beta \right\|_2^2 + \lambda \left\| \beta \right\|_1 \right)$
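This is exactly the objective minimized by scikit-learn's Lasso, whose alpha parameter plays the role of $\lambda$. A minimal sketch, continuing the simulated X and Y from above:

```python
from sklearn.linear_model import Lasso

# scikit-learn's Lasso minimizes (1/(2n)) * ||Y - X b||_2^2 + alpha * ||b||_1,
# i.e. the objective above with alpha = lambda.
lam = 0.1  # illustrative value; in practice lambda is tuned, e.g. by cross-validation
lasso = Lasso(alpha=lam, fit_intercept=False).fit(X, Y)
beta_hat = lasso.coef_  # hat{beta}^n(lambda)
```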
The de-sparsified lasso is obtained by modifying the Lasso estimator, which fulfills the Karush–Kuhn–Tucker conditions,[2] as follows:
$\hat{\beta}^n(\lambda, M) = \hat{\beta}^n(\lambda) + \frac{1}{n} M X^T \left( Y - X \hat{\beta}^n(\lambda) \right)$

where $M \in \mathbb{R}^{p \times p}$ is an arbitrary matrix. The matrix $M$ is generated using a surrogate inverse covariance matrix.
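Given $M$, the correction term is a single matrix computation on the Lasso residuals. A minimal sketch; the construction of M itself (e.g. by the nodewise lasso of the next section) is assumed to be done elsewhere:

```python
import numpy as np

def desparsified_lasso(X, Y, beta_hat, M):
    """De-sparsified lasso: beta_hat + (1/n) * M @ X.T @ (Y - X @ beta_hat)."""
    n = X.shape[0]
    residual = Y - X @ beta_hat   # Lasso residuals
    return beta_hat + (M @ X.T @ residual) / n
```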
Desparsifying $\ell_1$-norm penalized estimators
Consider the following $1 \times p$ vectors of covariables $x_i \in \mathcal{X} \subset \mathbb{R}^p$ and univariate responses $y_i \in \mathcal{Y} \subset \mathbb{R}$, for $i = 1, \ldots, n$. We have a loss function

$\rho_\beta(y, x) = \rho(y, x\beta) \quad (\beta \in \mathbb{R}^p)$

which is assumed to be convex in $\beta \in \mathbb{R}^p$.
The $\ell_1$-norm penalized estimator is

$\hat{\beta} = \underset{\beta}{\operatorname{argmin}} \left( P_n \rho_\beta + \lambda \left\| \beta \right\|_1 \right)$

where $P_n \rho_\beta = \frac{1}{n} \sum_{i=1}^n \rho_\beta(y_i, x_i)$ denotes the empirical average of the loss. For example, with squared-error loss $\rho(y, x\beta) = (y - x\beta)^2 / 2$, this reduces to the Lasso estimator above.
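The source does not prescribe an algorithm for this generic estimator; for a smooth convex loss, one standard choice is proximal gradient descent (ISTA), sketched below. The function names and the gradient callback are hypothetical, and the step size and iteration count are arbitrary.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1: elementwise soft-thresholding."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def l1_penalized_estimator(grad_Pn_rho, p, lam, step=0.01, n_iter=5000):
    """ISTA sketch for argmin_beta P_n rho_beta + lam * ||beta||_1.

    grad_Pn_rho: callback beta -> gradient of P_n rho_beta at beta (assumed smooth).
    """
    beta = np.zeros(p)
    for _ in range(n_iter):
        beta = soft_threshold(beta - step * grad_Pn_rho(beta), step * lam)
    return beta

# Squared-error loss recovers the Lasso:
# beta_hat = l1_penalized_estimator(lambda b: X.T @ (X @ b - Y) / len(Y), p, lam)
```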
Similarly, the Lasso for nodewise regression with matrix input is defined as follows. Denote by $\hat{\Sigma}$ the matrix that we want to approximately invert using the nodewise lasso (in the linear model above, $\hat{\Sigma} = X^T X / n$).
The de-sparsified $\ell_1$-norm penalized estimator uses the nodewise lasso coefficients, defined for each $j = 1, \ldots, p$ as

$\hat{\gamma}_j := \underset{\gamma \in \mathbb{R}^{p-1}}{\operatorname{argmin}} \left( \hat{\Sigma}_{j,j} - 2 \hat{\Sigma}_{j, \setminus j} \gamma + \gamma^T \hat{\Sigma}_{\setminus j, \setminus j} \gamma + 2 \lambda_j \left\| \gamma \right\|_1 \right)$

where $\hat{\Sigma}_{j, \setminus j}$ denotes the $j$-th row of $\hat{\Sigma}$ without the diagonal element $(j,j)$, and $\hat{\Sigma}_{\setminus j, \setminus j}$ is the submatrix of $\hat{\Sigma}$ without the $j$-th row and $j$-th column.
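A sketch of one standard way to turn these nodewise regressions into the matrix $M$; the normalization by $\hat{\tau}_j^2 = \hat{\Sigma}_{j,j} - \hat{\Sigma}_{j,\setminus j}\hat{\gamma}_j$ follows the usual relaxed-inverse construction in the de-sparsified lasso literature, and using scikit-learn's Lasso for each nodewise fit is an implementation choice, not from the source:

```python
import numpy as np
from sklearn.linear_model import Lasso

def nodewise_lasso_M(X, lam=0.1):
    """Relaxed inverse of Sigma_hat = X^T X / n via nodewise lasso regressions."""
    n, p = X.shape
    Sigma_hat = X.T @ X / n
    M = np.zeros((p, p))
    for j in range(p):
        others = np.delete(np.arange(p), j)
        # With Sigma_hat = X^T X / n, minimizing the nodewise objective above is
        # equivalent (up to a constant factor of 2) to the Lasso regression of
        # X_j on the remaining columns with alpha = lambda_j.
        gamma_j = Lasso(alpha=lam, fit_intercept=False).fit(X[:, others], X[:, j]).coef_
        tau2_j = Sigma_hat[j, j] - Sigma_hat[j, others] @ gamma_j  # hat{tau}_j^2
        M[j, j] = 1.0 / tau2_j
        M[j, others] = -gamma_j / tau2_j
    return M

# Combined with the earlier sketches: M = nodewise_lasso_M(X);
# b_hat = desparsified_lasso(X, Y, beta_hat, M)
```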