The de-sparsified lasso is a method for constructing confidence intervals and statistical tests for single or low-dimensional components of a large parameter vector in high-dimensional models.[1]
Consider the high-dimensional linear model

$Y = X\beta^0 + \epsilon$

with $n \times p$ design matrix $X =: [X_1, \ldots, X_p]$ (the $X_j$ being the $n \times 1$ columns of $X$), noise $\epsilon \sim \mathcal{N}_n(0, \sigma_\epsilon^2 I)$ independent of $X$, and unknown $p \times 1$ regression vector $\beta^0$.
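As a minimal illustration of this setup, the model can be simulated numerically; the dimensions, sparsity pattern, and noise level below are arbitrary choices for the sketch, not values from the source.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 200             # high-dimensional regime: p > n (illustrative values)
sigma_eps = 1.0             # noise standard deviation (assumed)

X = rng.standard_normal((n, p))           # n x p design matrix
beta0 = np.zeros(p)
beta0[:3] = [2.0, -1.5, 1.0]              # sparse true coefficient vector beta^0
eps = sigma_eps * rng.standard_normal(n)  # epsilon ~ N_n(0, sigma_eps^2 I)
Y = X @ beta0 + eps                       # Y = X beta^0 + epsilon
```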
The usual method to estimate the parameter is the Lasso:
$\hat{\beta}^n(\lambda) = \underset{\beta \in \mathbb{R}^p}{\operatorname{argmin}} \left( \frac{1}{2n} \left\| Y - X\beta \right\|_2^2 + \lambda \left\| \beta \right\|_1 \right)$
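This is exactly the objective minimized by scikit-learn's Lasso, whose alpha parameter plays the role of $\lambda$. A minimal sketch, continuing the simulated X and Y from above:

```python
from sklearn.linear_model import Lasso

# scikit-learn's Lasso minimizes (1/(2n)) * ||Y - X b||_2^2 + alpha * ||b||_1,
# i.e. the objective above with alpha = lambda.
lam = 0.1  # illustrative value; in practice lambda is tuned, e.g. by cross-validation
lasso = Lasso(alpha=lam, fit_intercept=False).fit(X, Y)
beta_hat = lasso.coef_  # hat{beta}^n(lambda)
```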
The de-sparsified lasso is obtained by modifying the Lasso estimator, which fulfills the Karush–Kuhn–Tucker conditions,[2] as follows:
$\hat{\beta}^n(\lambda, M) = \hat{\beta}^n(\lambda) + \frac{1}{n} M X^T \left( Y - X \hat{\beta}^n(\lambda) \right)$

where $M \in \mathbb{R}^{p \times p}$ is an arbitrary matrix. The matrix $M$ is generated using a surrogate inverse covariance matrix.
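Given $M$, the correction term is a single matrix computation on the Lasso residuals. A minimal sketch; the construction of M itself (e.g. by the nodewise lasso of the next section) is assumed to be done elsewhere:

```python
import numpy as np

def desparsified_lasso(X, Y, beta_hat, M):
    """De-sparsified lasso: beta_hat + (1/n) * M @ X.T @ (Y - X @ beta_hat)."""
    n = X.shape[0]
    residual = Y - X @ beta_hat   # Lasso residuals
    return beta_hat + (M @ X.T @ residual) / n
```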
Desparsifying $\ell_1$-norm penalized estimators
Consider the following $1 \times p$ vectors of covariables $x_i \in \mathcal{X} \subset \mathbb{R}^p$ and univariate responses $y_i \in \mathcal{Y} \subset \mathbb{R}$, for $i = 1, \ldots, n$. We have a loss function

$\rho_\beta(y, x) = \rho(y, x\beta) \quad (\beta \in \mathbb{R}^p)$

which is assumed to be convex in $\beta \in \mathbb{R}^p$.
The $\ell_1$-norm penalized estimator is

$\hat{\beta} = \underset{\beta}{\operatorname{argmin}} \left( P_n \rho_\beta + \lambda \left\| \beta \right\|_1 \right)$

where $P_n \rho_\beta = \frac{1}{n} \sum_{i=1}^n \rho_\beta(y_i, x_i)$ denotes the empirical average of the loss. For example, with squared-error loss $\rho(y, x\beta) = (y - x\beta)^2 / 2$, this reduces to the Lasso estimator above.
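The source does not prescribe an algorithm for this generic estimator; for a smooth convex loss, one standard choice is proximal gradient descent (ISTA), sketched below. The function names and the gradient callback are hypothetical, and the step size and iteration count are arbitrary.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1: elementwise soft-thresholding."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def l1_penalized_estimator(grad_Pn_rho, p, lam, step=0.01, n_iter=5000):
    """ISTA sketch for argmin_beta P_n rho_beta + lam * ||beta||_1.

    grad_Pn_rho: callback beta -> gradient of P_n rho_beta at beta (assumed smooth).
    """
    beta = np.zeros(p)
    for _ in range(n_iter):
        beta = soft_threshold(beta - step * grad_Pn_rho(beta), step * lam)
    return beta

# Squared-error loss recovers the Lasso:
# beta_hat = l1_penalized_estimator(lambda b: X.T @ (X @ b - Y) / len(Y), p, lam)
```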
Similarly, the Lasso for nodewise regression with matrix input is defined as follows. Denote by $\hat{\Sigma}$ the matrix that we want to approximately invert using the nodewise lasso (in the linear model above, $\hat{\Sigma} = X^T X / n$).
The de-sparsified $\ell_1$-norm penalized estimator uses the nodewise lasso coefficients, defined for each $j = 1, \ldots, p$ as

$\hat{\gamma}_j := \underset{\gamma \in \mathbb{R}^{p-1}}{\operatorname{argmin}} \left( \hat{\Sigma}_{j,j} - 2 \hat{\Sigma}_{j, \setminus j} \gamma + \gamma^T \hat{\Sigma}_{\setminus j, \setminus j} \gamma + 2 \lambda_j \left\| \gamma \right\|_1 \right)$

where $\hat{\Sigma}_{j, \setminus j}$ denotes the $j$-th row of $\hat{\Sigma}$ without the diagonal element $(j,j)$, and $\hat{\Sigma}_{\setminus j, \setminus j}$ is the submatrix of $\hat{\Sigma}$ without the $j$-th row and $j$-th column.
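A sketch of one standard way to turn these nodewise regressions into the matrix $M$; the normalization by $\hat{\tau}_j^2 = \hat{\Sigma}_{j,j} - \hat{\Sigma}_{j,\setminus j}\hat{\gamma}_j$ follows the usual relaxed-inverse construction in the de-sparsified lasso literature, and using scikit-learn's Lasso for each nodewise fit is an implementation choice, not from the source:

```python
import numpy as np
from sklearn.linear_model import Lasso

def nodewise_lasso_M(X, lam=0.1):
    """Relaxed inverse of Sigma_hat = X^T X / n via nodewise lasso regressions."""
    n, p = X.shape
    Sigma_hat = X.T @ X / n
    M = np.zeros((p, p))
    for j in range(p):
        others = np.delete(np.arange(p), j)
        # With Sigma_hat = X^T X / n, minimizing the nodewise objective above is
        # equivalent (up to a constant factor of 2) to the Lasso regression of
        # X_j on the remaining columns with alpha = lambda_j.
        gamma_j = Lasso(alpha=lam, fit_intercept=False).fit(X[:, others], X[:, j]).coef_
        tau2_j = Sigma_hat[j, j] - Sigma_hat[j, others] @ gamma_j  # hat{tau}_j^2
        M[j, j] = 1.0 / tau2_j
        M[j, others] = -gamma_j / tau2_j
    return M

# Combined with the earlier sketches: M = nodewise_lasso_M(X);
# b_hat = desparsified_lasso(X, Y, beta_hat, M)
```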