Symmetric rank-one explained

The Symmetric Rank 1 (SR1) method is a quasi-Newton method to update the second derivative (Hessian) based on the derivatives (gradients) calculated at two points. It is a generalization of the secant method to a multidimensional problem. This update maintains the symmetry of the matrix but does not guarantee that the updated matrix is positive definite.

The sequence of Hessian approximations generated by the SR1 method converges to the true Hessian under mild conditions, in theory; in practice, the approximate Hessians generated by the SR1 method show faster progress towards the true Hessian than do popular alternatives (BFGS or DFP), in preliminary numerical experiments.^[1] ^[2] The SR1 method has computational advantages for sparse or partially separable problems.^[3]

A twice continuously differentiable function $x \mapsto f(x)$ has a gradient ($\nabla f$) and Hessian matrix $B$. The function $f$ has an expansion as a Taylor series at $x_0$, which can be truncated:

$$f(x_0 + \Delta x) \approx f(x_0) + \nabla f(x_0)^T \Delta x + \frac{1}{2} \Delta x^T B \Delta x;$$

its gradient has a Taylor-series approximation also,

$$\nabla f(x_0 + \Delta x) \approx \nabla f(x_0) + B \Delta x,$$

which is used to update $B$. The above secant equation need not have a unique solution $B$. The SR1 formula computes (via an update of rank 1) the symmetric solution that is closest to the current approximate value $B_k$:

$$B_{k+1} = B_k + \frac{(y_k - B_k \Delta x_k)(y_k - B_k \Delta x_k)^T}{(y_k - B_k \Delta x_k)^T \Delta x_k},$$

where

$$y_k = \nabla f(x_k + \Delta x_k) - \nabla f(x_k).$$

The corresponding update to the approximate inverse Hessian $H_k = B_k^{-1}$ is

$$H_{k+1} = H_k + \frac{(\Delta x_k - H_k y_k)(\Delta x_k - H_k y_k)^T}{(\Delta x_k - H_k y_k)^T y_k}.$$
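The two update formulas can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function names `sr1_update` and `sr1_inverse_update`, the test matrix `A`, and the choice of unit-vector steps are all assumptions made for the example. Here `s` stands for the step $\Delta x_k$ and `y` for the gradient difference $y_k$.

```python
import numpy as np

def sr1_update(B, s, y):
    """SR1 update of the Hessian approximation B (no safeguard against a zero denominator)."""
    u = y - B @ s                      # residual of the secant equation B s = y
    return B + np.outer(u, u) / (u @ s)

def sr1_inverse_update(H, s, y):
    """SR1 update of the inverse-Hessian approximation H (no safeguard)."""
    w = s - H @ y                      # residual of the inverse secant equation H y = s
    return H + np.outer(w, w) / (w @ y)

# Illustrative quadratic f(x) = 0.5 x^T A x with gradient A x (A is made up for the demo).
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 5.0]])

def grad(x):
    return A @ x

B = np.eye(3)                          # initial Hessian approximation
H = np.eye(3)                          # initial inverse approximation, H = B^{-1}
x = np.zeros(3)
for s in np.eye(3):                    # three linearly independent steps
    y = grad(x + s) - grad(x)          # gradient difference y_k
    B = sr1_update(B, s, y)
    H = sr1_inverse_update(H, s, y)
    x = x + s

print(np.allclose(B, A))               # True: B matches the exact Hessian here
print(np.allclose(H @ A, np.eye(3)))   # True: H matches its inverse
```

On this particular quadratic, three linearly independent steps are enough for `B` to reproduce the exact Hessian, and `H` stays equal to the inverse of `B` because both were started from the identity.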

One might wonder why positive-definiteness is not preserved; after all, a rank-1 update of the form $B_{k+1} = B_k + vv^T$ is positive-definite if $B_k$ is. The explanation is that the update might be of the form $B_{k+1} = B_k - vv^T$ instead, because the denominator can be negative, and in that case there are no guarantees about positive-definiteness.
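A small numerical case makes the loss of positive-definiteness concrete. The values below are chosen purely for illustration (they are not from the cited sources): the current approximation is the identity, and the gradient difference `y` points against the step `s`, as it would along a direction of negative curvature.

```python
import numpy as np

B = np.eye(2)                        # positive-definite starting approximation
s = np.array([1.0, 0.0])             # step Delta x_k
y = np.array([-1.0, 0.0])            # gradient difference with s^T y < 0

u = y - B @ s                        # u = (-2, 0)
denom = u @ s                        # -2.0, so the update has the form B - v v^T
B_next = B + np.outer(u, u) / denom  # [[-1, 0], [0, 1]]

eigs = np.linalg.eigvalsh(B_next)    # eigenvalues in ascending order: [-1, 1]
print(denom, eigs)
print(np.allclose(B_next @ s, y))    # True: the secant equation still holds
```

The updated matrix satisfies the secant equation but has a negative eigenvalue, so it is indefinite. A negative denominator does not always produce an indefinite matrix; it only removes the guarantee.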

The SR1 formula has been rediscovered a number of times. Since the denominator can vanish, some authors have suggested that the update be applied only if

$$|\Delta x_k^T (y_k - B_k \Delta x_k)| \geq r \|\Delta x_k\| \cdot \|y_k - B_k \Delta x_k\|,$$

where $r \in (0,1)$ is a small number, e.g. $10^{-8}$.^[4]
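The safeguard can be sketched as a wrapper around the plain update. The function name `sr1_update_safeguarded` and the returned flag are hypothetical, introduced only for this example. The extra check for a zero threshold is an assumption added here so that the case $y_k = B_k \Delta x_k$ (where the formula would be 0/0) leaves the matrix unchanged.

```python
import numpy as np

def sr1_update_safeguarded(B, s, y, r=1e-8):
    """Apply the SR1 update only when the denominator is safely away from zero.

    Returns (matrix, applied), where applied is False if the update was skipped.
    """
    u = y - B @ s
    denom = u @ s
    threshold = r * np.linalg.norm(s) * np.linalg.norm(u)
    # Skip when |s^T u| < r ||s|| ||u||; also skip the degenerate case u = 0 or s = 0.
    if abs(denom) < threshold or threshold == 0.0:
        return B, False
    return B + np.outer(u, u) / denom, True

B = np.eye(2)
s = np.array([1.0, 0.0])

# Residual orthogonal to the step: the denominator vanishes, so the update is skipped.
B_skip, applied_skip = sr1_update_safeguarded(B, s, np.array([1.0, 1.0]))

# Well-scaled denominator: the update is applied.
B_ok, applied_ok = sr1_update_safeguarded(B, s, np.array([3.0, 0.0]))
print(applied_skip, applied_ok)   # False True
```

Skipping keeps the previous approximation, which is a simple choice for a sketch; a full solver may handle the skipped step differently.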

Limited Memory

The SR1 update maintains a dense matrix, which can be prohibitive for large problems. As with the L-BFGS method, a limited-memory SR1 (L-SR1) algorithm also exists.^[5] Instead of storing the full Hessian approximation, an L-SR1 method stores only the $m$ most recent pairs $\{(s_i, y_i)\}_{i=k-m}^{k-1}$, where $\Delta x_i := s_i$ and $m$ is an integer much smaller than the problem size ($m \ll n$). The limited-memory matrix is based on a compact matrix representation that uses the following matrices:

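$$S_k = \begin{bmatrix} s_{k-m} & s_{k-m+1} & \ldots & s_{k-1} \end{bmatrix}, \quad Y_k = \begin{bmatrix} y_{k-m} & y_{k-m+1} & \ldots & y_{k-1} \end{bmatrix},$$

$$\big(L_k\big)_{ij} = s_i^T y_j \ \text{ for } i > j \ \text{ (and } 0 \text{ otherwise)}, \quad D_k = \operatorname{diag}\big(s_i^T y_i\big), \quad k-m \le i, j \le k-1$$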
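To show how these matrices fit together, the sketch below assumes the commonly stated compact form $B_k = B_0 + J_k N_k^{-1} J_k^T$ with $J_k = Y_k - B_0 S_k$ and $N_k = D_k + L_k + L_k^T - S_k^T B_0 S_k$. That formula is not given in the text above, so treat it as an assumption of this example. The function names and the random test data are also made up. The code compares the compact form with applying the rank-one update once per stored pair.

```python
import numpy as np

def sr1_compact(B0, S, Y):
    """Dense B_k from the assumed compact form B0 + J N^{-1} J^T."""
    SY = S.T @ Y
    D = np.diag(np.diag(SY))             # diagonal entries s_i^T y_i
    L = np.tril(SY, k=-1)                # strictly lower triangle: s_i^T y_j for i > j
    J = Y - B0 @ S
    N = D + L + L.T - S.T @ B0 @ S
    return B0 + J @ np.linalg.solve(N, J.T)

def sr1_recursive(B0, S, Y):
    """Dense B_k from one rank-one SR1 update per stored pair (oldest first)."""
    B = B0.copy()
    for s, y in zip(S.T, Y.T):
        u = y - B @ s
        B = B + np.outer(u, u) / (u @ s)
    return B

rng = np.random.default_rng(0)
n, m = 6, 3
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)              # illustrative SPD Hessian of a quadratic
S = rng.standard_normal((n, m))          # columns are the steps s_i
Y = A @ S                                # columns are the gradient differences y_i
B0 = np.eye(n)

B_compact = sr1_compact(B0, S, Y)
B_recursive = sr1_recursive(B0, S, Y)
print(np.allclose(B_compact, B_recursive))
```

The dense matrix is formed here only to make the comparison possible; a limited-memory method would keep $S_k$, $Y_k$ and the small $m \times m$ matrix instead.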

Since the update can be indefinite, the L-SR1 algorithm is suitable for a trust-region strategy. Because of the limited-memory matrix, the trust-region L-SR1 algorithm scales linearly with the problem size, just like L-BFGS.
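The linear scaling can be illustrated with a matrix-free product $B_k v$. The sketch assumes an initial matrix $B_0 = \gamma I$ and the compact form $B_k = B_0 + J_k N_k^{-1} J_k^T$, with $J_k = Y_k - B_0 S_k$ and $N_k = D_k + L_k + L_k^T - S_k^T B_0 S_k$, where $L_k$ and $D_k$ are the strictly lower triangular and diagonal parts of $S_k^T Y_k$. Both the scaled-identity choice and the helper name `lsr1_matvec` are assumptions made for this example. It covers only the product, not the trust-region subproblem solver.

```python
import numpy as np

def lsr1_matvec(gamma, S, Y, v):
    """Compute B v for B = gamma*I + J N^{-1} J^T without forming the n x n matrix.

    S and Y are n x m arrays of stored pairs. The cost is O(m^2 n) for the small
    matrices plus O(m^3) for the solve, so it grows linearly with n for fixed m.
    """
    SY = S.T @ Y                                    # m x m
    L = np.tril(SY, k=-1)                           # strictly lower triangle of S^T Y
    N = np.diag(np.diag(SY)) + L + L.T - gamma * (S.T @ S)
    J = Y - gamma * S                               # n x m
    return gamma * v + J @ np.linalg.solve(N, J.T @ v)

rng = np.random.default_rng(1)
n, m = 40, 4
M = rng.standard_normal((n, n))
A = M @ M.T + 2.0 * np.eye(n)                       # illustrative SPD Hessian
S = rng.standard_normal((n, m))
Y = A @ S
v = rng.standard_normal(n)

Bv = lsr1_matvec(1.0, S, Y, v)
print(Bv.shape)                                     # (40,)
```

Only the two $n \times m$ arrays and an $m \times m$ system are touched, which is where the memory and time savings over the dense update come from.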

Notes and References

[1] Conn, A. R.; Gould, N. I. M.; Toint, Ph. L. (March 1991). "Convergence of quasi-Newton matrices generated by the symmetric rank one update". Mathematical Programming. 50 (1). Springer Berlin/Heidelberg: 177–195. doi:10.1007/BF01594934. ISSN 0025-5610. S2CID 28028770.
[2] Khalfan, H. Fayez; Byrd, R. H.; Schnabel, R. B. (1993). "A Theoretical and Experimental Study of the Symmetric Rank-One Update". SIAM Journal on Optimization. 3 (1): 1–24. doi:10.1137/0803001.
[3] Byrd, Richard H.; Khalfan, Humaid Fayez; Schnabel, Robert B. (1996). "Analysis of a Symmetric Rank-One Trust Region Method". SIAM Journal on Optimization. 6 (4): 1025–1039. doi:10.1137/S1052623493252985.
[4] Nocedal, Jorge; Wright, Stephen J. (1999). Numerical Optimization. Springer. ISBN 0-387-98793-2.
[5] Brust, J.; Erway, J. B.; Marcia, R. F. (2017). "On solving L-SR1 trust-region subproblems". Computational Optimization and Applications. 66: 245–266. doi:10.1007/s10589-016-9868-3.
