In mathematics, mirror descent is an iterative optimization algorithm for finding a local minimum of a differentiable function.
It generalizes algorithms such as gradient descent and multiplicative weights.
Mirror descent was originally proposed by Nemirovski and Yudin in 1983.[1]
In gradient descent with the sequence of learning rates (\eta_n)_{n \ge 0} applied to a differentiable function F, one starts with a guess x_0 for a local minimum of F and considers the sequence x_0, x_1, x_2, \ldots such that

x_{n+1} = x_n - \eta_n \nabla F(x_n), \quad n \ge 0.
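This update rule can be sketched in a few lines; the quadratic objective F(x) = \|x\|^2/2, whose gradient is simply x, is a hypothetical example chosen for illustration:

```python
import numpy as np

def gradient_descent(grad_F, x0, eta=0.1, n_steps=100):
    """Iterate x_{n+1} = x_n - eta * grad_F(x_n) with a constant learning rate."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        x = x - eta * grad_F(x)
    return x

# Example: F(x) = ||x||^2 / 2 has gradient grad_F(x) = x and minimum at the origin.
x_min = gradient_descent(lambda x: x, x0=[3.0, -2.0])
```

With these settings the iterates contract toward the origin geometrically, since each step multiplies x by (1 - eta).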
This can be reformulated by noting that

x_{n+1} = \arg\min_x \left( F(x_n) + \nabla F(x_n)^T (x - x_n) + \frac{1}{2\eta_n} \|x - x_n\|^2 \right).
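Setting the gradient of the minimized expression with respect to x to zero confirms that this reformulation recovers the gradient-descent step:

```latex
\nabla F(x_n) + \frac{1}{\eta_n}\,(x - x_n) = 0
\quad\Longrightarrow\quad
x = x_n - \eta_n \nabla F(x_n).
```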
In other words, x_{n+1} minimizes the first-order approximation to F at x_n with an added proximity term \|x - x_n\|^2.
This squared Euclidean distance term is a particular example of a Bregman distance. Using other Bregman distances will yield other algorithms, such as Hedge, which may be better suited to optimization over particular geometries.[2][3]
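A Bregman distance D_h(x, y) = h(x) - h(y) - \langle \nabla h(y), x - y \rangle can be computed generically; the following minimal sketch checks the fact used above, that the choice h(x) = \|x\|^2/2 recovers (half) the squared Euclidean distance:

```python
import numpy as np

def bregman(h, grad_h, x, y):
    """Bregman distance D_h(x, y) = h(x) - h(y) - <grad h(y), x - y>."""
    return h(x) - h(y) - np.dot(grad_h(y), x - y)

# With h(x) = ||x||^2 / 2, D_h(x, y) equals ||x - y||^2 / 2.
h = lambda x: 0.5 * np.dot(x, x)
grad_h = lambda x: x
x = np.array([1.0, 2.0])
y = np.array([0.0, -1.0])
d = bregman(h, grad_h, x, y)  # equals 0.5 * ||x - y||^2 = 5.0 here
```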
We are given a convex function f to optimize over a convex set K \subset \mathbb{R}^n, and a norm \|\cdot\| on \mathbb{R}^n.
We are also given a differentiable convex function h \colon \mathbb{R}^n \to \mathbb{R} that is \alpha-strongly convex with respect to the given norm. Its gradient \nabla h \colon \mathbb{R}^n \to \mathbb{R}^n is known as the mirror map.
Starting from an initial x_0 \in K, each iteration of mirror descent performs:
- Map to the dual space: \theta_t \leftarrow \nabla h(x_t).
- Take a gradient step in the dual space: \theta_{t+1} \leftarrow \theta_t - \eta_t \nabla f(x_t).
- Map back to the primal space: x'_{t+1} \leftarrow (\nabla h)^{-1}(\theta_{t+1}).
- Project back onto the feasible set K: x_{t+1} \leftarrow \arg\min_{x \in K} D_h(x \,\|\, x'_{t+1}), where D_h is the Bregman divergence associated with h.
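The four steps above can be sketched concretely on the probability simplex with the negative-entropy distance-generating function h(x) = \sum_i x_i \log x_i, for which the Bregman projection onto the simplex reduces to renormalization (this is the classical setting that yields multiplicative-weights-style updates). The linear objective f(x) = c^T x is a hypothetical example; the dual map uses \log x, which matches \nabla h up to an additive constant that the normalization cancels:

```python
import numpy as np

def mirror_descent_simplex(grad_f, x0, eta=0.1, n_steps=200):
    """Mirror descent on the probability simplex with the entropy mirror map."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        theta = np.log(x)                 # map to the dual space (up to a constant)
        theta = theta - eta * grad_f(x)   # gradient step in the dual space
        x = np.exp(theta)                 # map back to the primal space
        x = x / x.sum()                   # Bregman projection onto the simplex
    return x

# Hypothetical linear costs: the iterates concentrate on the cheapest coordinate.
c = np.array([0.3, 0.1, 0.5])
x_star = mirror_descent_simplex(lambda x: c, np.ones(3) / 3)
```

Because the update multiplies each coordinate by \exp(-\eta c_i) and renormalizes, mass accumulates on the coordinate with the smallest cost.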
Mirror descent in the online optimization setting is known as Online Mirror Descent (OMD).[4]