This article describes Lyapunov optimization for dynamical systems. It gives an example application to optimal control in queueing networks.
Lyapunov optimization refers to the use of a Lyapunov function to optimally control a dynamical system. Lyapunov functions are used extensively in control theory to ensure different forms of system stability. The state of a system at a particular time is often described by a multi-dimensional vector. A Lyapunov function is a nonnegative scalar measure of this multi-dimensional state. Typically, the function is defined to grow large when the system moves towards undesirable states. System stability is achieved by taking control actions that make the Lyapunov function drift in the negative direction towards zero.
Lyapunov drift is central to the study of optimal control in queueing networks. A typical goal is to stabilize all network queues while optimizing some performance objective, such as minimizing average energy or maximizing average throughput. Minimizing the drift of a quadratic Lyapunov function leads to the backpressure routing algorithm for network stability, also called the max-weight algorithm.[1] [2] Adding a weighted penalty term to the Lyapunov drift and minimizing the sum leads to the drift-plus-penalty algorithm for joint network stability and penalty minimization.[3] [4] [5] The drift-plus-penalty procedure can also be used to compute solutions to convex programs and linear programs.[6]
Consider a queueing network that evolves in discrete time with normalized time slots t\in\{0,1,2,\ldots\}. Suppose there are N queues in the network, and define the vector of queue backlogs at time t by:

Q(t)=(Q_1(t),\ldots,Q_N(t)).

For each slot t, define:

L(t)=\frac{1}{2}\sum_{i=1}^{N}Q_i(t)^2
This function is a scalar measure of the total queue backlog in the network. It is called a quadratic Lyapunov function on the queue state. Define the Lyapunov drift as the change in this function from one slot to the next:
\Delta L(t)=L(t+1)-L(t)
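As a quick illustration, here is a minimal Python sketch that computes L(t) and the one-slot drift; the queue values are hypothetical numbers chosen only for the example:

# Minimal sketch: quadratic Lyapunov function and one-slot drift.
# The queue vectors are hypothetical values used only for illustration.

def lyapunov(Q):
    # L(t) = (1/2) * sum_i Q_i(t)^2 for a list of queue backlogs Q
    return 0.5 * sum(q * q for q in Q)

Q_now = [3.0, 1.0, 4.0]    # Q(t)
Q_next = [2.0, 2.0, 3.0]   # Q(t+1)

drift = lyapunov(Q_next) - lyapunov(Q_now)   # Delta L(t) = L(t+1) - L(t)
print(drift)                                 # negative: backlog moved toward zero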
Suppose the queue backlogs change over time according to the following equation:
Q_i(t+1)=\max\left\{Q_i(t)+a_i(t)-b_i(t),0\right\}
where a_i(t) and b_i(t) are, respectively, the arrivals to and the service offered to queue i on slot t.
This update equation can be used to bound the Lyapunov drift on any slot t. Squaring both sides gives, for each queue i:

Q_i(t+1)^2=\left(\max\left\{Q_i(t)+a_i(t)-b_i(t),0\right\}\right)^2\leqslant\left(Q_i(t)+a_i(t)-b_i(t)\right)^2
Rearranging this inequality, summing over all i, and dividing by 2 gives the following bound on the Lyapunov drift:

\Delta L(t)\leqslant B(t)+\sum_{i=1}^{N}Q_i(t)\left(a_i(t)-b_i(t)\right) \qquad (Eq. 1)
where:

B(t)=\frac{1}{2}\sum_{i=1}^{N}\left(a_i(t)-b_i(t)\right)^2
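As a concrete check of (Eq. 1), the following sketch plugs hypothetical backlog, arrival, and service values into the queue update and verifies the inequality numerically:

# Numerical check of Eq. 1:
#   Delta L(t) <= B(t) + sum_i Q_i(t) * (a_i(t) - b_i(t)).
# All numbers below are hypothetical and used only for illustration.

def lyapunov(Q):
    return 0.5 * sum(q * q for q in Q)

Q = [3.0, 1.0, 4.0]   # backlogs Q_i(t)
a = [1.0, 2.0, 0.0]   # arrivals a_i(t)
b = [2.0, 1.0, 3.0]   # service opportunities b_i(t)

Q_next = [max(q + ai - bi, 0.0) for q, ai, bi in zip(Q, a, b)]

drift = lyapunov(Q_next) - lyapunov(Q)
B_t = 0.5 * sum((ai - bi) ** 2 for ai, bi in zip(a, b))
bound = B_t + sum(q * (ai - bi) for q, ai, bi in zip(Q, a, b))

assert drift <= bound + 1e-12   # Eq. 1 holds for these values
print(drift, bound)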
Suppose the second moments of arrivals and service in each queue are bounded, so that there is a finite constant B>0 such that for all t and all possible queue vectors Q(t) the following property holds:

E[B(t)|Q(t)]\leqslant B
Taking conditional expectations of (Eq. 1) leads to the following bound on the conditional expected Lyapunov drift:
E[\Delta L(t)|Q(t)]\leqslant B+\sum_{i=1}^{N}Q_i(t)E[a_i(t)-b_i(t)|Q(t)] \qquad (Eq. 2)
In many cases, the network can be controlled so that the difference between arrivals and service at each queue satisfies the following property for some real number \varepsilon>0:

E[a_i(t)-b_i(t)|Q(t)]\leqslant-\varepsilon
If the above holds for the same \varepsilon for all queues i, all slots t, and all possible vectors Q(t), then (Eq. 2) reduces to the drift condition used in the following Lyapunov drift theorem.
Theorem (Lyapunov Drift).[5] [7] Suppose there are constants B\geqslant0 and \varepsilon>0 such that for all slots t and all possible vectors Q(t), the conditional Lyapunov drift satisfies:

E[\Delta L(t)|Q(t)]\leqslant B-\varepsilon\sum_{i=1}^{N}Q_i(t).
Then for all slots t>0, the time average queue size in the network satisfies:

\frac{1}{t}\sum_{\tau=0}^{t-1}\sum_{i=1}^{N}E[Q_i(\tau)]\leqslant\frac{B}{\varepsilon}+\frac{E[L(0)]}{\varepsilon t}.
Proof. Taking expectations of both sides of the drift inequality and using the law of iterated expectations yields:
E[\Delta L(t)]\leqslant B-\varepsilon\sum_{i=1}^{N}E[Q_i(t)]
Summing the above expression over \tau\in\{0,1,\ldots,t-1\} and using the fact that the expected drifts telescope gives:

E[L(t)]-E[L(0)]\leqslant Bt-\varepsilon\sum_{\tau=0}^{t-1}\sum_{i=1}^{N}E[Q_i(\tau)]
Using the fact that L(t) is non-negative, rearranging terms, and dividing by \varepsilon t proves the result.
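The theorem can be illustrated by simulating a single queue for which the drift condition holds with \varepsilon=0.1, e.g., Bernoulli arrivals of rate 0.3 and Bernoulli service of rate 0.4; these parameters are illustrative choices for the sketch, not values from the article:

import random

# Illustrative simulation of the Lyapunov drift theorem for one queue.
# Arrivals are Bernoulli(0.3) and service is Bernoulli(0.4), so
# E[a(t) - b(t)] = -0.1 = -epsilon. With Q(0) = 0 the time average
# backlog should stay below B/epsilon for large t.

random.seed(0)
Q, total, T = 0.0, 0.0, 100000
for t in range(T):
    total += Q
    a = 1.0 if random.random() < 0.3 else 0.0
    b = 1.0 if random.random() < 0.4 else 0.0
    Q = max(Q + a - b, 0.0)

B = 0.5            # crude bound: (1/2) E[(a - b)^2 | Q] <= 1/2
epsilon = 0.1
print("time average backlog:", total / T)
print("theorem bound B/epsilon:", B / epsilon)

Here B=1/2 is a coarse bound on (1/2)E[(a(t)-b(t))^2], so the theorem guarantees a time average backlog of at most 5 plus a vanishing transient term, and the simulated average should fall comfortably inside that bound.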
Consider the same queueing network as in the above section. Now define p(t) as a network penalty incurred on slot t. Suppose the goal is to stabilize the queueing network while minimizing the time average of p(t). For example, to stabilize the network while minimizing time average power, p(t) can be defined as the total power expended by the network on slot t. To treat problems of maximizing the time average of some reward r(t), the penalty can be defined as p(t)=-r(t).
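The reward-to-penalty conversion is only a sign change, as the following trivial sketch shows; the action names and reward values are hypothetical placeholders:

# Hypothetical sketch: maximizing a reward r over a set of actions is the
# same as minimizing the penalty p = -r. Action rewards are placeholders.
rewards = {"idle": 0.0, "low": 1.5, "high": 2.0}

best_by_reward = max(rewards, key=lambda k: rewards[k])    # maximize r
best_by_penalty = min(rewards, key=lambda k: -rewards[k])  # minimize p = -r
assert best_by_reward == best_by_penalty
print(best_by_reward)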
To stabilize the network while minimizing the time average of the penalty p(t), networks can be designed to take control actions that minimize a bound on the following drift-plus-penalty expression on each slot t:

\Delta L(t)+Vp(t)

where V is a non-negative weight chosen as desired to affect a performance tradeoff. Choosing V=0 amounts to minimizing a bound on the drift alone, which ensures queue stability but ignores the penalty. Choosing V>0 includes the weighted penalty p(t) in the decision made on every slot t; a larger value of V>0 places more emphasis on minimizing the penalty p(t), at the cost of a larger average queue backlog.
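As a sketch of how such a bound can be minimized slot by slot, the following hypothetical single-queue power-allocation example chooses, on each slot, the power level that minimizes Vp(t)+Q(t)(a(t)-b(t)), where p(t) is the power spent and b(t) is the service it buys; the power/service table and arrival rate are made-up values, and the code is only an instance of the general drift-plus-penalty idea under those assumptions:

import random

# Drift-plus-penalty sketch for a single queue with a power/service tradeoff.
# Each slot, choose the power level minimizing V*p + Q*(a - b).
# The power/service table and arrival rate are hypothetical.

POWER_LEVELS = [(0.0, 0.0), (1.0, 0.5), (2.0, 0.8)]   # (power p, service b)

def drift_plus_penalty_choice(Q, a, V):
    # Minimize the per-slot bound V*p + Q*(a - b) over the available actions.
    return min(POWER_LEVELS, key=lambda pb: V * pb[0] + Q * (a - pb[1]))

random.seed(1)
V, Q, T = 10.0, 0.0, 50000
avg_power = avg_queue = 0.0
for t in range(T):
    a = 1.0 if random.random() < 0.3 else 0.0   # Bernoulli(0.3) arrivals
    p, b = drift_plus_penalty_choice(Q, a, V)
    avg_power += p / T
    avg_queue += Q / T
    Q = max(Q + a - b, 0.0)

print("average power:", avg_power, "average queue:", avg_queue)

Since a(t) enters the per-slot objective only as an additive constant, the chosen action depends only on the current backlog Q(t) and the weight V; no advance knowledge of the arrival statistics is needed by this rule.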
A generalization of the Lyapunov drift theorem of the previous section is important in this context. For simplicity of exposition, assume the penalty p(t) is bounded from below:

p(t)\geqslant p_{\min} \quad \forall t\in\{0,1,2,\ldots\}

For example, the above is satisfied with p_{\min}=0 whenever the penalty p(t) is non-negative. Let p^* represent a target value for the time average of p(t). The parameter V can be chosen to push the achieved time average penalty arbitrarily close to (or below) p^*, at the cost of an average queue size that grows with V.
Theorem (Lyapunov Optimization). Suppose there are constants \varepsilon>0, V>0, B\geqslant0, and p^* such that for all slots t and all possible vectors Q(t), the following drift condition holds:

E[\Delta L(t)+Vp(t)|Q(t)]\leqslant B+Vp^*-\varepsilon\sum_{i=1}^{N}Q_i(t)
Then for all t>0, the time average penalty and the time average queue size satisfy:

\frac{1}{t}\sum_{\tau=0}^{t-1}E[p(\tau)]\leqslant p^*+\frac{B}{V}+\frac{E[L(0)]}{Vt}

\frac{1}{t}\sum_{\tau=0}^{t-1}\sum_{i=1}^{N}E[Q_i(\tau)]\leqslant\frac{B+V(p^*-p_{\min})}{\varepsilon}+\frac{E[L(0)]}{\varepsilon t}
Proof. Taking expectations of both sides of the posited drift-plus-penalty inequality and using the law of iterated expectations, we have:
E[\Delta L(t)]+VE[p(t)]\leqslant B+Vp^*-\varepsilon\sum_{i=1}^{N}E[Q_i(t)]
Summing the above over the first t slots \tau\in\{0,1,\ldots,t-1\} and using the fact that the expected drifts telescope gives:

\begin{align}
E[L(t)]-E[L(0)]+V\sum_{\tau=0}^{t-1}E[p(\tau)] &\leqslant (B+Vp^*)t-\varepsilon\sum_{\tau=0}^{t-1}\sum_{i=1}^{N}E[Q_i(\tau)] \\
-E[L(0)]+V\sum_{\tau=0}^{t-1}E[p(\tau)] &\leqslant (B+Vp^*)t && \text{since } L(t),Q_i(t)\geqslant0 \\
V\sum_{\tau=0}^{t-1}E[p(\tau)] &\leqslant p^*Vt+Bt+E[L(0)]
\end{align}
Dividing the last inequality by Vt proves the time average penalty bound. The time average queue size bound follows similarly from the first inequality, using Vp(\tau)\geqslant Vp_{\min}, the fact that E[L(t)]\geqslant0, and dividing by \varepsilon t.
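These bounds exhibit an O(1/V) versus O(V) tradeoff: the time average penalty exceeds p^* by at most B/V plus a vanishing term, while the time average queue bound grows linearly in V. The tradeoff can be explored empirically by re-running the hypothetical power-allocation sketch from the drift-plus-penalty discussion above for several values of V:

import random

# Empirical look at the O(1/V) penalty / O(V) backlog tradeoff using the
# hypothetical single-queue power-allocation sketch from above.

POWER_LEVELS = [(0.0, 0.0), (1.0, 0.5), (2.0, 0.8)]   # (power p, service b)

def simulate(V, T=50000, arrival_rate=0.3, seed=1):
    random.seed(seed)
    Q = avg_power = avg_queue = 0.0
    for t in range(T):
        a = 1.0 if random.random() < arrival_rate else 0.0
        p, b = min(POWER_LEVELS, key=lambda pb: V * pb[0] + Q * (a - pb[1]))
        avg_power += p / T
        avg_queue += Q / T
        Q = max(Q + a - b, 0.0)
    return avg_power, avg_queue

for V in [1.0, 10.0, 100.0]:
    power, queue = simulate(V)
    print(f"V={V}: average power={power:.3f}, average queue={queue:.1f}")

For these made-up numbers the smallest average power able to support the 0.3 arrival rate is 0.6, and the simulated average power should settle near that value as V grows, while the average backlog grows roughly linearly in V, consistent with the two bounds above.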