MRF optimization via dual decomposition

In dual decomposition, a problem is broken into smaller subproblems, and a solution to the relaxed problem is recovered by combining the solutions of these subproblems. The method can be employed for MRF optimization.[1] Dual decomposition has also been applied to Markov logic networks as an inference technique.[2]

Background

Discrete MRF optimization (inference) is an important problem in machine learning and computer vision, and it can be implemented on CUDA graphics processing units.[3] Consider a graph G = (V, E) with nodes V and edges E. The goal is to assign a label l_p to each p \in V so that the MRF energy is minimized:

(1)  \min \sum_{p \in V} \theta_p(l_p) + \sum_{pq \in E} \theta_{pq}(l_p, l_q)
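
As a concrete illustration of (1), here is a minimal sketch (in Python) that evaluates the energy of a given labeling from unary potentials \theta_p and pairwise potentials \theta_{pq}; the dictionary layout and names are illustrative assumptions, not part of the formulation.

  # Minimal sketch: evaluate the MRF energy (1) of a fixed labeling.
  # The dict-based data layout is an illustrative assumption.
  def mrf_energy(labels, unary, pairwise):
      """labels:   {p: l_p}
         unary:    {p: {l: theta_p(l)}}
         pairwise: {(p, q): {(l, m): theta_pq(l, m)}}"""
      energy = sum(unary[p][labels[p]] for p in labels)            # sum_p theta_p(l_p)
      energy += sum(pairwise[(p, q)][(labels[p], labels[q])]       # sum_pq theta_pq(l_p, l_q)
                    for (p, q) in pairwise)
      return energy

  # Tiny example: one edge (a, b), two labels {0, 1}.
  unary = {"a": {0: 0.0, 1: 1.0}, "b": {0: 0.5, 1: 0.0}}
  pairwise = {("a", "b"): {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 0.0}}
  print(mrf_energy({"a": 0, "b": 0}, unary, pairwise))             # 0.5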

Major MRF optimization methods are based on graph cuts or message passing. They rely on the following integer linear programming formulation:

(2)  \min_x E(\theta, x) = \theta \cdot x = \sum_p \theta_p \cdot x_p + \sum_{pq} \theta_{pq} \cdot x_{pq}

In many applications, the MRF-variables are \{0, 1\}-variables that satisfy: x_p(l) = 1 \Leftrightarrow label l is assigned to p, while x_{pq}(l, l') = 1 \Leftrightarrow labels l, l' are assigned to p, q.
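
The following sketch (again illustrative: the variable names and data layout are assumptions) builds these \{0, 1\} indicator vectors for a labeling and confirms that the dot product \theta \cdot x in (2) reproduces the energy (1).

  # Minimal sketch of the {0,1} indicator encoding used in formulation (2).
  labels_set = [0, 1]
  unary = {"a": {0: 0.0, 1: 1.0}, "b": {0: 0.5, 1: 0.0}}
  pairwise = {("a", "b"): {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 0.0}}
  assignment = {"a": 0, "b": 0}

  # x_p(l) = 1 iff label l is assigned to p; x_pq(l, m) = 1 iff (l, m) is assigned to (p, q).
  x_p = {(p, l): int(assignment[p] == l) for p in unary for l in labels_set}
  x_pq = {(p, q, l, m): int(assignment[p] == l and assignment[q] == m)
          for (p, q) in pairwise for l in labels_set for m in labels_set}

  # E(theta, x) = theta . x = sum_p theta_p . x_p + sum_pq theta_pq . x_pq
  energy = sum(unary[p][l] * v for (p, l), v in x_p.items())
  energy += sum(pairwise[(p, q)][(l, m)] * v for (p, q, l, m), v in x_pq.items())
  print(energy)  # 0.5, matching a direct evaluation of (1)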

Dual Decomposition

The main idea behind decomposition is surprisingly simple:

  1. decompose your original complex problem into smaller solvable subproblems,
  2. extract a solution by cleverly combining the solutions from these subproblems.

A sample problem to decompose:

\min_x \sum_i f^i(x) \quad \text{where } x \in C

In this problem, separately minimizing every single f^i(x) over x is easy, but minimizing their sum is a complex problem. So the problem is decomposed using auxiliary variables \{x^i\}, and it becomes:

\min_{\{x^i\}, x} \sum_i f^i(x^i) \quad \text{where } x^i \in C, \; x^i = x

Now we can relax the coupling constraints x^i = x with multipliers \{\lambda^i\}, which gives us the following Lagrangian dual function:

g(\{\lambda^i\}) = \min_{\{x^i \in C\}, x} \sum_i f^i(x^i) + \sum_i \lambda^i \cdot (x^i - x) = \min_{\{x^i \in C\}, x} \sum_i [f^i(x^i) + \lambda^i \cdot x^i] - \Big(\sum_i \lambda^i\Big) \cdot x

Now we eliminate x from the dual function by minimizing over it. Because x is unconstrained, this minimum is -\infty unless \sum_i \lambda^i = 0, so the multipliers are restricted to the set \Lambda = \{\{\lambda^i\} : \sum_i \lambda^i = 0\} and the dual function becomes:

g(\{\lambda^i\}) = \min_{\{x^i \in C\}} \sum_i [f^i(x^i) + \lambda^i \cdot x^i]

We can now set up the Lagrangian dual problem:

(3)  \max_{\{\lambda^i\} \in \Lambda} g(\{\lambda^i\}) = \sum_i g^i(\lambda^i)

The Master problem

(4)  g^i(\lambda^i) = \min_{x^i} f^i(x^i) + \lambda^i \cdot x^i \quad \text{where } x^i \in C

The Slave problems
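
To make the interplay between master and slaves concrete, here is a minimal sketch of dual decomposition with a projected subgradient master for a toy instance in which C is a small finite set and each slave (4) is solved by enumeration. The feasible set, the objectives f^i, and the helper names (solve_slave, lams) are all illustrative assumptions, not part of the original formulation.

  # Minimal sketch: dual decomposition with a projected subgradient master.
  # Assumptions (not from the source): C is a small finite set, the slaves
  # are solved by brute force, and the objectives f^i are toy examples.
  C = [0.0, 1.0, 2.0, 3.0]                       # shared feasible set C
  fs = [lambda x: (x - 0.2) ** 2,                # f^1
        lambda x: (x - 2.8) ** 2,                # f^2
        lambda x: abs(x - 1.0)]                  # f^3

  def solve_slave(f, lam):
      """Slave problem (4): argmin over x in C of f(x) + lam * x."""
      return min(C, key=lambda x: f(x) + lam * x)

  lams = [0.0] * len(fs)                         # multipliers, start in Lambda (they sum to 0)
  best_x, best_val = None, float("inf")
  for t in range(200):
      xs = [solve_slave(f, lam) for f, lam in zip(fs, lams)]
      for x in xs:                               # keep the best primal candidate seen so far
          val = sum(f(x) for f in fs)
          if val < best_val:
              best_x, best_val = x, val
      x_bar = sum(xs) / len(xs)
      alpha = 1.0 / (t + 1)                      # diminishing step size (see Theorem 2)
      # Subgradient ascent on (3); subtracting x_bar projects back onto sum_i lam^i = 0.
      lams = [lam + alpha * (x - x_bar) for lam, x in zip(lams, xs)]

  print(best_x, best_val)                        # best x in C found from the slave solutions

The update uses the slave minimizers as a subgradient of g and re-centers the multipliers so that they remain in \Lambda.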

MRF optimization via Dual Decomposition

The original MRF optimization problem is NP-hard, so we transform it into something easier.

Let \tau be a set of sub-trees of the graph G whose trees together cover all nodes and edges of the main graph. A smaller MRF is defined for every tree T in \tau, with parameter vector \theta^T and variable vector x^T; both are smaller than the original MRF vectors \theta, x. For all vectors \theta^T the following must hold:

(5)  \sum_{T \in \tau(p)} \theta_p^T = \theta_p, \quad \sum_{T \in \tau(pq)} \theta_{pq}^T = \theta_{pq}

where \tau(p) and \tau(pq) denote all trees of \tau that contain node p and edge pq, respectively. We can then simply write:

(6)  E(\theta, x) = \sum_T E(\theta^T, x^T)
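
For example, condition (5) can be satisfied by splitting each potential evenly among the trees that contain it (one common choice, not the only one):

\theta_p^T = \frac{\theta_p}{|\tau(p)|}, \qquad \theta_{pq}^T = \frac{\theta_{pq}}{|\tau(pq)|}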

And our constraints will be:

(7)  x^T \in \chi^T, \quad x^T = x_{|T}, \quad \forall T \in \tau

where x_{|T} denotes the restriction of x to the nodes and edges of T. Our original MRF problem then becomes:

(8)  \min_{\{x^T\}, x} \sum_T E(\theta^T, x^T) \quad \text{where } x^T \in \chi^T \text{ and } x^T = x_{|T}, \; \forall T \in \tau

And we obtain the dual problem we were seeking:

(9)  \max_{\{\lambda^T\} \in \Lambda} g(\{\lambda^T\}) = \sum_T g^T(\lambda^T)

The Master problem

where each function g^T(\cdot) is defined as:

(10)  g^T(\lambda^T) = \min_{x^T} E(\theta^T + \lambda^T, x^T) \quad \text{where } x^T \in \chi^T

The Slave problems
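
As a minimal, illustrative sketch of (9) and (10), the code below decomposes a tiny cyclic MRF into single-edge trees (each edge of G is a one-edge tree, so the trees cover all nodes and edges), solves each slave (10) by enumeration, and runs a projected subgradient master. The graph, potentials, step sizes, and helper names are assumptions made for illustration, not part of the original method.

  # Minimal sketch of MRF dual decomposition (9)-(10) with single-edge trees
  # as slaves and a projected subgradient master. Graph, potentials and
  # helper names are illustrative assumptions.
  import itertools

  labels = [0, 1]
  nodes = ["a", "b", "c"]
  edges = [("a", "b"), ("b", "c"), ("a", "c")]          # a small cyclic graph
  unary = {"a": {0: 0.0, 1: 1.0},
           "b": {0: 0.6, 1: 0.0},
           "c": {0: 0.3, 1: 0.2}}
  pairwise = {e: {(l, m): 0.0 if l == m else 1.0 for l in labels for m in labels}
              for e in edges}                            # Potts-style edge terms

  deg = {p: sum(p in e for e in edges) for p in nodes}   # |tau(p)| for edge-trees
  # Per-tree unary potentials: split evenly over the trees containing each node, as in (5).
  theta_T = {e: {p: {l: unary[p][l] / deg[p] for l in labels} for p in e} for e in edges}

  def solve_slave(e):
      """Slave problem (10): minimize the small MRF on the single-edge tree e."""
      p, q = e
      return min(itertools.product(labels, labels),
                 key=lambda lm: theta_T[e][p][lm[0]] + theta_T[e][q][lm[1]]
                                + pairwise[e][lm])

  def energy(assign):
      """Original MRF energy (1)."""
      return (sum(unary[p][assign[p]] for p in nodes)
              + sum(pairwise[(p, q)][(assign[p], assign[q])] for (p, q) in edges))

  best, best_E = None, float("inf")
  for t in range(100):
      slave_sol = {e: solve_slave(e) for e in edges}
      # Heuristic primal labeling: per node, majority vote over the slave solutions.
      assign = {p: max(labels, key=lambda l: sum(
                    slave_sol[e][e.index(p)] == l for e in edges if p in e))
                for p in nodes}
      if energy(assign) < best_E:
          best, best_E = assign, energy(assign)
      alpha = 1.0 / (t + 1)
      # Projected subgradient step: push each tree's copy of node p toward the
      # average indicator over the trees containing p (the sum over trees stays fixed).
      for p in nodes:
          trees = [e for e in edges if p in e]
          for l in labels:
              xs = [1.0 if slave_sol[e][e.index(p)] == l else 0.0 for e in trees]
              x_bar = sum(xs) / len(xs)
              for e, x in zip(trees, xs):
                  theta_T[e][p][l] += alpha * (x - x_bar)

  print(best, best_E)                                    # best labeling found and its energy

Because each node's copies are pushed toward their average indicator, the per-tree potentials continue to sum to the original \theta_p, so condition (5) is preserved throughout the iterations.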

Theoretical Properties

Theorem 1. Lagrangian relaxation (9) is equivalent to the LP relaxation of (2):

\min_{\{x^T\}, x} \{ E(\theta, x) \mid x_p^T = x_p, \; x^T \in \text{CONVEXHULL}(\chi^T) \}

Theorem 2. If the sequence of multipliers \{\alpha_t\} satisfies

\alpha_t \geq 0, \quad \lim_{t \to \infty} \alpha_t = 0, \quad \sum_{t=0}^{\infty} \alpha_t = \infty,

then the algorithm converges to the optimal solution of (9).
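
For instance, the diminishing step sizes \alpha_t = c/(t+1) for any constant c > 0 satisfy all three conditions:

\alpha_t = \frac{c}{t+1} \geq 0, \qquad \lim_{t \to \infty} \frac{c}{t+1} = 0, \qquad \sum_{t=0}^{\infty} \frac{c}{t+1} = \infty.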

Theorem 3. The distance of the current solution \{\theta^T\} to the optimal solution \{\bar{\theta}^T\} decreases at every iteration.

Theorem 4. Any solution obtained by the method satisfies the WTA (weak tree agreement) condition.

Theorem 5. For binary MRFs with sub-modular energies, the method computes a globally optimal solution.

Notes and References

  1. Nikos Komodakis, Nikos Paragios, and Georgios Tziritas. MRF Optimization via Dual Decomposition: Message-Passing Revisited. 2007 IEEE 11th International Conference on Computer Vision (ICCV). IEEE, 2007.
  2. Feng Niu, Ce Zhang, Christopher Re, and Jude Shavlik. Scaling Inference for Markov Logic via Dual Decomposition. 2012 IEEE 12th International Conference on Data Mining. IEEE, 2012. doi:10.1109/icdm.2012.96.
  3. Shervin Rahimzadeh Arashloo and Josef Kittler. Efficient processing of MRFs for unconstrained-pose face recognition. 2013 IEEE Sixth International Conference on Biometrics: Theory, Applications and Systems (BTAS). IEEE, 2013. doi:10.1109/btas.2013.6712721.