Active contour model, also called snakes, is a framework in computer vision introduced by Michael Kass, Andrew Witkin, and Demetri Terzopoulos for delineating an object outline from a possibly noisy 2D image. The snakes model is popular in computer vision, and snakes are widely used in applications like object tracking, shape recognition, segmentation, edge detection and stereo matching.
A snake is an energy minimizing, deformable spline influenced by constraint and image forces that pull it towards object contours and internal forces that resist deformation. Snakes may be understood as a special case of the general technique of matching a deformable model to an image by means of energy minimization.[1] In two dimensions, the active shape model represents a discrete version of this approach, taking advantage of the point distribution model to restrict the shape range to an explicit domain learnt from a training set.
Snakes do not solve the entire problem of finding contours in images, since the method requires knowledge of the desired contour shape beforehand. Rather, they depend on other mechanisms such as interaction with a user, interaction with some higher level image understanding process, or information from image data adjacent in time or space.
In computer vision, contour models describe the boundaries of shapes in an image. Snakes in particular are designed to solve problems where the approximate shape of the boundary is known. By being a deformable model, snakes can adapt to differences and noise in stereo matching and motion tracking. Additionally, the method can find Illusory contours in the image by ignoring missing boundary information.
Compared to classical feature extraction techniques, snakes have multiple advantages:
The key drawbacks of the traditional snakes are
A simple elastic snake is defined by a set of n points
vi
i=0,\ldots,n-1
Einternal
Eexternal
Eimage
Econ
The energy function of the snake is the sum of its external energy and internal energy, or
1E | |
E | |
snake(v(s))ds=\int\limits |
1 | |
0 |
(Einternal(v(s))+Eimage(v(s))+Econ(v(s)))ds
The internal energy of the snake is composed of the continuity of the contour
Econt
Ecurv
Einternal=Econt+Ecurv
This can be expanded as
E | ||||
|
(\alpha(s)\left|vs(s)\right\vert2)+
1 | |
2 |
(\beta(s)\left|vss(s)\right\vert2)=
1 | |
2 |
(\alpha(s)\left\|
d\barv | |
ds |
(s)\right\Vert2+\beta(s)\left\|
d2\barv | |
ds2 |
(s)\right\Vert2)
where
\alpha(s)
\beta(s)
In practice, a large weight
\alpha(s)
\beta(s)
Energy in the image is some function of the features of the image. This is one of the most common points of modification in derivative methods. Features in images and images themselves can be processed in many and various ways.
For an image
I(x,y)
Eimage=wlineEline+wedgeEedge+wtermEterm,
where
wline
wedge
wterm
The line functional is the intensity of the image, which can be represented as
Eline=I(x,y)
The sign of
wline
Some smoothing or noise reduction may be used on the image, which then the line functional appears as
Eline=\operatorname{filter}(I(x,y))
The edge functional is based on the image gradient. One implementation of this is
Eedge=-\left|\nablaI(x,y)\right\vert2.
A snake originating far from the desired object contour may erroneously converge to some local minimum. Scale space continuation can be used in order to avoid these local minima. This is achieved by using a blur filter on the image and reducing the amount of blur as the calculation progresses to refine the fit of the snake. The energy functional using scale space continuation is
Eedge=-\left|G\sigma ⋅ \nabla2I\right\vert2
where
G\sigma
\sigma
G\sigma\nabla2I
Curvature of level lines in a slightly smoothed image can be used to detect corners and terminations in an image. Using this method, let
C(x,y)
C(x,y)=G\sigma ⋅ I(x,y)
with gradient angle
\theta=\arctan\left(
Cy | |
Cx |
\right),
unit vectors along the gradient direction
n=(\cos\theta,\sin\theta),
and unit vectors perpendicular to the gradient direction
n\perp=(-\sin\theta,\cos\theta).
The termination functional of energy can be represented as
Eterm={\partial\theta\over\partialn\perp}={\partial2C/\partial
2 | |
n | |
\perp |
\over\partialC/\partialn}={{Cyy
2-2C | |
C | |
xy |
CxCy+Cxx
2) | |
C | |
y |
3/2
Some systems, including the original snakes implementation, allowed for user interaction to guide the snakes, not only in initial placement but also in their energy terms. Such constraint energy
Econ
Given an initial guess for a snake, the energy function of the snake is iteratively minimized. Gradient descent minimization is one of the simplest optimizations which can be used to minimize snake energy.[4] Each iteration takes one step in the negative gradient of the point with controlled step size
\gamma
\barvi\leftarrow\barvi+Fsnake(\barvi)
Where
Fsnake(\barvi)
Fsnake(\barvi)=-\nablaEsnake(\barvi)=-(winternal\nablaEinternal(\barvi)+wexternal\nablaEexternal(\barvi))
Assuming the weights
\alpha(s)
\beta(s)
s
\barvi\leftarrow\barvi-\gamma\{winternal[\alpha
\partial2\barv | |
\partials2 |
(\barvi)+\beta
\partial4\barv | |
\partials4 |
(\barvi)]+\nablaEext(\barvi)\}
In practice, images have finite resolution and can only be integrated over finite time steps
\tau
The energy function of the snake can be approximated by using the discrete points on the snake.
* ≈ | |
E | |
snake |
n | |
\sum | |
1 |
Esnake(\barvi)
Consequentially, the forces of the snake can be approximated as
* ≈ | |
F | |
snake |
-
n | |
\sum | |
i=1 |
\nablaEsnake(\barvi).
Gradient approximation can be done through any finite approximation method with respect to s, such as Finite difference.
The introduction of discrete time into the algorithm can introduce updates which the snake is moved past the minima it is attracted to; this further can cause oscillations around the minima or lead to a different minima being found.
This can be avoided through tuning the time step such that the step size is never greater than a pixel due to the image forces. However, in regions of low energy, the internal energies will dominate the update.
Alternatively, the image forces can be normalized for each step such that the image forces only update the snake by one pixel. This can be formulated as
Fimage=-k
\nablaEimage | |
\|\nablaEimage\| |
where
\tauk
The energies in a continuous image may have zero-crossing that do not exist as a pixel in the image. In this case, a point in the snake would oscillate between the two pixels that neighbor this zero-crossing. This oscillation can be avoided by using interpolation between pixels instead of nearest neighbor.
The default method of snakes has various limitation and corner cases where the convergence performs poorly. Several alternatives exist which addresses issues of the default method, though with their own trade-offs. A few are listed here.
The gradient vector flow (GVF) snake model[6] addresses two issues with snakes:
In 2D, the GVF vector field
FGVF
EGVF=\iint
2)+|\nabla | |
\mu(u | |
y |
f|2|v-\nablaf|2dxdy
where
\mu
\mu\nabla2u- (u-
\partial | |
\partialx |
F | ||||
|
2 | |
F | |
ext(x,y) |
+
\partial | |
\partialy |
2 ) | |
F | |
ext(x,y) |
=0
\mu\nabla2v- (v-
\partial | |
\partialy |
F | ||||
|
2 | |
F | |
ext(x,y) |
+
\partial | |
\partialy |
2 ) | |
F | |
ext(x,y) |
=0
This can be solved through iteration towards a steady-state value.
ui+1=ui+\mu\nabla2ui- (ui-
\partial | |
\partialx |
F | ||||
|
2 | |
F | |
ext(x,y) |
+
\partial | |
\partialy |
2 ) | |
F | |
ext(x,y) |
vi+1=vi+\mu\nabla2vi- (vi-
\partial | |
\partialy |
F | ||||
|
2 | |
F | |
ext(x,y) |
+
\partial | |
\partialy |
2 ) | |
F | |
ext(x,y) |
This result replaces the default external force.
* | |
F | |
ext |
=FGVF
The primary issue with using GVF is the smoothing term
\mu
\mu
The balloon model addresses these problems with the default active contour model:
The balloon model introduces an inflation term into the forces acting on the snake
Finflation=k1\vecn(s)
where
\vecn(s)
v(s)
k1
k1
k
k
Three issues arise from using the balloon model:
The diffusion snake model[7] addresses the sensitivity of snakes to noise, clutter, and occlusion. It implements a modification of the Mumford–Shah functional and its cartoon limit and incorporates statistical shape knowledge. The default image energy functional
Eimage
* | |
E | |
image |
=Ei+\alphaEc
where
Ei
E[J,B]=
1 | |
2 |
\intD(I(\vecx)-J(\vecx))2d\vecx+λ
1 | |
2 |
\intD/B\vec\nablaJ(\vecx) ⋅ \vec\nablaJ(\vecx)d\vecx+\nu
1 | |
\int | |
0 |
(
d | |
ds |
B(s) )2ds
where
J(\vecx)
I(\vecx)
D
B(s)
B(s)=
N | |
\sum | |
n=1 |
\vecpnBn(s)
where
Bn(s)
\vecpn
λ\toinfty
Ei
The functional
Ec
\alpha
\vecz
\vecz0
\Sigma
Ec(\vecz)=
1 | |
2 |
(\vecz-\vec
t | |
z | |
0) |
\Sigma*(\vecz-\vecz0)
The strength of this method relies on the strength of the training data as well as the tuning of the modified Mumford–Shah functional. Different snakes will require different training data sets and tunings.
Geometric active contour, or geodesic active contour (GAC)[8] or conformal active contours[9] employs ideas from Euclidean curve shortening evolution. Contours split and merge depending on the detection of objects in the image. These models are largely inspired by level sets, and have been extensively employed in medical image computing.
For example, the gradient descent curve evolution equation of GAC is
\partialC | |
\partialt |
=g(I)(c+\kappa)\vec{N}-\langle\nablag,\vec{N}\rangle\vec{N}
where
g(I)
\kappa
\vec{N}
\Phi
\partial\Phi | |
\partialt |
=|\nabla\Phi|\operatorname{div} (g(I)
\nabla\Phi | |
|\nabla\Phi| |
)+cg(I)|\nabla\Phi|
This simple yet powerful level-set reformation enables active contours to handle topology changes during the gradient descent curve evolution. It has inspired tremendous progress in the related fields, and using numerical methods to solve the level-set reformulation is now commonly known as the level-set method. Although the level set method has become quite a popular tool for implementing active contours, Wang and Chan argued that not all curve evolution equations should be directly solved by it.[10]
More recent developments in active contours address modeling of regional properties, incorporation of flexible shape priors and fully automatic segmentation, etc.
Statistical models combining local and global features have been formulated by Lankton and Allen Tannenbaum.[11]
Graph cuts, or max-flow/min-cut, is a generic method for minimizing a particular form of energy called Markov random field (MRF) energy. The Graph cuts method has been applied to image segmentation as well, and it sometimes outperforms the level set method when the model is MRF or can be approximated by MRF.