# Segmentation

## Level-set Based Segmentation in 2D

Before introducing the actual model used for segmenting 2D images, few things related to explicit and implicit curve representations are discussed. Before going any further, I would like to point out that the implicit curve representation techniques are not confined to 2D case as one might deduce from the header. In fact, level-sets are indeed used in medical imaging segmenting 3D data, such as MRI.

## Two Segments

### Explicit curve representation

The aim is to segment an input image into two different segments. The area bounded by the segment is 'similar' (homogeneous) as per the used similarity metric (e.g. similar color in the case of RGB/HSV images), while the boundary separating the segments is called the interface. Therefore, using explicit curve representation, the segmentation model can de described as an energy minimization model as follows:

$E\Big( \Gamma (s),\, \alpha_1,\, \alpha_2 \Big) = \nu \underbrace{\int_{\Gamma} 1 \, ds}_{\text{boundary length}} - \int_{\Omega_1} log \, p(I|\alpha_1) \, d\vec{x} - \int_{\Omega_2} log \, p(I|\alpha_2) \, d\vec{x}$

where the curve is parameterized by $$s$$ as $$\Gamma (s) = \Big( x(s), y(s) \Big)$$. The first term is the length of the boundary separating the segments. The second and the third terms are the 'likelihoods' indicating that a particular image pixel $$I(x,y)$$ belongs to a corresponding segment, while the parameters $$\alpha_1$$ and $$\alpha_2$$ are the likelihood functions' parameters. Likelihood is defined as follows:

$p \Big ( I | \alpha_{ i } \Big) = p \Bigg ( \{ I(x,y) : (x,y) \in \Omega_i \} | \alpha_i \Bigg )$

where $$\alpha_i$$ is a parameter vector describing the likehood function. E.g. in the case of a Gaussian distribution $$\alpha_i = ( \mu_i, \, \sigma_i^2 )$$ (mean and variance).

### Implicit curve representation

In implicit curve representation the isocontour defining the interface is one dimension lower than the dimensionality of the actual level-set function. Therefore, in $$\mathbb{R} ^n$$ the isocontour has dimension $$n-1$$. Thus, it is natural to ask what are the benefits of such implicit curve representation?

Topological changes. Topological changes are handled 'implicitly' in the implicit curve representation. In explicit representation topological changes (e.g. a curve breaking into two) can cause problems, especially in higher dimensions. Discretization. Grid size in implicit representation stays the same (Eulerian formulation), whereas in the explicit curve representation case 'regridding' might be needed.Inside/outside regions. In implicit representation it is extremely easy to see whether a point belongs to outside or inside region (as is shown below). Using an implicit curve representation (e.g. level-set function), the boundary and the segments can be defined as follows:

\begin{align} \Gamma &:= \partial \Gamma \Big\{ (x,y),\, \Phi(x,y) = 0 \Big\} \\inside(\Gamma) &:= \Omega_1 = \Big\{ (x,y),\, \Phi(x,y) \ge 0 \Big\} \\outside(\Gamma) &:= \Omega_1 = \Big\{ (x,y),\, \Phi(x,y) < 0 \Big\} \end{align}

where $$\Gamma$$ is the interface, $$\Omega_1$$ is the first segment and $$\Omega_2$$ is the second segment. In other words, the boundary is defined by the zero isocurve (in the image the interception of the surface and the plane), while those positions (x,y) where the level-set function has a zero or positive value belong to the first segment (the part of the surface above the plane), and those positions (x,y) where the level-set function has a negative value belong to the second segment (the part of the surface below the plane). Using the implicit curve representation equation 1 is defined as:

$E( \Phi,\, \alpha_1,\, \alpha_2 ) = \int_{\Omega} \Big( \alpha | \nabla H( \Phi ) | - H( \Phi ) log \, p (I|\alpha_1) - H( \Phi ) log \, p(I|\alpha_2) \Big) d\vec{x}$

where $$\nabla$$ is the gradient operator  $$\Big[ \dfrac{\partial}{ \partial x} \, \dfrac{\partial}{\partial y} \Big]$$, $$H(\Phi)$$ is the Heaviside function and $$\delta (\Phi)$$ is the one dimensional Dirac measure as follows:

\left\{ \begin{align} 1, \,&\text{if } \Phi \ge 0 \\0, \, &\text{if } \Phi <0 \end{align} \right.

$H{\prime} (\Phi) := \delta (\Phi) = \dfrac{d}{d \Phi} H(\Phi)$

Keeping $$\alpha_1$$ and $$\alpha_2$$ fixed the corresponding Euler-Lagrange equation can be obtained and therefore the energy can be minimized by gradient descent as follows:

$\dfrac{\partial \Phi}{\partial t} = H{\prime}(\Phi) \Bigg( \alpha DIV \left( \dfrac{\nabla \Phi}{|\nabla \Phi|} \right) + log\, p(I|\alpha_1) - log\, p (I|\alpha_2) \Bigg)$

where DIV is the divergence operator. The first term minimizes the local curvature (e.g. boundary length) while the second and the third terms are the 'likehoods' of pixels belonging to the segments 1 and 2. Those pixels where $$\Phi(x,y) >= 0$$ belong to segment 1 as indicated by the Heaviside function. This leads to a two-stage algorithm:

Stage 1: approximate/resolve likelihood functions

Stage 2: solve the level-set function

### Several Segments

So far we have seen how to segment an input image into two segments. There are, at least, two different possibilities of segmenting an input image into several meaningful segments. (1) successively keep on segmenting the formed segments until the energy of the boundary out weights the splitting of an segment into 2 or (2) directly search for meaningful segment until the whole image has been segmented. The latter approach is used in my article Hypothesis-Forming-Validation-Loops.

## Example(s)

In the following there is an example of segmentation based on the disparity map, using the algorithm explained in the paper Hypothesis-Forming-Validation-Loops. Test images (i.e left- and right stereo images) have been provided by prof. Mårten Björkman from KTH.

Author: Jarno Ralli
Jarno Ralli is a computer vision scientist and a programmer.