Abstract:
Many computer vision problems can be formulated as finding the best labeling configuration. If the labelings satisfy the Markov property, finding the best labeling configuration becomes an MRF (Markov Random Field) MAP (Maximum a Posteriori) inference problem: minimizing the sum of the costs of assigning labels to individual pixels and the costs of assigning labelings to collections of pixels (cliques). If the clique costs (or clique potentials) are assumed to be submodular, MRF-MAP inference becomes the minimization of a sum of submodular functions and can be done in polynomial time. The standard way to minimize a submodular function is to minimize an equivalent dual objective defined on the submodular polyhedron.

In the first part of the thesis (Chapter 3), we develop an efficient inference algorithm, named SoS-MNP, for the 2-label MRF-MAP problem. We show that the dual problem can be decomposed over cliques, which enables efficient optimization of the dual in block coordinate descent (BCD) style. In our experiments we consider the image segmentation problem with clique sizes as large as 1000. We show that SoS-MNP is very efficient and scales to large cliques, whereas state-of-the-art methods scale only up to cliques of size 16.

In the second part of the thesis (Chapter 4), we develop an inference algorithm for the 2-label MRF-MAP problem with a mix of small and large cliques. Such a configuration contains a large number of small cliques, which makes BCD-style SoS-MNP very slow. On the other hand, other state-of-the-art methods such as Generic Cuts (GC) run very fast on problems with a large number of small cliques but do not scale to problems with large cliques. To overcome the limitations of both algorithms, we run GC for small cliques and
SoS-MNP for large cliques in BCD style, proposing a mapping between the variables of SoS-MNP and GC. Even with this mapping, the hybrid algorithm does not give the optimal result, because GC minimizes the ℓ1-norm and SoS-MNP minimizes the ℓ2-norm of the objective function. We propose a recursive method that calls GC multiple times to output the ℓ2-norm solution. In experiments we show that the quality of pixel-wise image segmentation improves when both small and large cliques are used, compared to using only large cliques. We also demonstrate the efficiency of the proposed hybrid method over SoS-MNP on configurations with both small and large cliques.

In the third and last part of the thesis (Chapter 5), we develop an inference algorithm for multi-label MRF-MAP problems. Current state-of-the-art methods run only on configurations with cliques of size up to 4 and at most 4 labels. The standard way to solve a multi-label problem is to convert it into a 2-label problem through some encoding, but such an encoding introduces many extra states. We show that the submodular polyhedron of the encoded clique potentials has enough structure to be exploited, and we propose an efficient hybrid method (Hybrid-ML) that avoids computation over the extra states. In our experiments we show that an MRF with clique size 800 can improve the pixel-wise multi-object segmentation results obtained by the state-of-the-art deep learning network SegNet. We also run Hybrid-ML on a stereo correspondence problem with clique size 100 and 16 labels. Finally, we compare the running time of Hybrid-ML with that of SoS-MNP and show a large improvement in efficiency.
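
To make the problem statement concrete, the following minimal Python sketch builds a toy 2-label MRF-MAP energy (per-pixel costs plus clique potentials), checks that the clique potential is submodular, and finds the minimizing labeling by exhaustive enumeration. Everything in it is an assumption made for illustration: the 4-pixel instance, the cost values, the concave-of-cardinality potential, and the function names are hypothetical. The enumeration is exponential in the number of pixels and is not SoS-MNP, GC, or Hybrid-ML; it only shows what those algorithms minimize.

```python
from itertools import product

# Hypothetical toy instance: 4 pixels, 2 labels, two overlapping cliques of size 3.
UNARY = [(0, 3), (0, 2), (2, 0), (3, 0)]  # UNARY[i][l] = cost of giving pixel i label l
CLIQUES = [(0, 1, 2), (1, 2, 3)]          # each clique is a tuple of pixel indices


def clique_potential(labels):
    """Assumed clique cost: a concave function of the number of pixels labeled 1.

    Concave functions of the cardinality are submodular; the cost is 0 when
    all pixels in the clique agree and grows as the clique becomes mixed.
    """
    k = sum(labels)
    return min(k, len(labels) - k)


def is_submodular(f, n):
    """Check f(x OR y) + f(x AND y) <= f(x) + f(y) for all binary labelings of n pixels."""
    for x in product((0, 1), repeat=n):
        for y in product((0, 1), repeat=n):
            join = tuple(a | b for a, b in zip(x, y))
            meet = tuple(a & b for a, b in zip(x, y))
            if f(join) + f(meet) > f(x) + f(y):
                return False
    return True


def energy(labels):
    """MRF-MAP objective: sum of per-pixel costs plus sum of clique potentials."""
    unary = sum(UNARY[i][l] for i, l in enumerate(labels))
    higher = sum(clique_potential(tuple(labels[i] for i in c)) for c in CLIQUES)
    return unary + higher


def map_by_enumeration(n):
    """Brute-force MAP labeling; feasible only for a handful of pixels."""
    return min(product((0, 1), repeat=n), key=energy)


best = map_by_enumeration(len(UNARY))
print(best, energy(best))  # prints (0, 0, 1, 1) 2
```

On this instance the per-pixel costs alone favor the labeling (0, 0, 1, 1), and the clique potentials add a cost of 1 for each of the two mixed cliques, giving a total energy of 2.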