Date of Award
2008
Degree Type
Thesis
Degree Name
Doctor of Philosophy
Program
Electrical and Computer Engineering
Supervisor
Dr. Jagath Samarabandu
Second Advisor
Dr. Olga Veksler
Third Advisor
Dr. Yuri Boykov
Abstract
Automatic scene analysis is an active research area with applications in robotics and automation, industrial manufacturing, architectural design, and multimedia. 3D structural information is one of the most important cues for scene analysis. In this thesis, we present a geometric labeling method that automatically extracts rough 3D information from a single 2D image. Our method partitions an image scene into five geometric regions by labeling every image pixel as one of five geometric classes (namely, “bottom”, “left”, “center”, “right”, and “top”). We formulate the geometric labeling problem as an energy minimization problem and optimize the energy with a graph-cut based algorithm. In our energy function, we address the spatial consistency of the geometric labels in the scene while preserving discontinuities along image intensity edges. We also incorporate ordering constraints in our energy function. Ordering constraints specify the possible relative positional labels for neighboring pixels. For example, a pixel labeled “left” cannot be to the right of a pixel labeled “right”, and a pixel labeled “bottom” cannot be above a pixel labeled “top”. Ordering constraints arise naturally in a real scene. We observed that when ordering constraints are used, the commonly used graph-cut based α-expansion is more likely to get stuck in local minima. To overcome this, we developed new graph-cut moves, which we call order-preserving moves. Unlike α-expansion, which works for two labels in each move, order-preserving moves act on all labels. Although the global minimum is still not guaranteed, we show that optimization with order-preserving moves performs significantly better than α-expansion. Experimental results show that it is possible to significantly increase the percentage of reasonably good labelings by promoting spatial consistency and incorporating ordering constraints.
We also show that order-preserving moves perform significantly better than the commonly used α-expansion when ordering constraints are used: computational efficiency and optimality improve significantly, while the improvement in pixel labeling accuracy is modest. We also demonstrate the usefulness of the extracted 3D structure information of a scene in applications such as novel view generation, virtual scene walk-through, semantic segmentation, scene synthesis, and scene text extraction, and we show how order-preserving moves can be applied to certain simple shape priors in graph-cut segmentation. Our geometric labeling method has the following main contributions: (i) We develop a new class of graph-cut moves called order-preserving moves, which perform significantly better than α-expansion when ordering constraints are used. (ii) We formulate the problem in a global optimization framework where we address the spatial consistency of labels in a scene by formulating an energy function which encourages spatial consistency between neighboring pixels while preserving discontinuities along image intensity edges. (iii) We incorporate relative ordering information about the labels in our energy function. (iv) We show that our ordering constraints can also be used in other applications such as object part segmentation. (v) We show how the proposed order-preserving moves can be used for certain simple shape priors in graph-cut segmentation.
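As a rough illustration of how an ordering constraint might enter a pairwise energy term, the sketch below assigns infinite cost to the two label pairs the abstract explicitly rules out and a contrast-weighted penalty to other label changes. It is not the energy function from the thesis: the function name `pairwise_cost`, the parameters `weight` and `sigma`, and the exponential contrast weighting are all assumptions made for the example, and the abstract does not list the full set of forbidden pairs.

```python
import math

LABELS = ("bottom", "left", "center", "right", "top")

# Only the two prohibitions stated in the abstract are encoded here;
# the complete constraint set used in the thesis is not given there.
# Horizontal pairs are (label of left pixel, label of right pixel).
FORBIDDEN_HORIZONTAL = {("right", "left")}
# Vertical pairs are (label of upper pixel, label of lower pixel).
FORBIDDEN_VERTICAL = {("bottom", "top")}


def pairwise_cost(label_p, label_q, direction, intensity_diff,
                  weight=1.0, sigma=10.0):
    """Hypothetical pairwise cost for neighboring pixels p and q.

    q is the right neighbor of p when direction == "h" and the neighbor
    below p when direction == "v". The exponential contrast weighting is
    an assumed, commonly used choice, not taken from the thesis.
    """
    if direction not in ("h", "v"):
        raise ValueError("direction must be 'h' or 'v'")
    if label_p not in LABELS or label_q not in LABELS:
        raise ValueError("unknown label")

    forbidden = FORBIDDEN_HORIZONTAL if direction == "h" else FORBIDDEN_VERTICAL
    if (label_p, label_q) in forbidden:
        return math.inf  # ordering constraint violated
    if label_p == label_q:
        return 0.0  # spatially consistent labels cost nothing
    # A label change is cheaper across a strong intensity edge.
    return weight * math.exp(-(intensity_diff ** 2) / (2 * sigma ** 2))
```

With these assumptions, `pairwise_cost("right", "left", "h", 0.0)` is infinite, while the reverse arrangement `pairwise_cost("left", "right", "h", 0.0)` incurs only the ordinary smoothness penalty.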
Recommended Citation
Liu, Xiaoqing, "Spatially Coherent Geometric Class Labeling of Images and Its Applications" (2008). Digitized Theses. 4562.
https://ir.lib.uwo.ca/digitizedtheses/4562