The goal of this project is to extract the third dimension of a scene using stereo vision. Stereo vision uses two images taken from different viewpoints to recover the depth dimension that is lost during photography.
Many approaches have been proposed for this problem. In this project two of them have been implemented: coarse-to-fine block matching and dynamic programming. Most stereo vision approaches rely on a comparison algorithm that estimates the similarity of two small regions, one from the left image and the other from the right image. The most common comparison algorithm is the sum of squared differences over the pixels of those regions. In this project two new comparison algorithms are proposed: similarity estimation in striped binary images (SESBI) and binary subtraction.
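As an illustration of the baseline comparison, the following is a minimal sketch of the sum of squared differences and of how it can be used to pick a disparity for one block. It assumes rectified grayscale images stored as NumPy arrays and a pixel at column x in the left image matching column x - d in the right image. The function names and parameters (ssd, best_disparity, half, max_disp) are hypothetical and are not taken from the project. SESBI and binary subtraction are not shown because their definitions are not given here.

```python
import numpy as np

def ssd(block_left, block_right):
    """Sum of squared differences between two equally sized blocks."""
    diff = block_left.astype(np.float64) - block_right.astype(np.float64)
    return float(np.sum(diff * diff))

def best_disparity(left, right, row, col, half, max_disp):
    """Return the disparity in [0, max_disp] with the lowest SSD cost.

    The block is (2*half + 1) pixels square and centred on (row, col) in the
    left image. The caller is assumed to keep the block inside the image.
    """
    ref = left[row - half:row + half + 1, col - half:col + half + 1]
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        c = col - d  # candidate centre column in the right image
        if c - half < 0:
            break  # candidate block would leave the image
        cand = right[row - half:row + half + 1, c - half:c + half + 1]
        cost = ssd(ref, cand)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

Any other comparison algorithm with the same two-block interface could be substituted for ssd in this sketch.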
In the first experiment, a coarse-to-fine block matching approach with SESBI comparison is used. This approach increases the resolution of the solution step by step. In the first step, the solution has four points, each of which represents a region in the image. In each subsequent step, every region from the previous step is divided into four smaller regions, so the resolution of the solution is multiplied by 4 at each step. Figure 1 shows the original image and the result of the third step.
Figure 1: left to right: original image, extracted mesh, extracted mesh with mapping
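The subdivision scheme described above can be sketched as follows: step k splits the image into a 2^k by 2^k grid, which gives 4 regions at step 1, 16 at step 2, and 64 at step 3. The helper names (regions_at_step, upsample_estimate) are hypothetical. Copying each coarse value to its four children as the starting estimate for the next step is an assumption made for illustration, not necessarily how the project propagates its solution.

```python
import numpy as np

def regions_at_step(height, width, step):
    """Split an image into a 2**step by 2**step grid of rectangles.

    Each rectangle is returned as (top, left, bottom, right), with bottom
    and right exclusive.
    """
    n = 2 ** step
    rows = np.linspace(0, height, n + 1).astype(int)
    cols = np.linspace(0, width, n + 1).astype(int)
    return [(int(rows[i]), int(cols[j]), int(rows[i + 1]), int(cols[j + 1]))
            for i in range(n) for j in range(n)]

def upsample_estimate(grid):
    """Copy each coarse value to its four child regions.

    Assumed initialisation for the next, finer step (illustrative only).
    """
    return np.repeat(np.repeat(grid, 2, axis=0), 2, axis=1)
```

For example, len(regions_at_step(64, 64, 3)) is 64, matching the four-fold growth per step.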
In the second experiment, dynamic programming is used. To evaluate the proposed comparison algorithms, the disparity map of the Middlebury dataset is extracted. Figure 2 shows the original image, the ground truth, and the three results for the three comparison algorithms.
Figure 2: top, left to right: original image, ground truth; bottom, left to right: output of dynamic programming with sum of squared differences, with binary subtraction, with SESBI
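The exact dynamic programming formulation used in the project is not stated here, so the following is only a sketch of one common variant: each scanline is solved independently by minimising a per-pixel matching cost plus a penalty on disparity changes between neighbouring pixels. The per-pixel squared difference stands in for the comparison algorithm, and the names scanline_disparity, max_disp, and smooth are hypothetical.

```python
import numpy as np

def scanline_disparity(left_row, right_row, max_disp, smooth=1.0):
    """Estimate one disparity per pixel of a scanline by dynamic programming."""
    left_row = np.asarray(left_row, dtype=np.float64)
    right_row = np.asarray(right_row, dtype=np.float64)
    width = left_row.shape[0]
    n_disp = max_disp + 1
    big = 1e12  # cost for disparities that would fall outside the image

    # Matching cost: squared difference between left[x] and right[x - d].
    cost = np.full((width, n_disp), big)
    for d in range(n_disp):
        cost[d:, d] = (left_row[d:] - right_row[:width - d]) ** 2

    # Penalty for changing disparity between neighbouring pixels.
    disp = np.arange(n_disp)
    penalty = smooth * np.abs(disp[:, None] - disp[None, :])

    total = np.empty_like(cost)
    back = np.zeros((width, n_disp), dtype=int)
    total[0] = cost[0]
    for x in range(1, width):
        # trans[d, d_prev] = best cost up to x - 1 at d_prev, plus the jump.
        trans = total[x - 1][None, :] + penalty
        back[x] = np.argmin(trans, axis=1)
        total[x] = cost[x] + trans.min(axis=1)

    # Backtrack from the cheapest final state.
    result = np.empty(width, dtype=int)
    result[-1] = int(np.argmin(total[-1]))
    for x in range(width - 1, 0, -1):
        result[x - 1] = back[x, result[x]]
    return result
```

Applying this function to every row of a rectified image pair yields a disparity map. Replacing the squared difference with another comparison only changes how the cost array is filled.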
After evaluating the algorithms, the disparity map of our aerial image dataset is extracted. Figure 3 shows the original image and the disparity map.
Figure 3: left to right: original aerial image, extracted disparity map