RidgeSfM: Structure from Motion via Robust Pairwise Matching Under Depth Uncertainty
We consider the problem of simultaneously estimating dense depth maps and camera poses for a large set of images of an indoor scene. Classical SfM pipelines rely on a two-step approach in which cameras are first estimated using bundle adjustment in order to ground the ensuing multi-view stereo stage. In contrast, both our poses and our dense reconstructions are a direct output of a modified bundle adjuster...
To this end, we parametrize each depth map as a linear combination of a small number of basis “depth-planes”, each predicted in a monocular fashion by a deep network. Using a