Scale Consistent RGB Predicted Depth SLAM with Applications to Augmented Reality

Date
2023-01-03
Abstract

Simultaneous Localization and Mapping (SLAM) based on visual sensors is a key technology for autonomous navigation and Augmented/Virtual Reality. The fundamental problem involves mapping an unknown environment from sparse/dense depths while concurrently localizing the sensor within that environment using the map built from a sequence of images. Nevertheless, depth estimation from a single camera is an ill-posed problem, which is why monocular SLAM systems often suffer from severe scale drift over long sequences, especially in the absence of loop closures for correction. With the rapid advancement of deep learning, neural networks have demonstrated noteworthy performance in single image depth estimation (SIDE). However, most unsupervised SIDE methods predict inconsistent depths over a video sequence, thereby limiting their applicability in SLAM systems. This thesis proposes an additional loss term that takes the spatial geometry of the scene into account during training, further constraining the network to produce scale-consistent predictions over a sequence of images in an unsupervised manner. Because the neural network learns from unlabeled monocular images, its predictions still suffer from per-frame uncertainty. The proposed SLAM approach (RGB-PD SLAM), which uses the CNN-based depth maps, therefore incorporates a novel optimization step based on sparse bundle adjustment (SBA) to deal with the noise in the predictions. The effectiveness of the proposed SLAM approach is validated through a simple AR application.
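For illustration only, below is a minimal PyTorch-style sketch of how a geometry consistency term of this kind is commonly formulated in unsupervised depth estimation: the depth map of one frame is warped into a neighboring frame, and the normalized discrepancy against the depth predicted directly for that frame is penalized. The function name, the normalized-difference form, and the tensor shapes are assumptions for this sketch, not the thesis's exact formulation; the warping step (which requires relative pose and camera intrinsics) is assumed to have been done upstream.

import torch

def geometry_consistency_loss(depth_a_warped: torch.Tensor,
                              depth_b: torch.Tensor,
                              eps: float = 1e-7) -> torch.Tensor:
    """Assumed scale-consistency term (illustrative, not the thesis's loss).

    Penalizes the normalized difference between the depth map of frame A
    warped/projected into frame B and the depth predicted directly for
    frame B. Both inputs have shape (N, 1, H, W) with positive values.
    The normalization keeps the per-pixel penalty in [0, 1) and makes it
    insensitive to the absolute depth scale.
    """
    diff = (depth_a_warped - depth_b).abs() / (depth_a_warped + depth_b + eps)
    return diff.mean()

# Usage with random stand-ins for network predictions (hypothetical shapes):
d_a2b = torch.rand(4, 1, 128, 416) + 0.1  # frame-A depth projected into frame B
d_b = torch.rand(4, 1, 128, 416) + 0.1    # depth predicted directly for frame B
loss = geometry_consistency_loss(d_a2b, d_b)

A term like this, added to the usual photometric loss during training, couples the depth predictions of adjacent frames so that the network cannot drift to a different scale from one frame to the next.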

Keywords
SLAM, Augmented Reality, Depth Prediction
Citation
Nawal, M. F. (2023). Scale Consistent RGB Predicted Depth SLAM with Applications to Augmented Reality (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.