BAMF-SLAM

Abstract

We present BAMF-SLAM, a novel multi-fisheye visual-inertial SLAM system that utilizes Bundle Adjustment (BA) and recurrent field transforms (RFT) to achieve accurate and robust state estimation in challenging scenarios. First, our system operates directly on raw fisheye images, enabling us to fully exploit the wide Field-of-View (FoV) of fisheye cameras. Second, to overcome the low-texture challenge, we explore the tightly coupled integration of multi-camera inputs and complementary inertial measurements via a unified factor graph and jointly optimize the poses and dense depth maps. Third, for global consistency, the wide FoV of the fisheye cameras allows the system to find more potential loop closures, and, powered by the broad convergence basin of RFT, our system can perform very wide-baseline loop closing with little overlap. Furthermore, we introduce a semi-pose-graph BA method to avoid the expensive full global BA. By combining relative pose factors with loop closure factors, the global states can be adjusted efficiently with a modest memory footprint while maintaining high accuracy. Evaluations on the TUM-VI, Hilti-Oxford and Newer College datasets show the superior performance of the proposed system over prior work. In the Hilti SLAM Challenge 2022, our VIO version achieves second place. In a subsequent submission, our complete system, including the global BA backend, outperforms the winning approach.

Method

BAMF-SLAM aims to estimate the state of the sensor platform as accurately and robustly as possible using the onboard multi-fisheye cameras and an IMU. Following the design of modern SLAM frameworks, we divide our system into a frontend and a backend. The frontend processes the input images and IMU measurements, selects keyframes, keeps track of the most recent keyframes in a sliding window, and alternates between RFT and local BA optimization. The backend manages a data buffer of all keyframes established by the frontend and converts the intermediate results of reprojection factors into relative pose factors to build a global pose graph. Finally, to improve overall accuracy and global consistency, the backend performs loop closure detection and global semi-pose-graph BA.
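The frontend/backend split described above can be illustrated with a heavily simplified sketch. It is not the authors' implementation: it assumes translation-only 2D positions instead of full poses, treats every frame as a keyframe, and leaves out RFT, dense depth, and IMU factors entirely. The names `Frontend`, `Backend`, `track`, `add_factor`, `optimize`, and `run_demo` are hypothetical. The sketch only shows how relative factors handed over by a sliding-window frontend can be combined with a loop closure factor in one global least-squares problem.

```python
from collections import deque

import numpy as np


class Backend:
    """Hypothetical backend: buffers all keyframes and relative/loop factors."""

    def __init__(self):
        self.positions = []  # one 2D position estimate per keyframe
        self.factors = []    # (i, j, measured p_j - p_i, weight)

    def add_keyframe(self, position):
        self.positions.append(np.asarray(position, dtype=float))
        return len(self.positions) - 1

    def add_factor(self, i, j, delta, weight=1.0):
        self.factors.append((i, j, np.asarray(delta, dtype=float), weight))

    def optimize(self):
        """Weighted linear least squares; keyframe 0 is held fixed (gauge)."""
        n = len(self.positions)
        x0 = self.positions[0]
        A = np.zeros((len(self.factors), n - 1))
        b = np.zeros((len(self.factors), 2))
        for r, (i, j, delta, w) in enumerate(self.factors):
            b[r] = w * delta
            for idx, sign in ((i, -1.0), (j, 1.0)):
                if idx == 0:
                    b[r] -= sign * w * x0  # move the fixed keyframe to the right-hand side
                else:
                    A[r, idx - 1] = sign * w
        solution, *_ = np.linalg.lstsq(A, b, rcond=None)
        self.positions = [x0] + list(solution)


class Frontend:
    """Hypothetical frontend: sliding window of recent keyframe indices."""

    def __init__(self, backend, window_size=3):
        self.backend = backend
        self.window = deque(maxlen=window_size)
        self.position = np.zeros(2)
        self.window.append(backend.add_keyframe(self.position))

    def track(self, delta):
        """Assumption: each call yields a keyframe and one relative factor."""
        delta = np.asarray(delta, dtype=float)
        self.position = self.position + delta
        kf = self.backend.add_keyframe(self.position)
        self.backend.add_factor(self.window[-1], kf, delta)
        self.window.append(kf)  # the oldest keyframe leaves the window automatically
        return kf


def run_demo():
    backend = Backend()
    frontend = Frontend(backend, window_size=3)
    # A unit square walked with a constant drift of 0.1 in x per step.
    for delta in [(1.1, 0.0), (0.1, 1.0), (-0.9, 0.0), (0.1, -1.0)]:
        last = frontend.track(delta)
    drift_before = float(np.linalg.norm(backend.positions[last]))
    # Loop closure: the last keyframe observes the same place as keyframe 0.
    backend.add_factor(last, 0, (0.0, 0.0), weight=10.0)
    backend.optimize()
    drift_after = float(np.linalg.norm(backend.positions[last]))
    return drift_before, drift_after, len(frontend.window), len(backend.positions)


if __name__ == "__main__":
    print(run_demo())
```

In this toy setting the loop closure factor pulls the last keyframe back toward the start, and the accumulated drift is spread over the relative factors. The unknowns are only the keyframe positions, which is the general reason a pose-graph-style problem stays small compared with a full BA over poses and dense depth.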

Results

TUM-VI Dataset

Qualitative results on the TUM-VI dataset. As the ground truth only covers a single room, we transform the results of Magistrale 1, 2, 4, 5, 6 and Slide 1, 2, 3 to the ground-truth frame for a consistency check. Estimated trajectories are shown in color, and point clouds are accumulated.

Hilti-Oxford Dataset

Newer College Dataset

BibTeX

@inproceedings{zhang2023bamf,
    title={{BAMF-SLAM}: Bundle Adjusted Multi-Fisheye Visual-Inertial {SLAM} Using Recurrent Field Transforms},
    author={Zhang, Wei and Wang, Sen and Dong, Xingliang and Guo, Rongwei and Haala, Norbert},
    booktitle={2023 IEEE International Conference on Robotics and Automation (ICRA)},
    pages={6232--6238},
    year={2023},
    organization={IEEE}
}

BAMF-SLAM: Bundle Adjusted Multi-Fisheye Visual-Inertial SLAM Using Recurrent Field Transforms

ICRA 2023