[Paper Review] ElasticFusion: Dense SLAM Without A Pose Graph (RSS 2015) (Writing…)

ElasticFusion: Dense SLAM Without A Pose Graph

Thomas Whelan et al. (Dyson Robotics Laboratory at Imperial College London)


We present a novel approach to real-time dense visual SLAM. Our system is capable of capturing comprehensive dense globally consistent surfel-based maps of room scale envi-ronments explored using an RGB-D camera in an incremental online fashion, without pose graph optimisation or any post-processing steps. This is accomplished by using dense frame-to-model camera tracking and windowed surfel-based fusion coupled with frequent model refinement through non-rigid surface deformations. Our approach applies local model-to-model surface loop closure optimisations as often as possible to stay close to the mode of the map distribution, while utilising global loop closure to recover from arbitrary drift and maintain global consistency.


Existing dense SLAM methods suitable for iincremental, real-time operation struggle when the sensor makes movements which are both of

  • extended duration and often
  • criss-cross loop back on themselves (painting motion)

SLAM algorithms have too often targeted one of two extrems:

  • either extremely loopy motion in a very small area
  • corridor like motion on much larger scales but with fewer loop closures

이러한 SLAM Scenario들은 Dense vision frontend을 가진 SLAM에서는 문제가 될 수 있다!

Dense vision frontend -> No. of points matched is much hier than feature-based systems:
Makes joint filtering or bundle adjustment computationally infeasible:
Dense frontend have relied on alternation and effectively per-surface-element-independent filtering

Enormous weight of data serves to overpower the approximations to joint filtering which this assumes
This also raises the question as to whether it is optimal to attach a dense frontend to a sparse pose graph structure like its feature-based visual SLAM counterpart

Pose graph SLAM systems primarily focus on: Optimising the camera trajectory
Whereas our approach (utilising a deformation graph): Focus on optimising the map

기존의 Pose graph system은 Pose estimation (trajectory)에 집중한 반면, 여기선 Map optimisation에 집중한다

Utilising pose graphs

  • Whelan et al., parameteries a non-rigid surface deformation with an optimised pose graph to perform occasional loop closures in corridor-like trajectories
    • Scale well, but perform poorly given locally loopy trajectories while being unable to re-use revisited areas of the map
  • DVO SLAM, keyfromae-based pose graph optimisation
    • Performs no explicit map reconstruction and functions off of raw keyframes alone
  • Meilland and Comport, unified keyframes: utilises fused predicted 2.5D keyframes of mapped environtments while employing pose graph optimization to close large loops and align keyframes
    • not creating an explicit continuous 3D surface
  • MRSMap, registers octree encoded surfel maps together for pose estimation
    • After pose graph optimisation the final map is created by merging key surfel views

In our system

  • Move away from the focus of pose graph originally grounded in sparse methods
  • Move towards a more map-centric approach
  • Similar to Zhou et al, but runs in real time
  • Attempt to apply surface loop closure optimisations early and often
    • Always stay near to the mode of the map distribution
      • Allow to employ a non-rigid space deformation of the map using a sparse deformation graph embedded in the surface itself
      • Rather than a probabilistic pose graph which rigidly transforming independent keyframes

At the time of writing, this approach is the first of its kind of

  • Use photometric and geometric frame-to-model predictive tracking in a fused surfel-based dense map
  • Perform dense model-to-model local surface loop closures with a non-rigid space deformation
  • Utilise a predicted surface appearance based place recognition method

Approach overview

Significant usage of GPU

  • CUDA to implement tracking
  • OpenGL Shading Language for view prediction and map management

Key elements

  • Estimate a fused surfel-based model of the environment
    • Inspired by the surfel-based fusion system of Keller et al.
  • Segment older parts of the map which have not been observed into the inactive area
    • not used for tracking or data fusion
  • Every frame attempt to register the portion of the active model within the current estimated camera frame

© 2020. All rights reserved.

Powered by Hydejack v9.0.5